Closed: timgautier closed this issue 1 month ago
You might be using an outdated version of optimum-neuron. Try again with optimum-neuron == 0.0.18.
$ optimum-cli export neuron -h
usage: optimum-cli export neuron [-h] -m MODEL [--task TASK] [--library-name {transformers,diffusers,sentence_transformers}] [--subfolder SUBFOLDER] [--atol ATOL]
[--cache_dir CACHE_DIR] [--trust-remote-code] [--compiler_workdir COMPILER_WORKDIR] [--disable-weights-neff-inline] [--disable-validation]
[--auto_cast {none,matmul,all}] [--auto_cast_type {bf16,fp16,tf32}] [--dynamic-batch-size] [--num_cores NUM_CORES] [--unet UNET]
[--output_hidden_states] [--output_attentions] [--batch_size BATCH_SIZE] [--sequence_length SEQUENCE_LENGTH] [--num_beams NUM_BEAMS]
[--num_choices NUM_CHOICES] [--num_channels NUM_CHANNELS] [--width WIDTH] [--height HEIGHT] [--num_images_per_prompt NUM_IMAGES_PER_PROMPT]
[-O1 | -O2 | -O3]
output
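If the CLI still prints the old usage after upgrading, it may be worth double-checking which version your shell is actually picking up. A quick check, assuming pip manages the environment:
pip show optimum-neuron | grep -i version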
If you are using the HF DLAMI, make sure you run
sudo pip install --upgrade optimum-neuron==0.0.18
Otherwise the install doesn't overwrite the system binary.
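A quick sanity check after the sudo install, to confirm the binary being resolved is the upgraded one (paths may vary per image):
which optimum-cli
optimum-cli export neuron -h | head -n 3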
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
The guide shows the usage statement here: https://github.com/huggingface/optimum-neuron/blob/eb2a93fe89721a3977957cb3748832bf9d48e382/docs/source/guides/export_model.mdx#L90
usage: optimum-cli export neuron [-h] -m MODEL [--task TASK] [--atol ATOL] [--cache_dir CACHE_DIR] [--trust-remote-code] [--compiler_workdir COMPILER_WORKDIR] [--disable-validation] [--auto_cast {none,matmul,all}] [--auto_cast_type {bf16,fp16,tf32}] [--dynamic-batch-size] [--num_cores NUM_CORES] [--unet UNET] [--output_hidden_states] [--output_attentions] [--batch_size BATCH_SIZE] [--sequence_length SEQUENCE_LENGTH] [--num_beams NUM_BEAMS] [--num_choices NUM_CHOICES] [--num_channels NUM_CHANNELS] [--width WIDTH] [--height HEIGHT] [--num_images_per_prompt NUM_IMAGES_PER_PROMPT] [-O1 | -O2 | -O3] output
However, running
optimum-cli export neuron --help
yields this usage statement:
usage: optimum-cli export neuron [-h] -m MODEL [--task TASK] [--library-name {transformers,diffusers,sentence_transformers}] [--subfolder SUBFOLDER] [--atol ATOL] [--cache_dir CACHE_DIR] [--trust-remote-code] [--compiler_workdir COMPILER_WORKDIR] [--disable-validation] [--auto_cast {none,matmul,all}] [--auto_cast_type {bf16,fp16,tf32}] [--dynamic-batch-size] [--unet UNET] [--output_hidden_states] [--output_attentions] [--batch_size BATCH_SIZE] [--sequence_length SEQUENCE_LENGTH] [--num_beams NUM_BEAMS] [--num_choices NUM_CHOICES] [--num_channels NUM_CHANNELS] [--width WIDTH] [--height HEIGHT] [--num_images_per_prompt NUM_IMAGES_PER_PROMPT] [-O1 | -O2 | -O3] output
The code claims it supports --library-name and --subfolder, which aren't shown in the guide, but more concerning to me is that the guide claims support for --num_cores, which the code doesn't seem to support. How does it decide how many cores if we can't tell it?