huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

optimum-neuron test results with torch-neuronx 2.1 #465

Closed jeffhataws closed 5 months ago

jeffhataws commented 8 months ago

Test procedures:

pip install optimum-neuron
pip install git+https://github.com/huggingface/optimum-neuron#egg=optimum-neuron[tests] tokenizers
git clone https://github.com/huggingface/optimum-neuron.git
cd optimum-neuron/
pytest -m is_inferentia_test tests

Test results are shown at the bottom.

Some of the tests that failed with RuntimeError: nrt_load_collectives status=2 message="Invalid" pass when run on their own, for example tests/pipelines/test_decoder_pipelines.py::test_load_no_parameters[hf-internal-testing/tiny-random-gpt2].
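
To tell order-dependent failures apart from genuine ones, each failing node ID can be rerun in its own pytest process. The following is only a minimal sketch: the helper names `isolated_command` and `rerun_isolated` are made up for illustration and are not part of optimum-neuron or its test suite.

```python
import subprocess
import sys


def isolated_command(node_id):
    """Build a pytest command line that selects exactly one test node ID."""
    return [sys.executable, "-m", "pytest", node_id]


def rerun_isolated(node_ids):
    """Run each node ID in a separate pytest process.

    Returns a dict mapping node ID -> True if that isolated run passed.
    """
    results = {}
    for node_id in node_ids:
        completed = subprocess.run(isolated_command(node_id))
        # pytest exits with code 0 only when all selected tests passed.
        results[node_id] = completed.returncode == 0
    return results
```

Calling `rerun_isolated([...])` from the repository root with the node ID quoted above would show whether that test still fails without the rest of the suite running before it.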

Packages tested (alpha):

aws-neuronx-runtime-discovery 2.9
libneuronxla                  2.0.656
neuron-torch-tools            1.0.0.2047+f12740727
neuronx-cc                    2.0.37506.0a0+dedc3e172
optimum-neuron                0.0.18.dev0
pytorch-lightning             1.8.6
pytorchhighlevelperfmodel     1.0.0.650+beb242c34
sentence-transformers         2.2.2
torch                         2.1.2
torch-neuronx                 2.1.1.2.0.1b0
torch-xla                     2.1.2
torchmetrics                  0.10.3
torchvision                   0.16.2
transformers                  4.36.2
transformers-neuronx          0.9.687
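
For anyone who wants to compare their environment against the list above, installed versions can be read with the standard library. This is a sketch, assuming Python 3.8+; the function name `installed_versions` is hypothetical, and the package names passed to it are whatever you want to check.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found
```

For example, `installed_versions(["neuronx-cc", "torch-neuronx", "optimum-neuron"])` returns a dict that can be pasted into a report like this one.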

Test error types (deduplicated):

E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--dynamic-batch-size', '--model', 'hf-internal-testing/tiny-random-BertModel', '--sequence_length', '16', '--batch_size', '1', '--task', 'text-classification', '/tmp/tmp79tpjnbz']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'echarlaix/tiny-random-stable-diffusion-xl', '--task', 'stable-diffusion-xl', '--batch_size', '1', '--height', '64', '--width', '64', '--num_images_per_prompt', '4', '--auto_cast', 'matmul', '--auto_cast_type', 'bf16', '/tmp/tmpthudzr94']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'echarlaix/tiny-random-stable-diffusion-xl', '--unet', 'Jingya/tiny-random-sdxl-unet', '--task', 'stable-diffusion-xl', '--batch_size', '1', '--height', '64', '--width', '64', '--num_images_per_prompt', '4', '--auto_cast', 'matmul', '--auto_cast_type', 'bf16', '/tmp/tmpxp1bth8k']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-BertModel', '--sequence_length', '16', '--batch_size', '1', '--task', 'text-classification', '--compiler_workdir', '/tmp/tmp2jd85ypq/neff', '/tmp/tmp2jd85ypq']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-BertModel', '--sequence_length', '16', '--batch_size', '1', '--task', 'text-classification', '-O1', '/tmp/tmpst90dqhk']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-gpt2', '--sequence_length', '128', '--batch_size', '2', '--auto_cast_type', 'bf16', '--num_cores', '1', '--task', 'text-generation', '/tmp/tmpxjp6srhh']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-gpt2', '--sequence_length', '128', '--batch_size', '2', '--auto_cast_type', 'bf16', '--num_cores', '2', '--task', 'text-generation', '/tmp/tmp964uid_u']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-gpt2', '--sequence_length', '512', '--batch_size', '1', '--auto_cast_type', 'fp16', '--num_cores', '1', '--task', 'text-generation', '/tmp/tmpb2rehfbl']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-gpt2', '--sequence_length', '512', '--batch_size', '1', '--auto_cast_type', 'fp16', '--num_cores', '2', '--task', 'text-generation', '/tmp/tmpez5in8cm']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-t5', '--task', 'text2text-generation', '--batch_size', '1', '--sequence_length', '18', '--num_beams', '4', '--auto_cast', 'matmul', '--auto_cast_type', 'bf16', '--output_hidden_states', '--output_attentions', '/tmp/tmp5xgw1g12']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-random-t5', '--task', 'text2text-generation', '--batch_size', '1', '--sequence_length', '18', '--num_beams', '4', '--auto_cast', 'matmul', '--auto_cast_type', 'bf16', '/tmp/tmpyktraao1']' returned non-zero exit status 1.
E               subprocess.CalledProcessError: Command '['optimum-cli', 'export', 'neuron', '--model', 'hf-internal-testing/tiny-stable-diffusion-torch', '--task', 'stable-diffusion', '--batch_size', '1', '--height', '64', '--width', '64', '--num_images_per_prompt', '4', '--auto_cast', 'matmul', '--auto_cast_type', 'bf16', '/tmp/tmpo9iefwyz']' returned non-zero exit status 1.
E           Forbidden: pass `create_pr=1` as a query parameter to create a Pull Request                                                                                                           
E           RuntimeError: Pretrained model is compiled with neuronx-cc(2.12.54.0+f631c2365) newer than current compiler (2.0.37506.0a0+dedc3e172), which may cause runtime incompatibilities.     
E           RuntimeError: Stable diffusion models are supported from neuronx-cc 2.6, but you have 2.0.37506.0a0+dedc3e172, please upgrade it.                                                     
E           RuntimeError: nrt_load_collectives status=2 message="Invalid"                                                                                                                         
E           huggingface_hub.utils._errors.HfHubHTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/optimum/tiny_random_bert_neuronx/commit/main (Request ID: Root=1-65b9dc79-623302c05667cdd945158891;2a91e984-3e94-4822-a5ab-55530c926cf9)
E           requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/optimum/tiny_random_bert_neuronx/commit/main                                    
E       AssertionError: 'is not supported yet' not found in "No module named 'fused_layer_norm_cuda'"                                                                                             
E       ModuleNotFoundError: No module named 'fused_layer_norm_cuda'                                                                                                                              
E   ModuleNotFoundError: No module named 'fused_layer_norm_cuda' 
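
A list like the one above can be produced from a saved pytest log by collecting the lines that pytest prefixes with `E` and dropping repeats. This is only a sketch of one way to do it, not necessarily how the list in this report was generated; the function name `unique_error_lines` is hypothetical.

```python
def unique_error_lines(log_text):
    """Return pytest 'E' lines from a log, de-duplicated in first-seen order."""
    seen = {}
    for line in log_text.splitlines():
        stripped = line.strip()
        # pytest marks exception lines with a leading "E" followed by spaces.
        if stripped.startswith("E "):
            message = stripped[1:].strip()
            # dict keys keep insertion order, so the first occurrence wins.
            seen.setdefault(message, None)
    return list(seen)
```

Because the indentation after `E` is stripped before comparing, messages that differ only in indentation (such as the two `ModuleNotFoundError` lines above) collapse into a single entry.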

Test results:

=========================== short test summary info ============================
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_dynamic_batching - s...
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_encoder_decoder - su...
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_encoder_decoder_optional_outputs
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_opt_level - subproce...
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_replace_unet - subpr...
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_stable_diffusion - s...
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_stable_diffusion_xl
FAILED tests/cli/test_export_cli.py::TestExportCLI::test_store_intemediary - ...
FAILED tests/cli/test_export_decoder_cli.py::test_export_decoder_cli[1-1-512-fp16]
FAILED tests/cli/test_export_decoder_cli.py::test_export_decoder_cli[1-2-128-bf16]
FAILED tests/cli/test_export_decoder_cli.py::test_export_decoder_cli[2-1-512-fp16]
FAILED tests/cli/test_export_decoder_cli.py::test_export_decoder_cli[2-2-128-bf16]
FAILED tests/exporters/test_export.py::NeuronStableDiffusionExportTestCase::test_export_for_stable_diffusion_models_0_hf_internal_testing_tiny_stable_diffusion_torch
FAILED tests/exporters/test_export.py::NeuronStableDiffusionExportTestCase::test_export_for_stable_diffusion_models_1_echarlaix_tiny_random_latent_consistency
FAILED tests/exporters/test_export.py::NeuronStableDiffusionExportTestCase::test_export_for_stable_diffusion_xl_models_0_echarlaix_tiny_random_stable_diffusion_xl
FAILED tests/exporters/test_export.py::NeuronEncoderDecoderExportTestCase::test_export_encoder_decoder_models_0_t5
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-BloomForCausalLM-1-100-2-fp32]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-BloomForCausalLM-1-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-BloomForCausalLM-2-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-gpt2-1-100-2-fp32]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-gpt2-1-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-gpt2-2-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[dacorvo/tiny-random-llama-1-100-2-fp32]
FAILED tests/generation/test_export.py::test_decoder_export[dacorvo/tiny-random-llama-1-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[dacorvo/tiny-random-llama-2-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[dacorvo/tiny-random-MistralForCausalLM-1-100-2-fp32]
FAILED tests/generation/test_export.py::test_decoder_export[dacorvo/tiny-random-MistralForCausalLM-1-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[dacorvo/tiny-random-MistralForCausalLM-2-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-OPTForCausalLM-1-100-2-fp32]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-OPTForCausalLM-1-100-2-fp16]
FAILED tests/generation/test_export.py::test_decoder_export[hf-internal-testing/tiny-random-OPTForCausalLM-2-100-2-fp16]
FAILED tests/generation/test_export.py::test_seq2seq_export[hf-internal-testing/tiny-random-t5-1-64-1]
FAILED tests/generation/test_export.py::test_seq2seq_export[hf-internal-testing/tiny-random-t5-1-64-4]
FAILED tests/generation/test_hub.py::test_decoder_model_from_hub[checkpoint]
FAILED tests/generation/test_hub.py::test_decoder_model_from_hub[no-checkpoint]
FAILED tests/generation/test_hub.py::test_push_decoder_to_hub - RuntimeError:...
FAILED tests/inference/test_modeling.py::NeuronModelForQuestionAnsweringIntegrationTest::test_load_vanilla_transformers_which_is_not_supported
FAILED tests/inference/test_modeling.py::NeuronModelForSequenceClassificationIntegrationTest::test_load_vanilla_transformers_which_is_not_supported
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionPipelineIntegrationTest::test_export_and_inference_dyn_0_stable_diffusion
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionPipelineIntegrationTest::test_export_and_inference_non_dyn_0_stable_diffusion
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionPipelineIntegrationTest::test_img2img_export_and_inference_0_stable_diffusion
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionPipelineIntegrationTest::test_inpaint_export_and_inference_0_stable_diffusion
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionPipelineIntegrationTest::test_lcm_export_and_inference_0_latent_consistency
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionXLPipelineIntegrationTest::test_export_and_inference_dyn_0_stable_diffusion_xl
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionXLPipelineIntegrationTest::test_export_and_inference_non_dyn_0_stable_diffusion_xl
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionXLPipelineIntegrationTest::test_img2img_export_and_inference_0_stable_diffusion_xl
FAILED tests/inference/test_stable_diffusion_pipeline.py::NeuronStableDiffusionXLPipelineIntegrationTest::test_inpaint_export_and_inference_0_stable_diffusion_xl
FAILED tests/pipelines/test_decoder_pipelines.py::test_export_no_parameters[hf-internal-testing/tiny-random-gpt2]
FAILED tests/pipelines/test_decoder_pipelines.py::test_export_no_parameters[dacorvo/tiny-random-llama]
FAILED tests/pipelines/test_decoder_pipelines.py::test_from_hub - RuntimeErro...
ERROR tests/generation/test_export.py::test_model_from_path[hf-internal-testing/tiny-random-BloomForCausalLM]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-BloomForCausalLM-sample]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-BloomForCausalLM-sample-with-temp]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-BloomForCausalLM-greedy]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-BloomForCausalLM-greedy_no-repeat]
ERROR tests/generation/test_generate.py::test_model_generation_input_dimensions[hf-internal-testing/tiny-random-BloomForCausalLM]
ERROR tests/generation/test_export.py::test_model_from_path[hf-internal-testing/tiny-random-gpt2]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-gpt2-sample]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-gpt2-sample-with-temp]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-gpt2-greedy]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-gpt2-greedy_no-repeat]
ERROR tests/generation/test_generate.py::test_model_generation_input_dimensions[hf-internal-testing/tiny-random-gpt2]
ERROR tests/generation/test_export.py::test_model_from_path[dacorvo/tiny-random-llama]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-llama-sample]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-llama-sample-with-temp]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-llama-greedy]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-llama-greedy_no-repeat]
ERROR tests/generation/test_generate.py::test_model_generation_input_dimensions[dacorvo/tiny-random-llama]
ERROR tests/generation/test_export.py::test_model_from_path[dacorvo/tiny-random-MistralForCausalLM]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-MistralForCausalLM-sample]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-MistralForCausalLM-sample-with-temp]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-MistralForCausalLM-greedy]
ERROR tests/generation/test_generate.py::test_decoder_generation[dacorvo/tiny-random-MistralForCausalLM-greedy_no-repeat]
ERROR tests/generation/test_generate.py::test_model_generation_input_dimensions[dacorvo/tiny-random-MistralForCausalLM]
ERROR tests/generation/test_export.py::test_model_from_path[hf-internal-testing/tiny-random-OPTForCausalLM]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-OPTForCausalLM-sample]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-OPTForCausalLM-sample-with-temp]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-OPTForCausalLM-greedy]
ERROR tests/generation/test_generate.py::test_decoder_generation[hf-internal-testing/tiny-random-OPTForCausalLM-greedy_no-repeat]
ERROR tests/generation/test_generate.py::test_model_generation_input_dimensions[hf-internal-testing/tiny-random-OPTForCausalLM]
ERROR tests/generation/test_export.py::test_seq2seq_model_from_path[hf-internal-testing/tiny-random-t5]
ERROR tests/generation/test_generate.py::test_seq2seq_generation_beam[hf-internal-testing/tiny-random-t5]
ERROR tests/generation/test_generate.py::test_seq2seq_generation_beam_with_optional_outputs[hf-internal-testing/tiny-random-t5]
ERROR tests/generation/test_generate.py::test_seq2seq_generation_greedy[hf-internal-testing/tiny-random-t5]
ERROR tests/generation/test_generate.py::test_seq2seq_generation_greedy_with_optional_outputs[hf-internal-testing/tiny-random-t5]
ERROR tests/generation/test_hub.py::test_push_seq2seq_to_hub[hf-internal-testing/tiny-random-t5]
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_decouple_weights_neff_and_replace_weight
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_load_model_from_cache
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_load_model_from_empty_cache
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_load_model_from_hub
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_load_model_from_hub_subfolder
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_load_model_from_hub_without_neuron_model
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_load_model_from_local_path
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_save_compiler_intermediary_files
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_save_model
ERROR tests/inference/test_modeling.py::NeuronModelIntegrationTest::test_trust_remote_code
ERROR tests/pipelines/test_decoder_pipelines.py::test_load_no_parameters[hf-internal-testing/tiny-random-gpt2]
ERROR tests/pipelines/test_decoder_pipelines.py::test_from_model_and_tokenizer[hf-internal-testing/tiny-random-gpt2]
ERROR tests/pipelines/test_decoder_pipelines.py::test_error_already_exported[hf-internal-testing/tiny-random-gpt2]
ERROR tests/pipelines/test_decoder_pipelines.py::test_load_no_parameters[dacorvo/tiny-random-llama]
ERROR tests/pipelines/test_decoder_pipelines.py::test_from_model_and_tokenizer[dacorvo/tiny-random-llama]
ERROR tests/pipelines/test_decoder_pipelines.py::test_error_already_exported[dacorvo/tiny-random-llama]
= 50 failed, 558 passed, 415 deselected, 43 warnings, 52 errors in 5915.65s (1:38:35) =
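
With this many entries, it can help to count FAILED and ERROR lines per test file to see which areas are most affected. The following is a minimal sketch that assumes the summary is available as plain text in the format shown above; `summarize` is a hypothetical helper name.

```python
from collections import Counter


def summarize(summary_text):
    """Count FAILED/ERROR entries per test file in a pytest short summary."""
    counts = Counter()
    for line in summary_text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2 and parts[0] in ("FAILED", "ERROR"):
            # A node ID starts with the file path, up to the first "::".
            test_file = parts[1].split("::", 1)[0]
            counts[(parts[0], test_file)] += 1
    return counts
```

`summarize(text).most_common()` then lists the (status, file) pairs from most to least frequent.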

JingyaHuang commented 8 months ago

Hi @jeffhataws, thanks for reporting. Could you provide more details about the version of the Neuron SDK that you are testing with? There are some checkers in optimum-neuron that detect dependency versions: for example, all stable diffusion features are available from neuronx-cc==2.6.* onwards, and the latest release is neuronx-cc==2.12.68.0+4480452af, for torch 2.1.2 in Neuron SDK 2.17. Which version of the Neuron SDK is the neuronx-cc==2.0.37506.0a0+dedc3e172 that you are testing meant for?
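
To illustrate why a version string like 2.0.37506.0a0+dedc3e172 trips a "supported from 2.6" check, here is a rough sketch of a minimum-version comparison. It is not the actual checker in optimum-neuron; `release_tuple` and `meets_minimum` are hypothetical names, and the sketch deliberately ignores pre-release and local suffixes such as `a0` and `+dedc3e172`.

```python
import re


def release_tuple(version_string):
    """Extract the leading numeric components, e.g. '2.12.68.0+abc' -> (2, 12, 68, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"no numeric version found in {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed release is at least the minimum release."""
    required = release_tuple(minimum)
    current = release_tuple(installed)
    # Pad with zeros so a short installed version still compares field by field.
    current = current + (0,) * (len(required) - len(current))
    return current[: len(required)] >= required
```

With these definitions, `meets_minimum("2.0.37506.0a0+dedc3e172", "2.6")` is False because (2, 0) sorts below (2, 6), while `meets_minimum("2.12.68.0+4480452af", "2.6")` is True. A full PEP 440 comparison would also need to account for pre-release ordering, which this sketch skips.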

HuggingFaceDocBuilderDev commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!

jeffhataws commented 5 months ago

Thanks @JingyaHuang for checking back. It was a beta version. We will retest with the latest Optimum Neuron and compiler.

jeffhataws commented 5 months ago

Replaced by the more up-to-date issue https://github.com/huggingface/optimum-neuron/issues/597