migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

Check top huggingface models #162

Closed attila-dusnoki-htec closed 8 months ago

attila-dusnoki-htec commented 9 months ago

There are many models on Hugging Face that would be good to test, to check whether they compile and produce accurate results.

The most downloaded onnx models would be a good start: https://huggingface.co/models?library=onnx&sort=downloads

We can use optimum-cli to help create the ONNX files. Also, ONNX Zoo was updated with a number of new models that are already in ONNX format, which might overlap with this.
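As a rough illustration of the optimum-cli step, the sketch below only builds the export command line; it does not run it. The model id and output directory are placeholder values chosen for the example, not ones taken from this issue.

```python
import shlex


def optimum_export_command(model_id, output_dir):
    """Build the argv for exporting a Hugging Face model to ONNX with optimum-cli."""
    return ["optimum-cli", "export", "onnx", "--model", model_id, output_dir]


# Placeholder model id and output directory, for illustration only.
cmd = optimum_export_command("bert-base-uncased", "bert_onnx")
print(shlex.join(cmd))
```

To actually run the export, the list could be passed to subprocess.run(cmd, check=True), assuming optimum is installed in the environment.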

attila-dusnoki-htec commented 9 months ago

This is a script for scraping Hugging Face to download ONNX models.

It will be extended to compile/verify models with MIGraphX. We will probably use a loop like this: download, test, log, remove, so that we do not run out of disk space.
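The download, test, log, remove loop could look roughly like the sketch below. It is not the script referred to above: the function names, the logger name, and the stand-in download and check callables are all hypothetical, and the real steps (fetching from Hugging Face, running MIGraphX) are left to the caller.

```python
import logging
import shutil
import tempfile
from pathlib import Path

log = logging.getLogger("hf-check")  # hypothetical logger name


def process_models(model_ids, download, check):
    """Run download -> test -> log -> remove for each model id."""
    results = {}
    for model_id in model_ids:
        workdir = Path(tempfile.mkdtemp(prefix="hf_model_"))
        try:
            try:
                status = check(download(model_id, workdir))  # download, test
            except Exception as exc:  # keep going if one model fails
                status = f"error: {exc}"
            log.info("%s: %s", model_id, status)  # log
            results[model_id] = status
        finally:
            shutil.rmtree(workdir, ignore_errors=True)  # remove
    return results


# Stand-ins so the sketch runs without network access or MIGraphX.
seen_dirs = []


def fake_download(model_id, workdir):
    seen_dirs.append(workdir)
    path = workdir / "model.onnx"
    path.write_bytes(b"")
    return path


def fake_check(model_path):
    return "ok" if model_path.exists() else "missing"


results = process_models(["org/model-a", "org/model-b"], fake_download, fake_check)
print(results)
```

Using one temporary directory per model means at most one model is on disk at a time, and the directory is removed even when the check raises.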

attila-dusnoki-htec commented 9 months ago

TODO: update the script to download weights if they are stored separately, e.g. model.onnx has model.onnx_data or shared.weight, etc.

attila-dusnoki-htec commented 9 months ago

MIGraphX Compile results

The first dump of issues from about 640 models from HF.

User errors, requires further checking

Missing weight file

/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: ./weights.pb                                                                                                           
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model.onnx_data                                                                                                
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model_merged.onnx_data                                                                                         
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model_merged_quantized.onnx_data                                                                               
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model_optimized.onnx.data                                                                                      
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_with_past_model.onnx_data                                                                                      
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_with_past_model_optimized.onnx.data                                                                            
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: encoder_model.onnx_data                                                                                                
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.data                                                                                                             
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.onnx.data                                                                                                        
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.onnx_data                                                                                                        
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model_optimized.onnx.data                                                                                              
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: openvino_model.onnx_data 
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: shared.weight
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: transformer.embeddings.word_embeddings.weight
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: openskyml_overall-v1_blob_main_overall-v1.onnx
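To turn the list above into the set of companion files a download script would need to fetch, one option is to pull the file names out of the error lines. This is a minimal sketch that assumes the log format shown above; the function name is made up for the example.

```python
import re
from pathlib import PurePosixPath

# Matches the "Error reading file" lines in the format shown in the log.
MISSING_FILE = re.compile(r"generic_read_file: Error reading file: (\S+)")


def missing_weight_files(log_lines):
    """Return the sorted, de-duplicated base names of files that could not be read."""
    names = set()
    for line in log_lines:
        match = MISSING_FILE.search(line)
        if match:
            names.add(PurePosixPath(match.group(1)).name)
    return sorted(names)


sample_log = [
    "/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: ./weights.pb",
    "/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.onnx_data",
    "/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: shared.weight",
    "/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: openskyml_overall-v1_blob_main_overall-v1.onnx",
]
names = missing_weight_files(sample_log)
print(names)
```

The PARSE_FROM line is skipped on purpose: it reports an unreadable model file, not a missing weight file.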

Input dims not provided

/code/AMDMIGraphX/src/include/migraphx/check_shapes.hpp:220: same_dims: add: Dimensions do not match
/code/AMDMIGraphX/src/include/migraphx/check_shapes.hpp:220: same_dims: dequantizelinear: Dimensions do not match
/code/AMDMIGraphX/src/include/migraphx/op/concat.hpp:98: normalize_compute_shape: CONCAT: all input dimensions should match along axis 2
/code/AMDMIGraphX/src/include/migraphx/op/convolution.hpp:100: normalize_compute_shape: CONVOLUTION: mismatched channel numbers
/code/AMDMIGraphX/src/include/migraphx/op/multibroadcast.hpp:99: compute_shape: MULTIBROADCAST: input shape {1, 0} cannot be broadcasted to {1, 1}!

Compile issues

Parse error

/code/AMDMIGraphX/src/include/migraphx/op/quant_convolution.hpp:96: normalize_compute_shape: QUANT_CONVOLUTION: only accept input and weights of type int8_t or fp8e4m3fnuz_type
/code/AMDMIGraphX/src/onnx/checks.cpp:35: check_arg_empty: PARSE_RANGE: limit arg dynamic shape is not supported
/code/AMDMIGraphX/src/onnx/checks.cpp:35: check_arg_empty: PARSE_RANGE: start arg dynamic shape is not supported
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:407: parse_graph: PARSE_GRAPH: invalid onnx file. Input "514" is unavailable due to unordered nodes!
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:407: parse_graph: PARSE_GRAPH: invalid onnx file. Input "ResNet::input_0_transpose" is unavailable due to unordered nodes!
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Attention
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: BiasGelu
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Einsum
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: EmbedLayerNormalization
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: FastGelu
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Gelu
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: QAttention
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: SkipLayerNormalization
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:640: get_type: Prototensor data type 8 not supported
/code/AMDMIGraphX/src/onnx/parse_if.cpp:92: parse: PARSE_IF: If_* then and else sub_grahps must have same output shapes!
/code/AMDMIGraphX/src/onnx/parse_pad.cpp:182: parse_constant_value: PARSE_PAD: `value` should contain only one element
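For triage, the error lines can be grouped by the place that raised them. The sketch below assumes every MIGraphX error follows the "path:line: function: message" layout seen above and collects anything else (such as the std::bad_alloc line) separately; the helper name and the "unparsed" key are illustrative choices.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# path:line: function: message
ERROR_LINE = re.compile(r"^(?P<path>\S+?):(?P<line>\d+): (?P<func>[^:]+): (?P<msg>.*)$")


def group_errors(log_lines):
    """Group messages by 'file:line function'; unmatched lines go under 'unparsed'."""
    groups = defaultdict(list)
    for raw in log_lines:
        match = ERROR_LINE.match(raw.strip())
        if match is None:
            groups["unparsed"].append(raw.strip())
            continue
        site = f"{PurePosixPath(match['path']).name}:{match['line']} {match['func']}"
        groups[site].append(match["msg"])
    return dict(groups)


sample_log = [
    "/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Attention",
    "/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Gelu",
    "/code/AMDMIGraphX/src/onnx/checks.cpp:35: check_arg_empty: PARSE_RANGE: limit arg dynamic shape is not supported",
    "std::bad_alloc for oliverguhr_wav2vec2-large-xlsr-53-german-cv9_blob_main_onnx_model.onnx",
]
groups = group_errors(sample_log)
for site, messages in groups.items():
    print(site, len(messages))
```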

GPU specific

/code/AMDMIGraphX/src/targets/gpu/compile_hip.cpp:171: compile: hiprtc: HIPRTC_ERROR_COMPILATION: Compilation failed.
/code/AMDMIGraphX/src/targets/gpu/compile_ops.cpp:157: benchmark: No configs to tune
/code/AMDMIGraphX/src/targets/gpu/gemm_impl.cpp:126: operator(): rocblas_invoke: rocBLAS call failed with status 4
std::bad_alloc for oliverguhr_wav2vec2-large-xlsr-53-german-cv9_blob_main_onnx_model.onnx

attila-dusnoki-htec commented 9 months ago

MIGraphX compilation success

Note: the "/" characters were replaced with "_" in model paths.
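The path flattening mentioned in the note is presumably a plain character replacement, along the lines of the sketch below. The function name is hypothetical, and the input path is a guess at what produced one of the logged names. The original cannot be recovered with certainty, because an underscore that was already in the path looks the same as a replaced slash.

```python
def flatten_model_path(hf_path):
    """Replace every "/" with "_" so the path can be used as a single file name."""
    return hf_path.replace("/", "_")


# Assumed original path; only the flattened form appears in the logs.
flat = flatten_model_path(
    "oliverguhr/wav2vec2-large-xlsr-53-german-cv9/blob/main/onnx/model.onnx"
)
print(flat)
```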

Compilation

The following models compiled successfully with both fp32 and fp16 (except one)

197 unique HF model repos, containing 324 ONNX models

Only FP32

FP32 and FP16

attila-dusnoki-htec commented 9 months ago

MIGraphX verify results (OUTDATED)

A number of FP32 runs fail with "Memory access fault by GPU node-2 (Agent handle: ...) on address 0x... Reason: Unknown.", while the same models do not fail with FP16. Others fail with both. Those have to be re-tested.

Success

Note: there are cases where both ref and target were zeros, which should be rechecked.
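A verify run where both outputs are all zeros passes trivially, so such cases could be flagged automatically for rechecking. This is a minimal sketch that assumes the reference and target outputs are available as flat sequences of numbers; the function name is made up for the example.

```python
def both_all_zero(ref, target):
    """True when every element of both flat sequences is exactly zero."""
    return all(x == 0 for x in ref) and all(x == 0 for x in target)


suspicious = both_all_zero([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
genuine = both_all_zero([0.0, 0.25], [0.0, 0.25])
print(suspicious, genuine)
```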

FP32

FP16

Fails

FP32

Segmentation fault

530 model verify runs failed due to a segmentation fault (the 530 counts both fp32 and fp16 runs). These models probably have integer inputs that are not handled properly.

From a quick filtering, these are the possible input params that have to be handled (e.g. with --fill1):

attention_mask = @param:attention_mask -> int64_type, {1, 1024}, {1024, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 1}, {1, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 2048}, {2048, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 512}, {512, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 9}, {9, 1}, target_id=0
bbox = @param:bbox -> int64_type, {1, 1, 4}, {4, 4, 1}, target_id=0
causal_mask = @param:causal_mask -> int64_type, {1, 1, 1, 1024}, {1024, 1024, 1024, 1}, target_id=0
causal_mask = @param:causal_mask -> int64_type, {1, 1, 1, 2048}, {2048, 2048, 2048, 1}, target_id=0
causal_mask = @param:causal_mask -> int64_type, {1, 1, 1, 512}, {512, 512, 512, 1}, target_id=0
decoder_input_ids = @param:decoder_input_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
encoder_attention_mask = @param:encoder_attention_mask -> int64_type, {1, 1}, {1, 1}, target_id=0
encoding_indices = @param:encoding_indices -> int64_type, {1, 8, 16}, {128, 16, 1}, target_id=0
hashed_ids = @param:hashed_ids -> int64_type, {1, 1, 8}, {8, 8, 1}, target_id=0
input_ids = @param:input_ids -> int32_type, {1, 1}, {1, 1}, target_id=0
input_ids = @param:input_ids -> int32_type, {1, 77}, {77, 1}, target_id=0
input_ids = @param:input_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
input_ids = @param:input_ids -> int64_type, {1, 9}, {9, 1}, target_id=0
past_sequence_length = @param:past_sequence_length -> int32_type, {1}, {0}, target_id=0
position_ids = @param:position_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
positions = @param:positions -> int64_type, {1, 1}, {1, 1}, target_id=0
text = @param:text -> int32_type, {1, 77}, {77, 1}, target_id=0
text = @param:text -> int64_type, {1}, {1}, target_id=0
token_type_ids = @param:token_type_ids -> int64_type, {1, 1}, {1, 1}, target_id=0

Note: They occurred 4323 times, which is a lot.
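The parameter lines above are regular enough to parse, which gives a per-model list of integer inputs that may need special filling (for example via the --fill1 option mentioned earlier). The sketch below assumes the exact line layout shown in the list; the function names are illustrative.

```python
import re

# name = @param:name -> type, {dims}, {strides}, target_id=0
PARAM_LINE = re.compile(
    r"^(?P<name>\w+) = @param:\w+ -> (?P<type>\w+), "
    r"\{(?P<dims>[^}]*)\}, \{(?P<strides>[^}]*)\}"
)


def parse_param(line):
    """Return (name, type, dims) for a parameter line, or None if it does not match."""
    match = PARAM_LINE.match(line.strip())
    if match is None:
        return None
    dims = tuple(int(d) for d in match["dims"].split(",") if d.strip())
    return match["name"], match["type"], dims


def integer_params(lines):
    """Map each integer-typed parameter name to the set of (type, dims) seen for it."""
    shapes = {}
    for line in lines:
        parsed = parse_param(line)
        if parsed and parsed[1].startswith("int"):
            name, dtype, dims = parsed
            shapes.setdefault(name, set()).add((dtype, dims))
    return shapes


sample = [
    "attention_mask = @param:attention_mask -> int64_type, {1, 1024}, {1024, 1}, target_id=0",
    "input_ids = @param:input_ids -> int32_type, {1, 77}, {77, 1}, target_id=0",
    "input_ids = @param:input_ids -> int64_type, {1, 1}, {1, 1}, target_id=0",
    "past_sequence_length = @param:past_sequence_length -> int32_type, {1}, {0}, target_id=0",
]
params = integer_params(sample)
print(sorted(params))
```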

attila-dusnoki-htec commented 9 months ago

MIGraphX verify results v2

This round, we tested 319 models (638 verify runs: fp32 + fp16).

attila-dusnoki-htec commented 8 months ago

All issues have been reported and have dedicated tracking issues. Closing this.