attila-dusnoki-htec closed this issue 8 months ago.
TODO: update the script to download weights if they are stored separately, e.g. model.onnx having a companion model.onnx_data or shared.weight, etc.
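A minimal sketch of how the companion-weight lookup could work, based only on the naming conventions visible in the error log below (the `candidate_weight_files` helper and the pattern list are assumptions, not part of the existing script):

```python
from pathlib import PurePosixPath

# Naming conventions for external weight files observed in the error log below.
# Given "model.onnx", the separate weights may live in e.g. "model.onnx_data",
# "model.onnx.data", "model.data", or a fixed name such as "weights.pb".
SUFFIX_PATTERNS = ["{stem}.onnx_data", "{stem}.onnx.data", "{stem}.data"]
FIXED_NAMES = {"weights.pb", "shared.weight"}

def candidate_weight_files(onnx_name: str, repo_files: list[str]) -> list[str]:
    """Return files from `repo_files` that look like external weights for `onnx_name`."""
    stem = PurePosixPath(onnx_name).name.removesuffix(".onnx")
    wanted = {p.format(stem=stem) for p in SUFFIX_PATTERNS} | FIXED_NAMES
    return [f for f in repo_files if PurePosixPath(f).name in wanted]
```

The matched names would then be downloaded next to the .onnx file (e.g. via huggingface_hub) before compiling, so the parser can resolve the external data references.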
The first dump of issues, covering ~640 models from HF.
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: ./weights.pb
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model_merged.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model_merged_quantized.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_model_optimized.onnx.data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_with_past_model.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: decoder_with_past_model_optimized.onnx.data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: encoder_model.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.onnx.data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: model_optimized.onnx.data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: openvino_model.onnx_data
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: shared.weight
/code/AMDMIGraphX/src/file_buffer.cpp:51: generic_read_file: Error reading file: transformer.embeddings.word_embeddings.weight
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: openskyml_overall-v1_blob_main_overall-v1.onnx
/code/AMDMIGraphX/src/include/migraphx/check_shapes.hpp:220: same_dims: add: Dimensions do not match
/code/AMDMIGraphX/src/include/migraphx/check_shapes.hpp:220: same_dims: dequantizelinear: Dimensions do not match
/code/AMDMIGraphX/src/include/migraphx/op/concat.hpp:98: normalize_compute_shape: CONCAT: all input dimensions should match along axis 2
/code/AMDMIGraphX/src/include/migraphx/op/convolution.hpp:100: normalize_compute_shape: CONVOLUTION: mismatched channel numbers
/code/AMDMIGraphX/src/include/migraphx/op/multibroadcast.hpp:99: compute_shape: MULTIBROADCAST: input shape {1, 0} cannot be broadcasted to {1, 1}!
/code/AMDMIGraphX/src/include/migraphx/op/quant_convolution.hpp:96: normalize_compute_shape: QUANT_CONVOLUTION: only accept input and weights of type int8_t or fp8e4m3fnuz_type
/code/AMDMIGraphX/src/onnx/checks.cpp:35: check_arg_empty: PARSE_RANGE: limit arg dynamic shape is not supported
/code/AMDMIGraphX/src/onnx/checks.cpp:35: check_arg_empty: PARSE_RANGE: start arg dynamic shape is not supported
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:407: parse_graph: PARSE_GRAPH: invalid onnx file. Input "514" is unavailable due to unordered nodes!
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:407: parse_graph: PARSE_GRAPH: invalid onnx file. Input "ResNet::input_0_transpose" is unavailable due to unordered nodes!
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Attention
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: BiasGelu
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Einsum
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: EmbedLayerNormalization
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: FastGelu
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: Gelu
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: QAttention
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:419: parse_graph: Unknown operator: SkipLayerNormalization
/code/AMDMIGraphX/src/onnx/onnx_parser.cpp:640: get_type: Prototensor data type 8 not supported
/code/AMDMIGraphX/src/onnx/parse_if.cpp:92: parse: PARSE_IF: If_* then and else sub_grahps must have same output shapes!
/code/AMDMIGraphX/src/onnx/parse_pad.cpp:182: parse_constant_value: PARSE_PAD: `value` should contain only one element
/code/AMDMIGraphX/src/targets/gpu/compile_hip.cpp:171: compile: hiprtc: HIPRTC_ERROR_COMPILATION: Compilation failed.
/code/AMDMIGraphX/src/targets/gpu/compile_ops.cpp:157: benchmark: No configs to tune
/code/AMDMIGraphX/src/targets/gpu/gemm_impl.cpp:126: operator(): rocblas_invoke: rocBLAS call failed with status 4
std::bad_alloc for oliverguhr_wav2vec2-large-xlsr-53-german-cv9_blob_main_onnx_model.onnx
Note: the "/" characters were replaced with "_" in model paths.
The following models compiled successfully with both fp32 and fp16 (except one): 197 unique HF model repos, containing 324 onnx models.
There are a bunch of models that fail in FP32 with Memory access fault by GPU node-2 (Agent handle: ...) on address 0x... Reason: Unknown. but not in FP16; others fail with both. Those have to be re-tested.
Note: there are cases where both the ref and target outputs were all zeros, which should be rechecked.
530 model verify runs failed due to Segmentation fault (both fp32 and fp16 counted in the 530). These probably have integer inputs that are not handled properly.
From a quick filtering, these are the input params that likely need special handling (e.g. with --fill1):
attention_mask = @param:attention_mask -> int64_type, {1, 1024}, {1024, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 1}, {1, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 2048}, {2048, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 512}, {512, 1}, target_id=0
attention_mask = @param:attention_mask -> int64_type, {1, 9}, {9, 1}, target_id=0
bbox = @param:bbox -> int64_type, {1, 1, 4}, {4, 4, 1}, target_id=0
causal_mask = @param:causal_mask -> int64_type, {1, 1, 1, 1024}, {1024, 1024, 1024, 1}, target_id=0
causal_mask = @param:causal_mask -> int64_type, {1, 1, 1, 2048}, {2048, 2048, 2048, 1}, target_id=0
causal_mask = @param:causal_mask -> int64_type, {1, 1, 1, 512}, {512, 512, 512, 1}, target_id=0
decoder_input_ids = @param:decoder_input_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
encoder_attention_mask = @param:encoder_attention_mask -> int64_type, {1, 1}, {1, 1}, target_id=0
encoding_indices = @param:encoding_indices -> int64_type, {1, 8, 16}, {128, 16, 1}, target_id=0
hashed_ids = @param:hashed_ids -> int64_type, {1, 1, 8}, {8, 8, 1}, target_id=0
input_ids = @param:input_ids -> int32_type, {1, 1}, {1, 1}, target_id=0
input_ids = @param:input_ids -> int32_type, {1, 77}, {77, 1}, target_id=0
input_ids = @param:input_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
input_ids = @param:input_ids -> int64_type, {1, 9}, {9, 1}, target_id=0
past_sequence_length = @param:past_sequence_length -> int32_type, {1}, {0}, target_id=0
position_ids = @param:position_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
positions = @param:positions -> int64_type, {1, 1}, {1, 1}, target_id=0
text = @param:text -> int32_type, {1, 77}, {77, 1}, target_id=0
text = @param:text -> int64_type, {1}, {1}, target_id=0
token_type_ids = @param:token_type_ids -> int64_type, {1, 1}, {1, 1}, target_id=0
Note: they occurred 4323 times in total, which is a lot.
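A rough sketch of how the integer-typed params above could be pulled out of a parameter dump like the one shown, so they can be fed to --fill1; the regex and the `integer_params` helper are assumptions for illustration, not part of the test script:

```python
import re

# Matches dump lines like:
#   input_ids = @param:input_ids -> int64_type, {1, 9}, {9, 1}, target_id=0
PARAM_RE = re.compile(r"@param:(\w+) -> (int32|int64)_type")

def integer_params(dump: str) -> list[str]:
    """Collect unique integer-typed parameter names, preserving first-seen order."""
    seen = []
    for name, _ in PARAM_RE.findall(dump):
        if name not in seen:
            seen.append(name)
    return seen
```

Each collected name could then be passed along to the verify run (e.g. `--fill1 input_ids`) so integer inputs are filled with ones instead of random values.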
This round, we tested 319 models (638 verify runs, fp32 + fp16):
- 439 runs completed correctly (fp32 + fp16):
  - fp32: 204 passed, 16 failed verify
  - fp16: 176 passed, 43 failed verify
- 199 runs had compile issues
- 1 model differed between precisions, where only fp16 fails with the "no config" issue (csarron/mobilebert-uncased-squad-v2)

All issues have been reported and have dedicated tracking issues. Closing this.
There are a bunch of models on Hugging Face that would be good to test for whether they compile and are accurate.
The most downloaded onnx models would be a good start: https://huggingface.co/models?library=onnx&sort=downloads
We can use optimum-cli to help with creating the onnx files. Also, the ONNX Model Zoo got updated with a bunch of new models that are already in onnx format, which might overlap with this.
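A small sketch of how the export step could be scripted, assuming the `optimum-cli export onnx --model <id> <output>` invocation; the two helpers are hypothetical names, and the "/" -> "_" sanitization mirrors the path convention used in the reports above:

```python
def export_command(model_id: str, output_dir: str) -> list[str]:
    """Build an `optimum-cli export onnx` invocation for a HF repo id.

    Assumed CLI shape; could be executed with subprocess.run(cmd, check=True).
    """
    return ["optimum-cli", "export", "onnx", "--model", model_id, output_dir]

def sanitize(model_id: str) -> str:
    """Mirror the reports' convention: "/" in repo ids replaced with "_"."""
    return model_id.replace("/", "_")
```

For example, `export_command("csarron/mobilebert-uncased-squad-v2", sanitize("csarron/mobilebert-uncased-squad-v2"))` would export that repo into a flat local directory name.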