migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

Verify model support #176

Closed: attila-dusnoki-htec closed this issue 3 months ago

attila-dusnoki-htec commented 4 months ago

Run driver verify for the following models:

attila-dusnoki-htec commented 4 months ago

GPT-J

Note: seq_len 64 is used because of memory constraints; use a larger value (up to 2048) if you can.

Download model:

optimum-cli export onnx --model EleutherAI/gpt-j-6B --sequence_length 64 --batch_size 1 --task text-generation --no-dynamic-axes gpt-j-6b/

Verify (fp32)

/code/AMDMIGraphX/build/bin/driver verify gpt-j-6b/model.onnx --fill1 input_ids --fill1 position_ids --fill1 attention_mask

Full log: gptj6b_fp32_verify.log

Verify (fp16)

/code/AMDMIGraphX/build/bin/driver verify gpt-j-6b/model.onnx --fill1 input_ids --fill1 position_ids --fill1 attention_mask --fp16

Full log: gptj6b_fp16_verify.log
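
The fp32 and fp16 runs differ only in the trailing `--fp16` flag, so both can be driven from one small wrapper. The sketch below is a convenience, not a record of how the runs above were made: the `verify_both` function name, the `DRIVER` variable, and redirecting output into the log files are assumptions. The driver path, the `verify` subcommand, and the `--fill1` and `--fp16` flags are taken from the commands above.

```shell
# Hypothetical helper: run `driver verify` in fp32, then in fp16, saving each log.
# DRIVER defaults to the path used in this thread; override it for other builds.
DRIVER="${DRIVER:-/code/AMDMIGraphX/build/bin/driver}"

# Usage: verify_both <log-prefix> <model.onnx> [input-name ...]
# The body runs in a subshell so its variables do not leak into the caller.
verify_both() (
    name="$1"
    model="$2"
    shift 2

    # Turn each remaining argument into a `--fill1 <input-name>` pair.
    fill_args=""
    for input in "$@"; do
        fill_args="$fill_args --fill1 $input"
    done

    # fp32 run ($fill_args is left unquoted on purpose so it splits into words).
    "$DRIVER" verify "$model" $fill_args > "${name}_fp32_verify.log" 2>&1

    # fp16 run: same command with --fp16 appended.
    "$DRIVER" verify "$model" $fill_args --fp16 > "${name}_fp16_verify.log" 2>&1
)
```

For the GPT-J case this would be called as `verify_both gptj6b gpt-j-6b/model.onnx input_ids position_ids attention_mask`, which writes `gptj6b_fp32_verify.log` and `gptj6b_fp16_verify.log` in the current directory.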

attila-dusnoki-htec commented 4 months ago

Llamav2-7B

Note: meta-llama/Llama-2-7b-hf requires access.

Note: seq_len 64 is used because of memory constraints; use a larger value (up to 2048) if you can.

Download model:

optimum-cli export onnx --model meta-llama/Llama-2-7b-hf --sequence_length 64 --batch_size 1 --no-dynamic-axes --task text-generation llama2-7b/

Verify (fp32)

/code/AMDMIGraphX/build/bin/driver verify llama2-7b/model.onnx --fill1 input_ids --fill1 position_ids --fill1 attention_mask

Full log: llamav2_7b_fp32_verify.log

Verify (fp16)

/code/AMDMIGraphX/build/bin/driver verify llama2-7b/model.onnx --fill1 input_ids --fill1 position_ids --fill1 attention_mask --fp16

~Full log: llamav2_7b_fp16_verify.log~

~> Note: It fails! Known issue, will be fixed with rmsnorm fp32 propagation~ Fixed on latest develop

Full log: llamav2_7b_fp16_verify.v2.log

attila-dusnoki-htec commented 4 months ago

Stable Diffusion 2.1

Note: Tracking issue #2555

Download models:

optimum-cli export onnx --model stabilityai/stable-diffusion-2-1 --batch_size 1 --sequence_length 64 --no-dynamic-axes --task stable-diffusion sd21/

Verify (fp32)

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/text_encoder/model.onnx --fill1 input_ids

Full log: sd21_text_encoder_fp32_verify.log

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/unet/model.onnx --fill1 timestep

Full log: sd21_unet_fp32_verify.log

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/vae_decoder/model.onnx

Full log: sd21_vae_decoder_fp32_verify.log

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/vae_encoder/model.onnx

RMS Error: 0.0024192
Max diff: 4.92969
Mismatch at 0: -7.89602 != -7.63092

Full log: sd21_vae_encoder_fp32_verify.log
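
The summary lines above have a fixed `label: value` shape, so the headline numbers can be pulled out of a log without reading it in full. A minimal sketch, assuming every log uses the same `RMS Error:` and `Max diff:` labels shown in this excerpt; the `verify_metrics` name is hypothetical.

```shell
# Hypothetical helper: print one "<rms-error> <max-diff>" line for each
# RMS Error / Max diff pair found in a verify log.
verify_metrics() {
    awk -F': ' '
        /RMS Error:/ { rms = $2 }        # remember the most recent RMS error
        /Max diff:/  { print rms, $2 }   # emit the pair once the max diff appears
    ' "$1"
}
```

Run as `verify_metrics sd21_vae_encoder_fp32_verify.log`. The values are printed as they appear in the log, so a `nan` entry is passed through unchanged.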

Verify (fp16)

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/text_encoder/model.onnx --fill1 input_ids --fp16

Full log: sd21_text_encoder_fp16_verify.log

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/unet/model.onnx --fill1 timestep --fp16

RMS Error: nan
Max diff: nan
Mismatch at 0: -0.065396 != nan
Non finite number found in target at 0: nan

Full log: sd21_unet_fp16_verify.log

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/vae_decoder/model.onnx --fp16

RMS Error: nan
Max diff: nan
Mismatch at 0: -0.0455025 != nan
Non finite number found in target at 0: nan

Full log: sd21_vae_decoder_fp16_verify.log

/code/AMDMIGraphX/build/bin/driver verify tmp/sd21/vae_encoder/model.onnx --fp16

Full log: sd21_vae_encoder_fp16_verify.log
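
The Stable Diffusion results in this comment fall into three groups: runs with no reported mismatch, runs with a finite mismatch, and runs whose target output is `nan`. A rough triage sketch follows, assuming that a log with neither a `Mismatch at` line nor a `Non finite number found` line can be treated as clean. That assumption comes from the excerpts in this thread, not from the driver's documentation, and the `triage_log` name is hypothetical.

```shell
# Hypothetical helper: classify a verify log as "nan", "mismatch", or "clean".
triage_log() {
    if grep -q 'Non finite number found' "$1"; then
        # The target produced nan or inf somewhere in its output.
        echo "nan"
    elif grep -q 'Mismatch at' "$1"; then
        # Outputs are finite but differ between reference and target.
        echo "mismatch"
    else
        # Neither marker present; treated as clean under the stated assumption.
        echo "clean"
    fi
}
```

Looping over the logs, for example `for f in sd21_*_verify.log; do echo "$f: $(triage_log "$f")"; done`, gives a one-line status per model and precision.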

attila-dusnoki-htec commented 3 months ago

BERT

Download model:

optimum-cli export onnx --model google-bert/bert-large-uncased --batch_size 1 --sequence_length 384 --no-dynamic-axes bert-large/

Verify (fp32)

/code/AMDMIGraphX/build/bin/driver verify bert-large/model.onnx --fill1 token_type_ids --fill1 attention_mask --fill1 input_ids

Full log: bert_large_fp32_verify.log

Verify (fp16)

/code/AMDMIGraphX/build/bin/driver verify bert-large/model.onnx --fill1 token_type_ids --fill1 attention_mask --fill1 input_ids --fp16

Full log: bert_large_fp16_verify.log

attila-dusnoki-htec commented 3 months ago

ResNet50

Download model:

optimum-cli export onnx --model microsoft/resnet-50 --batch_size 1 --no-dynamic-axes resnet50/

Verify (fp32)

/code/AMDMIGraphX/build/bin/driver verify resnet50/model.onnx

Full log: resnet50_fp32_verify.log

Verify (fp16)

/code/AMDMIGraphX/build/bin/driver verify resnet50/model.onnx --fp16

Full log: resnet50_fp16_verify.log