google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

Error occurs when exporting tinyllama to tflite with num_layers=1 #314

Open hayyaw opened 1 week ago

hayyaw commented 1 week ago

Description of the bug:

```
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/home/wangzhiqun/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/convert_to_tflite.py", line 80, in <module>
    app.run(main)
  File "/home/wangzhiqun/miniconda3/envs/torch2tflite_3.11/lib/python3.11/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/wangzhiqun/miniconda3/envs/torch2tflite_3.11/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/home/wangzhiqun/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/convert_to_tflite.py", line 65, in main
    pytorch_model = tiny_llama.build_model(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangzhiqun/ai-edge-torch/ai_edge_torch/generative/examples/tiny_llama/tiny_llama.py", line 79, in build_model
    return model_builder.build_decoder_only_model(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wangzhiqun/ai-edge-torch/ai_edge_torch/generative/utilities/model_builder.py", line 137, in build_decoder_only_model
    loader.load(
  File "/home/wangzhiqun/ai-edge-torch/ai_edge_torch/generative/utilities/loader.py", line 188, in load
    raise ValueError(
ValueError: Failed to map all tensor. Remaing tensor are: ['model.layers.1.input_layernorm.weight', 'model.layers.1.mlp.down_proj.weight', 'model.layers.1.mlp.gate_proj.weight', 'model.layers.1.mlp.up_proj.weight', .........
```
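The failure mode in this traceback can be illustrated with a minimal, hypothetical simulation of a strict weight loader (this is invented code, not ai-edge-torch's actual loader): when the model is built with `num_layers=1` but the checkpoint contains 22 layers, the tensors for layers 1..21 can never be mapped, so the loader raises.

```python
import re


def strict_load(checkpoint_names, num_layers):
    """Simulate a loader that must map every checkpoint tensor.

    Tensors belonging to layers >= num_layers have no destination
    in the smaller model, so they remain unmapped and trigger a
    ValueError, mirroring the error in the traceback above.
    """
    remaining = [
        name for name in checkpoint_names
        if (m := re.match(r"model\.layers\.(\d+)\.", name))
        and int(m.group(1)) >= num_layers
    ]
    if remaining:
        raise ValueError(f"Failed to map all tensors. Remaining: {remaining}")


# A 22-layer checkpoint vs. a model built with num_layers=1:
ckpt = [f"model.layers.{i}.mlp.down_proj.weight" for i in range(22)]
try:
    strict_load(ckpt, num_layers=1)
except ValueError as e:
    print(e)  # tensors for layers 1..21 are left over
```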

Actual vs expected behavior:

On the main branch, at commit ddb7bf76d5343787cb4ad2780a5f194bf5b646fd:

```shell
cd ai_edge_torch/generative/examples/tiny_llama
```

With `num_layers` changed to 1 in the model config:

```python
config = cfg.ModelConfig(
    vocab_size=32000,
    num_layers=1,
    max_seq_len=2048,
    embedding_dim=2048,
    kv_cache_max_len=kv_cache_max_len,
    block_configs=block_config,
    final_norm_config=norm_config,
    lm_head_share_weight_with_embedding=False,
    enable_hlfb=True,
)
```

```shell
python convert_to_tflite.py
```

Any other information you'd like to share?

No response

hayyaw commented 1 week ago

tinyllama model: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0

pkgoogle commented 1 week ago

Hi @hayyaw, I think it is failing because the checkpoint you are using has additional tensors (it contains the default 22 layers, while the model was built with only 1). Did you have a checkpoint in mind for 1 layer, or did you only want to port 1 of the layers?

hayyaw commented 1 week ago

I only want to port 1 of the layers. I remember it could be exported with 1 layer about 3 months ago, but after I updated to the latest main branch it fails. How can I export successfully with num_layers=1 using the TinyLlama model checkpoint (22 layers, https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)? Looking forward to your reply. Thanks. @pkgoogle

haozha111 commented 1 week ago

If you run the export with num_layers=1 while using the original 22-layer TinyLlama checkpoint, the export will fail because the checkpoint and the model don't match. You can always export a single-layer TinyLlama with random weights, though, by skipping the checkpoint-loading step.

pkgoogle commented 1 week ago

Hi @hayyaw, you can make it work as @haozha111 says, but you will get a poorly performing model with random weights... and if you only port 1 of the 22 layers, the result will probably be similar. Instead, you could construct your own 1-layer version, train/fine-tune it, and then convert it; that will behave better. You could also train post-conversion, but most users don't train on device. If you want to go that route, we can try.
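As a middle ground between random weights and training a 1-layer model from scratch, one could keep the real weights of the first layer by truncating the 22-layer checkpoint before loading. Below is a minimal sketch of such a filter (a hypothetical helper, not part of ai-edge-torch; in practice you would apply it to the loaded safetensors/torch state dict and save the result as a new checkpoint to point the converter at):

```python
import re


def truncate_checkpoint(state_dict, num_layers):
    """Keep only tensors for layers < num_layers, plus all non-layer tensors.

    The resulting dict matches a model built with the same num_layers,
    so a strict loader can map every remaining tensor.
    """
    pattern = re.compile(r"model\.layers\.(\d+)\.")
    out = {}
    for name, tensor in state_dict.items():
        m = pattern.match(name)
        if m is None or int(m.group(1)) < num_layers:
            out[name] = tensor
    return out


# Fake 22-layer checkpoint plus one non-layer tensor:
ckpt = {f"model.layers.{i}.mlp.up_proj.weight": i for i in range(22)}
ckpt["model.embed_tokens.weight"] = "emb"
small = truncate_checkpoint(ckpt, num_layers=1)
print(sorted(small))  # only layer 0 and the embedding remain
```

Note that a 1-layer model built from layer 0 of a 22-layer network still won't produce good text; this only gets the export pipeline past the tensor-mapping check with real rather than random weights.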

github-actions[bot] commented 3 days ago

Marking this issue as stale since it has been open for 7 days with no activity. This issue will be closed if no further activity occurs.