pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

Error Compiling SpinQuant in QnnBackend #6748

Open Vinaysukhesh98 opened 2 weeks ago

Vinaysukhesh98 commented 2 weeks ago

🐛 Describe the bug

```
Traceback (most recent call last):
  File "anaconda3/envs/et_qnn/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "anaconda3/envs/et_qnn/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "Downloads/executorch/examples/models/llama/export_llama.py", line 30, in <module>
    main()  # pragma: no cover
  File "Downloads/executorch/examples/models/llama/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File "Downloads/executorch/examples/models/llama/export_llama_lib.py", line 485, in export_llama
    builder = _export_llama(modelname, args)
  File "Downloads/executorch/examples/models/llama/export_llama_lib.py", line 591, in _export_llama
    _prepare_for_llama_export(modelname, args)
  File "Downloads/executorch/examples/models/llama/export_llama_lib.py", line 517, in _prepare_for_llama_export
    _load_llama_model(
  File "Downloads/executorch/examples/models/llama/export_llama_lib.py", line 812, in _load_llama_model
    model, example_inputs, example_kwarg_inputs, = EagerModelFactory.create_model(
  File "Downloads/executorch/examples/models/model_factory.py", line 44, in create_model
    model = model_class(**kwargs)
  File "Downloads/executorch/examples/models/llama/model.py", line 207, in __init__
    missing, unexpected = self.model.load_state_dict(
  File "anaconda3/envs/et_qnn/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2584, in load_state_dict
    raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for Transformer:
    While copying the parameter named "tok_embeddings.weight", whose dimensions in the model are torch.Size([128256, 3072]) and whose dimensions in the checkpoint are torch.Size([128256, 3072]), an exception occurred : ('Only Tensors of floating point and complex dtype can require gradients',).
```
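Note that the shapes match; the failure is about dtype. A minimal sketch of the underlying PyTorch behavior (assumption: the SpinQuant checkpoint contains an integer-dtype tensor where the eager model expects a floating-point parameter) — wrapping an integer tensor in `nn.Parameter` with the default `requires_grad=True` raises exactly this error:

```python
import torch
import torch.nn as nn

# nn.Parameter defaults to requires_grad=True, which PyTorch only
# allows for floating point and complex dtypes. An integer tensor
# therefore triggers the same RuntimeError seen in the traceback.
try:
    nn.Parameter(torch.zeros(4, 8, dtype=torch.int8))
except RuntimeError as e:
    print(e)
```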

Versions

```
$ wget https://raw.githubusercontent.com/pytorch/pytorch/main/torch/utils/collect_env.py
--2024-11-11 16:29:08--  https://raw.githubusercontent.com/pytorch/pytorch/main/torch/utils/collect_env.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8002::154, 2606:50c0:8001::154, 2606:50c0:8003::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8002::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 24366 (24K) [text/plain]
Saving to: ‘collect_env.py’

collect_env.py      100%[==========================>]  23.79K  --.-KB/s    in 0s

2024-11-11 16:29:09 (58.5 MB/s) - ‘collect_env.py’ saved [24366/24366]
```

JacobSzwejbka commented 2 weeks ago

cc @cccclai @helunwencser on QNN Spinquant

Vinaysukhesh98 commented 4 days ago

Hi @cccclai, did you find an answer for this error?

cccclai commented 1 day ago

Hi @Vinaysukhesh98, are you using the llama3 8b model following this instruction? https://pytorch.org/executorch/stable/llm/build-run-llama3-qualcomm-ai-engine-direct-backend.html

In the meantime, are you using the main branch or the stable branch?

Vinaysukhesh98 commented 7 hours ago

> Hi @Vinaysukhesh98 , are you using llama3 8b models following this instruction? https://pytorch.org/executorch/stable/llm/build-run-llama3-qualcomm-ai-engine-direct-backend.html
>
> In the meanwhile, are you using main branch or stable branch?

main branch

cccclai commented 6 hours ago

Can you try the release branch https://github.com/pytorch/executorch/tree/release/0.4? There may have been a recent change that broke this. The error message means some dtypes are mixed; a workaround is to use the `.to` syntax to convert the dtypes so they match.
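A minimal sketch of that workaround (the state dict below is a small stand-in for the real SpinQuant checkpoint, and `target_dtype` is an assumption about what the eager model is built in):

```python
import torch

# Stand-in for the loaded checkpoint; the real one would come from torch.load(...)
state_dict = {"tok_embeddings.weight": torch.zeros(8, 4, dtype=torch.bfloat16)}

# Assumption: the model's parameters are fp32, so cast every tensor entry
# with .to() before calling model.load_state_dict(state_dict).
target_dtype = torch.float32
state_dict = {
    k: v.to(target_dtype) if torch.is_tensor(v) else v
    for k, v in state_dict.items()
}
```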