pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

PTE export of Llama model #5751

Open Vinaysukhesh98 opened 1 month ago

Vinaysukhesh98 commented 1 month ago

🐛 Describe the bug

python -m examples.models.llama2.export_llama --checkpoint "${MODEL_DIR}/model.pth" -p "${MODEL_DIR}/original/params.json" -kv --disable_dynamic_shape --qnn --pt2e_quantize qnn_16a4w -d fp32 --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' --soc_model SM8650 --output_name="test.pte"

The command fails with a missing-module error:

ModuleNotFoundError: No module named 'executorch.backends.qualcomm.python'

Versions

python -m examples.models.llama2.export_llama --checkpoint "${MODEL_DIR}/model.pth" -p "${MODEL_DIR}/original/params.json" -kv --disable_dynamic_shape --qnn --pt2e_quantize qnn_16a4w -d fp32 --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' --soc_model SM8650 --output_name="test.pte"

/Documents/vinay/executorch/examples/models/llama2/model.py:102: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(checkpoint_path, map_location=device, mmap=True)
Traceback (most recent call last):
  File " /anaconda3/envs/et_qnn/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File " /anaconda3/envs/et_qnn/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File " /Documents/vinay/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
    main()  # pragma: no cover
  File " /Documents/vinay/executorch/examples/models/llama2/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File " /Documents/vinay/executorch/examples/models/llama2/export_llama_lib.py", line 449, in export_llama
    builder = _export_llama(modelname, args)
  File " /Documents/vinay/executorch/examples/models/llama2/export_llama_lib.py", line 549, in _export_llama
    _prepare_for_llama_export(modelname, args)
  File " /Documents/vinay/executorch/examples/models/llama2/export_llama_lib.py", line 505, in _prepare_for_llama_export
    .source_transform(_get_source_transforms(modelname, dtype_override, args))
  File " /Documents/vinay/executorch/examples/models/llama2/export_llama_lib.py", line 886, in _get_source_transforms
    from executorch.backends.qualcomm.utils.utils import (
  File " /anaconda3/envs/et_qnn/lib/python3.10/site-packages/executorch/backends/qualcomm/utils/utils.py", line 12, in <module>
    import executorch.backends.qualcomm.python.PyQnnManagerAdaptor as PyQnnManagerAdaptor
ModuleNotFoundError: No module named 'executorch.backends.qualcomm.python'
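A quick way to confirm what is missing before applying any fix (a diagnostic sketch, not from the thread; it assumes $EXECUTORCH_ROOT points at the ExecuTorch checkout):

ls $EXECUTORCH_ROOT/backends/qualcomm/python/PyQnnManagerAdaptor*.so  # should list a compiled binding
python -c "import executorch.backends.qualcomm.python.PyQnnManagerAdaptor"  # reproduces just the failing import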

iseeyuan commented 1 month ago

@cccclai, could you take a look at this?

cccclai commented 1 month ago

Did you run these two commands?

cp -f backends/qualcomm/PyQnnManagerAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
cp -f backends/qualcomm/PyQnnWrapperAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python

Full instructions are here: https://pytorch.org/executorch/main/build-run-qualcomm-ai-engine-direct-backend.html
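For context (paraphrasing the linked tutorial; directory names may differ by version): the two cp commands are meant to be run from the host build tree where the QNN backend was compiled, so that the freshly built bindings land in backends/qualcomm/python, the package path the failing import resolves. A sketch, assuming the tutorial's build-x86 directory:

cd $EXECUTORCH_ROOT/build-x86  # host build tree from the tutorial; the name is an assumption
cp -f backends/qualcomm/PyQnnManagerAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
cp -f backends/qualcomm/PyQnnWrapperAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
python -c "import executorch.backends.qualcomm.python.PyQnnManagerAdaptor"  # should now succeed

Note the .cpython-310 suffix: these bindings only load under Python 3.10, which matches the et_qnn environment in the traceback above.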

Vinaysukhesh98 commented 1 month ago


Could you help me with the Llama 3.2 model export command in int4 quantization format?
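For reference, in the QNN quantizer naming used above, 16a4w denotes 16-bit activations with 4-bit weights, so the export command at the top of this issue already requests 4-bit weight quantization. A sketch for Llama 3.2 under that assumption (paths, metadata, and output name are placeholders mirroring the original command; flags may differ in newer revisions):

python -m examples.models.llama2.export_llama \
    --checkpoint "${MODEL_DIR}/model.pth" \
    -p "${MODEL_DIR}/original/params.json" \
    -kv --disable_dynamic_shape \
    --qnn --pt2e_quantize qnn_16a4w \
    -d fp32 \
    --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}' \
    --soc_model SM8650 \
    --output_name="llama3_2_16a4w.pte"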