pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

Executorch exported model produces gibberish: stories15M --dtype fp32 --quantize '{"embedding": {"bitwidth": 4, "groupsize":32}, "linear:a8w4dq": {"groupsize" : 256}}' #3542

Open mikekgfb opened 6 months ago

mikekgfb commented 6 months ago

stories15M produces gibberish when exported with 4-bit embedding quantization plus a8w4dq linear quantization on macOS/ARM. Possibly an integration issue? (Concerning, since this is the workhorse configuration for mobile, which is predominantly ARM.)

https://github.com/pytorch/torchchat/actions/runs/8997932498/job/24717027755?pr=718 (at bottom)
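For reference, the `--quantize` argument from the issue title is a JSON blob mapping quantizer names to their options. A minimal sketch (plain Python; the interpretation comments reflect the a8w4dq naming, not a confirmed spec) of what the exporter receives after parsing:

```python
import json

# The exact --quantize value from the failing export (copied from the issue title).
quantize_arg = (
    '{"embedding": {"bitwidth": 4, "groupsize":32}, '
    '"linear:a8w4dq": {"groupsize" : 256}}'
)

config = json.loads(quantize_arg)

# Embedding tables: 4-bit weights with per-group scales over groups of 32.
assert config["embedding"] == {"bitwidth": 4, "groupsize": 32}

# Linear layers: a8w4dq (8-bit dynamic activations, 4-bit weights), groups of 256.
assert config["linear:a8w4dq"]["groupsize"] == 256

print(sorted(config))  # the two quantizer entries applied during export
```

The bug is that this particular combination exports cleanly but produces garbage tokens at runtime, while either quantizer on its own is fine.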

========================================

Average tokens/sec: 19.35
Memory used: 0.00 GB

mikekgfb commented 5 months ago

This is the same quantization sequence as https://github.com/pytorch/executorch/issues/3588, which segfaults.