Open malfet opened 4 months ago
python3 torchchat.py generate stories110M --dtype float16
This runs with --device fast, which translates into MPS. You might specify --device cpu instead. Also, if CPU is faster than MPS, we should drop MPS from the devices selected for device "fast".
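As a sketch of the suggestion above (the exact flag handling is assumed from the torchchat CLI, not verified here), pinning the device explicitly bypasses the "fast" resolution:

```shell
# Explicitly pin the device instead of relying on "fast" resolution,
# which picks MPS on Apple Silicon; compare against the MPS run.
python3 torchchat.py generate stories110M --dtype float16 --device cpu
```

Comparing the reported tok/sec of this run against the default (MPS) run would show whether "fast" is actually picking the faster backend on this machine.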
No, that was not the case until https://github.com/pytorch/torchchat/pull/694 landed, but let me clarify that.
why is the compile tok/sec lower than the eager tok/sec
torch==2.4.0.dev20240502
On an Apple M2 Pro I get the following numbers for stories110M + float16 dtype.
Commands to reproduce:
and for torchchat: