aleph65 opened this issue 1 year ago
Hey @aleph65,
Is this still an issue you're interested in solving? If so, you could try using the torch.onnx.dynamo_export()
API to attempt the export, and if that doesn't work, you could also try fake mode.
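For reference, a rough, untested sketch of what that could look like. It assumes PyTorch >= 2.1 (where torch.onnx.enable_fake_mode and ExportOptions(fake_context=...) are available) and that the checkpoint loads via transformers; the save keyword has been renamed across releases, so check the docs for your version:

```python
# Untested sketch, assuming PyTorch >= 2.1 and a local HF checkpoint directory.
import torch
import transformers

checkpoint = "WizardLM-30B-Uncensored"  # path used in this issue

tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)

# Build the model and example inputs under fake mode so no real weight
# memory is allocated while tracing. (If from_pretrained misbehaves under
# fake mode, constructing the model from its config is an alternative.)
with torch.onnx.enable_fake_mode() as fake_context:
    fake_model = transformers.AutoModelForCausalLM.from_pretrained(checkpoint)
    fake_inputs = tokenizer("Hello, world", return_tensors="pt")

export_options = torch.onnx.ExportOptions(fake_context=fake_context)
onnx_program = torch.onnx.dynamo_export(
    fake_model,
    fake_inputs["input_ids"],
    attention_mask=fake_inputs["attention_mask"],
    export_options=export_options,
)

# With fake tensors the real weights must be supplied at save time; this
# reloads them on CPU. The keyword was renamed in newer releases
# (model_state_dict -> model_state), so check the docs for your version.
real_state_dict = transformers.AutoModelForCausalLM.from_pretrained(
    checkpoint
).state_dict()
onnx_program.save("wizardlm-30b.onnx", model_state_dict=real_state_dict)
```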
Let me know if you have any success! Will otherwise close as stale in 2-3 weeks :)
🐛 Describe the bug
Unable to convert a 30B model to ONNX. I am using 4x A100s with 500GB of RAM and 2.5TB of storage, and I'm still running out of memory.
Here's the repro:
I believe this is reproducible in any container, but here are the container setup steps:
1) Create a container template on Runpod from winglian/axolotl-runpod:main-py3.9-cu118-2.0.0, then deploy 4x A100 in Secure Cloud, searching for the template you just created.
2) Once it loads, start the terminal.
3) Create fp16_to_onnx.py with vim and paste the following:
(To save and exit vim: Esc, then Shift+Z, Shift+Z.)
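The body of fp16_to_onnx.py isn't reproduced above. As a stand-in only (not the author's original script), a conversion script along these general lines, using the legacy torch.onnx.export path, is what this step refers to; the input names, opset, and output path here are assumptions:

```python
# fp16_to_onnx.py -- illustrative stand-in, not the original script from
# this issue: export a local fp16 Hugging Face causal-LM checkpoint to ONNX
# via the legacy TorchScript-based exporter.
import sys

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = sys.argv[1]  # e.g. WizardLM-30B-Uncensored

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16)
model.eval()

example = tokenizer("Hello, world", return_tensors="pt")

with torch.no_grad():
    torch.onnx.export(
        model,
        (example["input_ids"], example["attention_mask"]),
        f"{checkpoint}.onnx",
        input_names=["input_ids", "attention_mask"],
        output_names=["logits"],
        dynamic_axes={
            "input_ids": {0: "batch", 1: "sequence"},
            "attention_mask": {0: "batch", 1: "sequence"},
            "logits": {0: "batch", 1: "sequence"},
        },
        opset_version=17,
    )
```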
4) Now, run the conversion:
python fp16_to_onnx.py WizardLM-30B-Uncensored
This takes about 45 minutes, which already seems wrong; it should take around 5 minutes (gpt2 converts in about 30 seconds).
Then it will fail with this:
Can you please help unblock me? I have been trying to convert this model to ONNX for days already.
Many thanks
Versions