rmihaylov / mpttune

Tune MPTs
Apache License 2.0

mpt-7b and mpt-7b-instruct - fail to start in colab. #5

Open sidharthiimc opened 1 year ago

sidharthiimc commented 1 year ago

Only mpt-7b-storywriter-4bit worked in colab.

The other models don't get past loading. CPU RAM usage goes up and down, and then the process simply ends.

Note: Colab doesn't crash.

(screenshot of the Colab session attached)

rmihaylov commented 1 year ago

This repo is targeted towards 4-bit models like mpt-7b-storywriter-4bit, which require less RAM. The other models can be trained in 8-bit, and they require more RAM. If you are interested in training the other models in 4-bit, you could use HuggingFace's QLoRA. However, the 4-bit inference in this repo is 5-7 times faster; QLoRA inference is not as fast.
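
For reference, a minimal sketch of what 4-bit QLoRA fine-tuning of mpt-7b could look like with the HuggingFace stack (transformers + peft + bitsandbytes). This is not taken from this repo; the model id `mosaicml/mpt-7b` and the `target_modules` name for MPT's attention projection are assumptions you should verify against the model's code:

```python
# QLoRA sketch (assumption: standard transformers/peft/bitsandbytes APIs;
# the MPT target_modules name below is an assumption, not from this repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mosaicml/mpt-7b"  # or mosaicml/mpt-7b-instruct

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize base weights to 4-bit on load
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,                 # MPT ships custom modeling code
    device_map="auto",
)

model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["Wqkv"],                # assumed name of MPT's attention projection
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the LoRA adapters are trainable
```

With the base weights held in 4-bit and only small LoRA adapters trained, peak RAM/VRAM stays much lower than 8-bit training, which is why this route may fit in Colab where the 8-bit path does not.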

sidharthiimc commented 1 year ago

It would be great if you could add a Jupyter notebook version of this repo with all the parameters to play around with.