johnml1135 opened 1 year ago
@ddaspit do you have any insight into this? It could dramatically reduce the "10 step" build time from 6 minutes to 2 minutes.
I have no idea if this is possible. I am not aware of a way to do this in Huggingface or PyTorch. I think we would need to do more investigation to determine the exact cause of the long startup time.
This may be of help - https://github.com/huggingface/transformers/issues/21913.
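I'm not sure whether this is exactly what that issue covers, but one thing I've seen suggested for slow Hugging Face model loading is `low_cpu_mem_usage=True` on `from_pretrained`, which skips the redundant random weight initialization. A rough sketch of how we could measure whether that helps for the distilled NLLB model (just an illustration, not tested in our build):

```python
import time
import torch
from transformers import AutoModelForSeq2SeqLM

model_name = "facebook/nllb-200-distilled-600M"  # the 600MB distilled NLLB model

# Baseline load: materializes randomly initialized weights first, then copies
# the checkpoint weights over them.
start = time.time()
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
print(f"default load: {time.time() - start:.1f}s")

# low_cpu_mem_usage=True skips the redundant random initialization and loads
# the checkpoint weights directly, which is usually noticeably faster.
start = time.time()
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,  # optional: half precision also cuts load time and memory
)
print(f"low_cpu_mem_usage load: {time.time() - start:.1f}s")
```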
So it takes 4 minutes to build all the weights, even for the 600MB distilled model, on my RTX 3090. If I am correct (I may not be), we should be able to cache checkpoints at position 0 (i.e., before any training) for the NLLB models, which could dramatically reduce that startup time. That would be very helpful for debugging quick builds and for running E2E tests. I am unsure exactly what code change to make, but the idea would be something like:
`Seq2SeqTrainer`
I am unsure whether there would need to be a separate cached version for each project (undesirable) or if it could be one per NLLB model type. A rough sketch is below.
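Very roughly, this is the kind of thing I have in mind. The `CACHE_DIR` path and the `load_base_model` helper are made up for illustration, and I haven't verified that a locally saved position-0 checkpoint actually avoids the expensive part of startup rather than just the download:

```python
import os
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical cache location, keyed by NLLB model type rather than by project,
# so one cached copy could be shared by every build that uses the same base model.
CACHE_DIR = "/var/cache/nllb"
MODEL_NAME = "facebook/nllb-200-distilled-600M"

def load_base_model(model_name: str = MODEL_NAME):
    """Load the untrained (position 0) NLLB model, building the local cache on first use."""
    cached_path = os.path.join(CACHE_DIR, model_name.replace("/", "--"))
    if not os.path.isdir(cached_path):
        # First call: build the weights once and save them locally.
        model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model.save_pretrained(cached_path)
        tokenizer.save_pretrained(cached_path)
    else:
        # Subsequent calls: reload from the local copy instead of rebuilding.
        model = AutoModelForSeq2SeqLM.from_pretrained(cached_path, low_cpu_mem_usage=True)
        tokenizer = AutoTokenizer.from_pretrained(cached_path)
    return model, tokenizer

# The returned model would then be handed to Seq2SeqTrainer as usual for each project.
```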
I could be going about this wrong, but I saw some things that looked similar to these ideas, though nothing that was a slam dunk.