Open amritap-ef opened 8 months ago
Hi, you can refer to these docs and code examples:
For adding new, currently unsupported models:
For loading local Hugging Face checkpoints, you can pass the absolute directory path to the pipeline.
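A minimal sketch of the second point, assuming DeepSpeed-MII's non-persistent `mii.pipeline` API; the checkpoint directory and prompt below are hypothetical placeholders, and the small helper is just an illustration of "pass an absolute path for local checkpoints, a Hub name otherwise":

```python
import os


def resolve_checkpoint(path_or_name: str) -> str:
    """Return an absolute path if the argument is a local checkpoint
    directory; otherwise pass the Hub model name through unchanged."""
    if os.path.isdir(path_or_name):
        return os.path.abspath(path_or_name)
    return path_or_name


if __name__ == "__main__":
    import mii  # requires deepspeed-mii and a GPU environment

    # Hypothetical local fine-tuned checkpoint -- substitute your own.
    pipe = mii.pipeline(resolve_checkpoint("/data/checkpoints/my-finetuned-model"))
    print(pipe(["Hello, my name is"], max_new_tokens=32))
```

The same call accepts a Hugging Face Hub model ID in place of the directory path, so the helper lets one script handle both cases.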
Thanks for sending this through - apologies I didn't explain this very well.
What I was actually asking is whether there is a way to reduce the loading time for my own fine-tuned models from Hugging Face checkpoints, as I'm finding that loading the default models is much faster.
In particular, this PR https://github.com/microsoft/DeepSpeed/pull/4664 mentions adding the 'capability to snapshot an engine and resume from it', so I was wondering how I might save and load that engine to reduce the time taken to load a non-persistent pipeline the first time.
Hi,
I saw this pull request in the DeepSpeed library about snapshotting an engine so that large models can be loaded faster, but I couldn't find any documentation on it: https://github.com/microsoft/DeepSpeed/pull/4664
How can I save and load inference checkpoints for my own model faster with DeepSpeed-FastGen?
04/03/24: Updated issue description to be clearer