OFA-Sys / ONE-PEACE

A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Apache License 2.0

Loading a local .pt file with from_pretrained fails because of a domino chain of errors. #51

Closed. gilkzxc closed this issue 1 month ago.

gilkzxc commented 7 months ago

Using "http://one-peace-shanghai.oss-accelerate.aliyuncs.com/one-peace.pt" with from_pretrained(), I failed to load the pretrained model; it just prints "Killed". So I traced deeper toward the location of the bug, which led me to try checkpoint_utils.load_model_ensemble_and_task(), and that also failed. Eventually I found that an AssertionError is raised at line 43 in setup_task(): "task is not None".
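Roughly, the path I traced looks like this (a sketch, with paths and arguments approximated; the `one_peace.models` import is how I understand the repo's layout, while `checkpoint_utils` is fairseq's loader):

```python
from fairseq import checkpoint_utils

# High-level entry point: just prints "Killed" with no traceback.
# from one_peace.models import from_pretrained
# model = from_pretrained("one-peace.pt")

# Lower-level fairseq call that from_pretrained goes through -- this also fails:
models, cfg, task = checkpoint_utils.load_model_ensemble_and_task(["one-peace.pt"])

# Inside it, fairseq calls tasks.setup_task(cfg.task), where the assertion
#     AssertionError: task is not None
# is raised (line 43 in my trace).
```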

logicwong commented 7 months ago

It might be because you haven't installed fairseq under this repo. Try:

```bash
pip uninstall fairseq
git clone https://github.com/OFA-Sys/ONE-PEACE
cd ONE-PEACE
pip install -r requirements.txt
```
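After that, it's worth double-checking which fairseq actually gets imported (just a quick diagnostic sketch; if the version or path looks unexpected, the wrong fairseq may still be installed):

```python
import fairseq

# Print the version and the install location of the fairseq that Python resolves.
# An unexpected path usually means pip is still picking up another install.
print(fairseq.__version__)
print(fairseq.__file__)
```
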
gilkzxc commented 7 months ago

But I did install fairseq in the same directory.

logicwong commented 7 months ago

Can you share your execution command?

gilkzxc commented 7 months ago

Running that command just shows "Killed". So I used the equivalent calls that from_pretrained() makes internally, then load_model_ensemble_and_task(), and so on.

logicwong commented 7 months ago

Could it be that there's not enough memory? Try this: `model = from_pretrained("one-peace.pt", device=device, dtype="float16")`
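A fuller sketch of that call would be something like the following (the import path follows the repo's usage examples as I recall them; adjust it if your entry point differs):

```python
import torch
from one_peace.models import from_pretrained  # adjust the import to match your setup

device = "cuda" if torch.cuda.is_available() else "cpu"

# float16 roughly halves the memory needed to hold the weights compared to float32.
model = from_pretrained("one-peace.pt", device=device, dtype="float16")
```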

gilkzxc commented 7 months ago

I tried dtype="float16" and "float8"; same result, "Killed".

gilkzxc commented 7 months ago

I even reinstalled omegaconf to be sure it's 2.0.6 as required.
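(A quick sanity check of the version that actually gets imported:)

```python
import omegaconf

# Should print 2.0.6; anything else means a different copy is shadowing the pinned one.
print(omegaconf.__version__)
```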

logicwong commented 7 months ago

Can you switch to a machine with more memory, and then try using `model = from_pretrained("one-peace.pt", device='cpu', dtype="float16")`? I haven't encountered this situation before.
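If it helps, you can also check how much RAM is free before loading; a bare "Killed" with no traceback is typically the Linux OOM killer. A small sketch using psutil (which is not part of this repo's requirements):

```python
import psutil

# "Killed" with no Python traceback usually means the OS terminated the process
# for running out of memory, so check headroom before loading the checkpoint.
available_gb = psutil.virtual_memory().available / 1024**3
print(f"Available RAM: {available_gb:.1f} GB")
```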