NVIDIA / Megatron-LM

Ongoing research training transformer models at scale
https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

How to load finetuned DPR model for MSDP preprocessing? #179

Closed singleheart closed 2 years ago

singleheart commented 2 years ago

Hi, I want to run the MSDP task, which requires a finetuned DPR model: https://github.com/NVIDIA/Megatron-LM/blob/9a8b89acd8f6ba096860170d0e30ddc0bc2bacd4/examples/msdp/data_processing.sh#L49

ParlAI provides several DPR models finetuned on the Wizard of Wikipedia dataset (https://parl.ai/docs/zoo.html#wizard-of-wikipedia-models).

I've downloaded some models from there, but preprocessing.py cannot load them. It seems ParlAI does not ship a config file with its models, only checkpoint files. For example, https://parl.ai/docs/zoo.html#multiset-dpr-model contains only a checkpoint file. How can I load it from Megatron-LM? The loading code at https://github.com/NVIDIA/Megatron-LM/blob/9a8b89acd8f6ba096860170d0e30ddc0bc2bacd4/tasks/msdp/preprocessing.py#L391 just fails with error messages. Maybe a model class needs to be defined there.

Mrwangkedong commented 2 years ago

I know how to do it:

`from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

model = AutoModel.from_pretrained("facebook/dpr-question_encoder-single-nq-base")`
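For anyone who wants to check that the loaded encoder behaves as expected before wiring it into the MSDP scripts, here is a minimal sketch that embeds a few questions with it. The helper names `load_dpr` and `embed_questions` are made up for illustration and are not part of Megatron-LM or transformers. Whether preprocessing.py consumes the model in exactly this form is not confirmed in this thread, so treat it as a sanity check rather than the official loading path.

```python
"""Sketch: embed questions with the Hugging Face DPR question encoder."""

# Checkpoint name taken from the reply above.
DPR_NAME = "facebook/dpr-question_encoder-single-nq-base"


def load_dpr(name=DPR_NAME):
    """Fetch the tokenizer and encoder (hypothetical helper, needs network or a local cache)."""
    # Imported lazily so the heavy dependency is only needed when loading.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    model.eval()  # disable dropout for inference
    return tokenizer, model


def embed_questions(questions, tokenizer, model):
    """Return one embedding row per question (hypothetical helper)."""
    import torch

    # Pad to the longest question so the batch forms a single tensor.
    inputs = tokenizer(questions, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for inference
        outputs = model(**inputs)
    # The DPR question encoder exposes the question embedding as `pooler_output`.
    return outputs.pooler_output
```

Calling `tokenizer, model = load_dpr()` and then `embed_questions(["who wrote hamlet?"], tokenizer, model)` should return a tensor with one row per question. The first call downloads the weights, so it needs network access.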

singleheart commented 2 years ago

Thank you very much. It works!