AnswerDotAI / RAGatouille

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Apache License 2.0

ValueError: Default process group has not been initialized, please make sure to call init_process_group. #246

Open Owe1n opened 2 months ago

Owe1n commented 2 months ago

I'm running into this problem on a Mac M3 (Sonoma 15.5) with Python 3.10.14.

It happens when I try to run the simple fine-tuning example from the documentation:

from ragatouille import RAGTrainer
from ragatouille.utils import get_wikipedia_page
if __name__ == "__main__":
    pairs = [
        ("What is the meaning of life ?", "The meaning of life is 42"),
        ("What is Neural Search?", "Neural Search is a terms referring to a family of ..."),
        # You need many more pairs to train! Check the examples for more details!
        ...
    ]

    my_full_corpus = [get_wikipedia_page("Hayao_Miyazaki"), get_wikipedia_page("Studio_Ghibli")]

    trainer = RAGTrainer(
        model_name="MyFineTunedColBERT",
        pretrained_model_name="colbert-ir/colbertv2.0",  # In this example, we run fine-tuning
    )

    # This step handles all the data processing, check the examples for more details!
    trainer.prepare_training_data(
        raw_data=pairs,
        data_out_path="./data/",
        all_documents=my_full_corpus,
    )

    trainer.train(batch_size=32) # Train with the default hyperparams

It returns the following error:

    ValueError: Default process group has not been initialized, please make sure to call init_process_group.

It seems to be related to parallel (distributed) training, but I'm not sure.
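For context, this message comes from PyTorch's `torch.distributed` module, which raises it when distributed code runs before any process group has been created. Below is a minimal sketch of how one could check for a process group and create a single-process one with the CPU `gloo` backend. The helper names `single_process_dist_env` and `init_single_process_group` are hypothetical, the address and port are arbitrary local defaults, and whether calling this before `trainer.train(...)` resolves the RAGatouille error is an untested assumption.

```python
import os


def single_process_dist_env(port: int = 29500) -> dict:
    """Hypothetical helper: the variables torch.distributed's default
    env:// rendezvous reads, filled in for one local process."""
    return {
        "MASTER_ADDR": "127.0.0.1",  # assumption: local-only run
        "MASTER_PORT": str(port),    # assumption: this port is free
        "RANK": "0",
        "WORLD_SIZE": "1",
    }


def init_single_process_group() -> bool:
    """Hypothetical helper: create a one-process 'gloo' group if none exists.

    Returns True if a default process group is initialised afterwards,
    False if PyTorch (or its distributed package) is unavailable.
    """
    try:
        import torch.distributed as dist
    except ImportError:
        return False
    if not dist.is_available():
        return False
    if not dist.is_initialized():
        # Only set values that are not already present in the environment.
        for key, value in single_process_dist_env().items():
            os.environ.setdefault(key, value)
        dist.init_process_group(backend="gloo", rank=0, world_size=1)
    return dist.is_initialized()
```

If this is tried, `init_single_process_group()` would go inside the `if __name__ == "__main__":` block before the training call. It only makes sure a default process group exists; it does not change how the library schedules training.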