AnswerDotAI / RAGatouille

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Apache License 2.0

trainer.train returns None #185

Open salbatarni opened 6 months ago

salbatarni commented 6 months ago

Hello there,

In the source code, trainer.train is supposed to return the model path; however, it is returning None for me. Is there a way to save the trained model to a specific path? Or to skip saving it altogether and use it as a variable?

I have not changed anything in the example code:

```python
model_path = trainer.train(
    batch_size=32,
    nbits=4,                 # How many bits the trained model will use when compressing indexes
    maxsteps=500000,         # Hard stop after this many steps
    use_ib_negatives=True,   # Use in-batch negatives to calculate loss
    dim=128,                 # Dimensions per embedding. 128 is the default and works well.
    learning_rate=5e-6,      # Small values ([3e-6, 3e-5]) work best for BERT-like base models; 5e-6 is often the sweet spot
    doc_maxlen=256,          # Maximum document length. Because of how ColBERT works, smaller chunks (128-256) work very well.
    use_relu=False,          # Disable ReLU -- doesn't improve performance
    warmup_steps="auto",     # Defaults to 10% of total steps
)
```
bclavie commented 6 months ago

Hey, the None return value is a quirk/bug of the upstream ColBERT implementation. The model is saved to disk by default, and there's currently no way to disable this behaviour (though that would be a good change in the future!).

Passing a custom path is also slightly complex: you'd need to update both the root_path and experiments_path in the ColBERT config, which is again something that could be simplified, either here or upstream.
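In the meantime, since the model is always written to disk, one workaround is to locate the most recently saved checkpoint yourself after training finishes. A minimal sketch, assuming checkpoints land in `checkpoints/` subdirectories somewhere under your experiments root (the exact layout depends on your ColBERT config, and the `.ragatouille/` default used below is an assumption, not a guaranteed path):

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(experiments_root: str) -> Optional[Path]:
    """Return the most recently modified directory inside any 'checkpoints/'
    folder under experiments_root, or None if nothing has been saved yet."""
    root = Path(experiments_root)
    candidates = [p for p in root.rglob("checkpoints/*") if p.is_dir()]
    if not candidates:
        return None
    # Newest checkpoint by filesystem modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Usage would look something like `model_path = find_latest_checkpoint(".ragatouille/")` after `trainer.train(...)` returns. This is only a heuristic based on modification times; a proper fix in RAGatouille or upstream ColBERT returning the path directly would be better.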

Marking this as a good first issue and leaving it open as those would be welcome QoL changes, especially as I'm starting to work on improving training!