Open bdzyubak opened 7 months ago
BERT and two flavors of the same model, DistilBERT (lean) and RoBERTa (more advanced), have been implemented in the following branch: https://github.com/bdzyubak/torch-control/issues/16. Llama is WIP.
Need to timebox the implementation of the other models: not all are available through the transformers library, and they may not have an easy interface.
Fine-tune 3 (or more) popular models and compare their performance to DistilBERT on the movie sentiment analysis task.
Some choices: GPT-3, LaMDA, Turing-NLG, XGen, Llama 2 (7 billion), Gemini.
Pick based on suitability for the sentiment analysis task, popularity, and affordability of tuning.
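To make the comparison against DistilBERT concrete, here is a minimal sketch of a comparison harness in plain Python. It assumes binary sentiment labels (0 = negative, 1 = positive), which may not match the actual dataset, and it assumes each fine-tuned model's predictions on a shared test set are already available as lists. The names `EvalResult`, `evaluate`, and `compare_to_baseline` are hypothetical and not part of torch-control.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    model_name: str
    accuracy: float
    f1: float


def evaluate(model_name, y_true, y_pred, positive_label=1):
    """Compute accuracy and F1 for the positive class (binary labels assumed)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return EvalResult(model_name, accuracy, f1)


def compare_to_baseline(results, baseline_name="DistilBERT"):
    """Return (model_name, accuracy_delta, f1_delta) rows, best F1 gain first."""
    baseline = next((r for r in results if r.model_name == baseline_name), None)
    if baseline is None:
        raise ValueError(f"no result found for baseline {baseline_name!r}")

    rows = [
        (r.model_name, r.accuracy - baseline.accuracy, r.f1 - baseline.f1)
        for r in results
        if r.model_name != baseline_name
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Each candidate would be passed through `evaluate` with the same `y_true`, and `compare_to_baseline` then ranks them by F1 gain over DistilBERT. Affordability of tuning (GPU hours, memory) is not captured here and would need to be tracked separately.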