ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0
1.22k stars 91 forks source link

Create Google--GcpGPT #107

Open NeelamMandavia opened 9 months ago

NeelamMandavia commented 9 months ago

Creating a Google GCPGCP based on the model google/flan-t5-base in huggingface.co f you already know T5, FLAN-T5 is just better at everything. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks covering also more languages. As mentioned in the first few lines of the abstract :

Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints,1 which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.

Disclaimer: Content from this model card has been written by the Hugging Face team, and parts of it were copy pasted from the T5 model card.

NeelamMandavia commented 9 months ago

google GPT