JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.76k stars 703 forks source link

SPARKNLP-1006: Introducing OLMo #14242

Open prabod opened 2 months ago

prabod commented 2 months ago

Description

OLMo is a series of Open Language Models designed to enable the science of language models. The OLMo models are trained on the Dolma dataset. We release all code, checkpoints, logs (coming soon), and details involved in training these models.

Types of changes

Checklist: