SPARKNLP-1006: Introducing OLMo

Description

This PR introduces OLMo to Spark NLP. OLMo is a series of Open Language Models designed to enable the science of language models. The OLMo models are trained on the Dolma dataset. The OLMo authors release all code, checkpoints, logs (coming soon), and details involved in training these models.

Types of changes

[x] New feature (non-breaking change which adds functionality)

Checklist:

[x] My code follows the code style of this project.
[x] My change requires a change to the documentation.
[x] I have updated the documentation accordingly.
[x] I have read the CONTRIBUTING page.
[x] I have added tests to cover my changes.
[x] All new and existing tests passed.

JohnSnowLabs / spark-nlp

SPARKNLP-1006: Introducing OLMo #14242