Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, this technical report details the first release of OLMo, a state-of-the-art, truly Open Language Model and its framework to build and study the science of language modeling. Unlike most prior efforts that have only released model weights and inference code, we release OLMo and the whole framework, including training data and training and evaluation code. We hope this release will empower and strengthen the open research community and inspire a new wave of innovation.