We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval.
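For context on the pass@1 numbers quoted above: pass@k, as used for HumanEval-style evaluation, is the expected fraction of problems for which at least one of k sampled completions passes the unit tests. A minimal sketch of the standard unbiased estimator (following the Chen et al., 2021 convention; the function name and interface here are illustrative, not from the phi-1 paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of completions sampled for the problem
    c: number of those completions that pass the unit tests
    k: budget of attempts (k=1 for pass@1)
    """
    if n - c < k:
        # Too few failures to fill k draws without a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples, 101 correct -> pass@1 ~ 0.505
print(pass_at_k(n=200, c=101, k=1))
```

The per-problem estimates are then averaged over the benchmark (164 problems for HumanEval) to give the reported percentage.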