VatsaDev / nanoChatGPT

nanogpt turned into a chat model
MIT License
chat finetuning gpt-2 llm ml

nanoChatGPT

A barebones nanoGPT, fine-tuned on conversational data.

All updates are listed in updates.md.
Colab link
To view its capabilities, head to the Colab link. Run the git clone, pip install, and prepare.py steps, then run chat.py with --init_from=huggingface.

Features

how does it work?

This is a fork of nanoGPT, trained on data in a chatbot format like ChatGPT's. The format is inspired by oasst-pythia-12b:

<human> ... <endOfText>
<Bot> ... <endOfText>
<human> ... <endOfText>
<Bot> ... <endOfText>
<human> ... <endOfText>
<Bot> ... <endOfText>
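
As a rough illustration of the format above, here is a minimal sketch of turning a list of turns into that text layout and reading it back. The helper names `format_conversation` and `parse_conversation` are hypothetical and are not taken from this repo. The exact whitespace around the tags and the one-turn-per-line layout are also assumptions based only on the sample shown above.

```python
from typing import List, Tuple

# Tags as they appear in the sample above; spacing around them is an assumption.
END = "<endOfText>"
TAGS = {"human": "<human>", "bot": "<Bot>"}


def format_conversation(turns: List[Tuple[str, str]]) -> str:
    """Render (speaker, text) pairs as one tagged turn per line."""
    lines = []
    for speaker, text in turns:
        tag = TAGS[speaker]  # raises KeyError for an unknown speaker
        lines.append(f"{tag} {text.strip()} {END}")
    return "\n".join(lines)


def parse_conversation(blob: str) -> List[Tuple[str, str]]:
    """Recover (speaker, text) pairs from text in the layout above."""
    turns = []
    for line in blob.splitlines():
        line = line.strip()
        if not line:
            continue
        for speaker, tag in TAGS.items():
            if line.startswith(tag) and line.endswith(END):
                # Slice off the leading tag and the trailing end marker.
                text = line[len(tag):-len(END)].strip()
                turns.append((speaker, text))
                break
    return turns


if __name__ == "__main__":
    sample = [("human", "Hi there"), ("bot", "Hello!")]
    print(format_conversation(sample))
```

A single-line-per-turn layout keeps the sketch short. Multi-line messages would need a parser that splits on the end marker instead of on newlines.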

Problems / TODOs




Anyone who can contribute to the repo, please do so; any and all contributions are welcome. Simply adding a little to the dataset to expand it would be amazing.

Limitations

I did not make the data dumps/corpora that make up this data, and can't account for any biases, as the dataset itself is based on the conversations of real people who may or may not have had biases. The model is meant for academic research purposes, and isn't meant for any important or high-risk scenarios. Do not follow its advice.

what's in the data

For commercial purposes, just take the files input1.txt through input36.txt.

citations

@misc{zheng2023judging,
      title={Judging LLM-as-a-judge with MT-Bench and Chatbot Arena}, 
      author={Lianmin Zheng and Wei-Lin Chiang and Ying Sheng and Siyuan Zhuang and Zhanghao Wu and Yonghao Zhuang and Zi Lin and Zhuohan Li and Dacheng Li and Eric P. Xing and Hao Zhang and Joseph E. Gonzalez and Ion Stoica},
      year={2023},
      eprint={2306.05685},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}