karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License
35.94k stars 5.58k forks

Is this possible to produce text with few shot learning? #252

Open Yusuf-YENICERI opened 1 year ago

Yusuf-YENICERI commented 1 year ago

I trained a gpt model using this repo. I tried to produce text using few shot learning like the one below:

Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has been released 2 months ago.
Sentiment: Neutral
###
Message: The reactivity of your team has been amazing, thanks!
Sentiment:
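
For reference, a prompt like this can be assembled programmatically and handed to nanoGPT's `sample.py`, which accepts a `--start` string (or `FILE:path` to read the prompt from disk). A minimal sketch, reusing the examples above:

```python
# Build the few-shot sentiment prompt from (message, label) pairs and
# write it to a file that sample.py can read.
examples = [
    ("Support has been terrible for 2 weeks...", "Negative"),
    ("I love your API, it is simple and so fast!", "Positive"),
    ("GPT-J has been released 2 months ago.", "Neutral"),
]
query = "The reactivity of your team has been amazing, thanks!"

parts = [f"Message: {m}\nSentiment: {s}\n###" for m, s in examples]
parts.append(f"Message: {query}\nSentiment:")
prompt = "\n".join(parts)

with open("prompt.txt", "w") as f:
    f.write(prompt)
# Then, assuming a trained checkpoint in out/:
#   python sample.py --out_dir=out --start=FILE:prompt.txt
```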

The result I get isn't related to the prompt at all. Does this repo enable that feature, or is my model just bad?

karpathy commented 1 year ago

At the scale of nanoGPT, basically the answer is no. ICL (in-context learning) emerges a few billion parameters down the road.

Yusuf-YENICERI commented 1 year ago

Then may I ask: if I fine-tuned the GPT model I trained on a prompt-answer dataset, could I get a kind of ChatGPT-like model? The reason I want this is to have a model in my language that answers questions in some of the domains I care about.

Thanks for the reply.
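
One common way (not something this repo prescribes) to turn a prompt-answer dataset into plain training text is to concatenate the pairs with a separator the model can learn to emit, similar to the `###` used in the few-shot prompt earlier in the thread. A sketch, with illustrative field names:

```python
# Flatten prompt-answer pairs into one training text. The "Question:"/
# "Answer:" labels and the "###" separator are illustrative choices,
# not part of nanoGPT itself.
pairs = [
    {"prompt": "What is nanoGPT?", "answer": "A small GPT training repo."},
    {"prompt": "Who maintains it?", "answer": "Andrej Karpathy."},
]

def to_training_text(pairs, sep="\n###\n"):
    chunks = [f"Question: {p['prompt']}\nAnswer: {p['answer']}" for p in pairs]
    return sep.join(chunks) + sep  # trailing separator ends the last example

text = to_training_text(pairs)
# This text would then be tokenized by a data/<dataset>/prepare.py-style
# script before fine-tuning.
```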

C080 commented 1 year ago

Hi! Try loading the GPT-2 XL weights and fine-tuning on your prompt-answer dataset. It should be able to produce your desired output.
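
In nanoGPT terms, that means pointing `train.py` at the pretrained checkpoint via `init_from` instead of training from scratch. A sketch of a fine-tuning config file, modeled on `config/finetune_shakespeare.py` (the dataset name and hyperparameters below are placeholders):

```python
# Hypothetical fine-tuning config for train.py. The key line is
# init_from: it makes nanoGPT load OpenAI's GPT-2 XL weights
# (via Hugging Face) rather than initializing randomly.
init_from = 'gpt2-xl'
out_dir = 'out-prompt-answer'     # placeholder output directory
dataset = 'prompt_answer'         # placeholder: your prepared dataset
batch_size = 1
gradient_accumulation_steps = 32
learning_rate = 3e-5              # small LR, typical for fine-tuning
max_iters = 2000
```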

Yusuf-YENICERI commented 1 year ago

@C080 GPT-2 XL is trained on English, but I want it for my language, which is Turkish. Wouldn't that be a problem? Or will it work, just not perform well enough?

C080 commented 1 year ago

It could pick up Turkish if it was trained on a multilingual dataset with Turkish in it! Alternatively, try adding two layers of Google Translate, before and after, so all the reasoning happens in English!
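
That "translation sandwich" can be sketched as a tiny pipeline. Here `translate` and `generate` are injected stand-ins, hypothetical placeholders for a real translation client and for the model's sampler:

```python
def chat_via_english(prompt_tr, translate, generate):
    """Run a Turkish prompt through an English-only model by translating
    on the way in and on the way out.

    translate(text, src, dst) and generate(text) are caller-supplied,
    so any real translation API or sampling function can be plugged in.
    """
    prompt_en = translate(prompt_tr, src="tr", dst="en")  # Turkish -> English
    answer_en = generate(prompt_en)                       # English-only model
    return translate(answer_en, src="en", dst="tr")       # English -> Turkish
```

The indirection keeps the model code unaware of which translation service is used; quality will of course be bounded by the weaker of the two translation passes.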

VatsaDev commented 1 year ago

@Yusuf-YENICERI

Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has been released 2 months ago.
Sentiment: Neutral
###
Message: The reactivity of your team has been amazing, thanks!
Sentiment:

This is totally possible if you scale this up a lot, but there are much better models for this task, like a fine-tuned BERT or a dedicated sentiment-analysis model. My repo uses a similar style, but for chat messages, like so:

<human> ... <endOfText>
<bot> ... <endOfText>
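
For completeness, rendering conversation turns into that tag style could look like the following; the tag spellings follow the snippet above, while the helper itself is just an illustrative sketch:

```python
def format_chat(turns):
    """Render (speaker, text) turns in the <human>/<bot> ... <endOfText>
    style shown above. speaker is expected to be 'human' or 'bot'."""
    lines = [f"<{speaker}> {text} <endOfText>" for speaker, text in turns]
    return "\n".join(lines)

sample = format_chat([
    ("human", "How do I sample from my checkpoint?"),
    ("bot", "Point sample.py at it with --out_dir."),
])
```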