sanagno opened this issue 1 year ago
Hi, I am an ML student at Copenhagen University, and this would be a good opportunity for me to try out some of the theory I have learned, particularly the recent and contemporary research I was exposed to during my time at university. BH Abubakar
Feel free to join the Discord and contact me there :)
Sadly my GPU is not powerful enough, but this Python script runs and should work as a backbone for the two requirements: model-vs-reward.txt
Thanks for the effort @Abubakar115e. The idea of this issue is to experiment with different prefixes. If you cannot run these models, it would be difficult to proceed. Perhaps you can try some quantization tricks; you should be able to run decent models for inference even on a regular GPU.
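For example, loading a model in 8-bit might look something like this (a minimal sketch, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed; the checkpoint name is just a placeholder, not necessarily the one used in this issue):

```python
# Sketch of 8-bit quantized inference with Hugging Face transformers
# and bitsandbytes. The model name below is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-1.4b"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place layers on GPU/CPU as memory allows
    load_in_8bit=True,   # quantize weights to 8-bit for inference
)

inputs = tokenizer("The capital of Denmark is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```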
Yes, I know you can use both the GPU (RTX 2070 Max-Q) and RAM, but I only have one laptop and this work could take anywhere from a few hours to several days. I have corrected the code so it experiments with different prefixes, but after running it for 10 hours it still had not finished on my system. model-vs-reward.txt
Hi, I should be able to help with an AWS EC2 G5 instance with 24 GB of GPU memory. Is that enough? @sanagno, how can I find you on Discord?
That's me: Sotiris#3996 :)
One of the baselines presented in the InstructGPT paper is a "properly" prompted GPT-3 model (see footnote 6 and Section 3.5): a fixed prefix is prepended to the user-specified instruction before it is given to the model.
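As an illustration of the setup (a minimal sketch; the prefix wording and the instruction are hypothetical, not the exact ones from the paper):

```python
# Sketch of the prompted baseline: a fixed prefix is prepended to the
# user instruction before it is sent to the model. The prefix text
# below is hypothetical.
PREFIX = (
    "You are a helpful assistant. Respond to the following instruction "
    "as well as you can.\n\n"
)

def build_prompt(instruction: str) -> str:
    """Prepend the fixed prefix to a user-specified instruction."""
    return PREFIX + instruction

print(build_prompt("Explain why the sky is blue."))
```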
Requirements:
- After some fine-tuning of the prefixes, compare the base model and the prompted model based on the reward attained from the reward model (see the sketch after this list).
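A minimal sketch of that comparison, assuming a Hugging Face causal LM as the policy and a sequence-classification model as the reward model (both checkpoint names below are placeholders, as is the prefix; the real models would come from this project's own checkpoints):

```python
# Sketch: score base vs. prefixed completions with a reward model.
# Model names and prefix are placeholders.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

policy_name = "gpt2"  # placeholder policy model
reward_name = "OpenAssistant/reward-model-deberta-v3-base"  # placeholder reward model

policy_tok = AutoTokenizer.from_pretrained(policy_name)
policy = AutoModelForCausalLM.from_pretrained(policy_name)

reward_tok = AutoTokenizer.from_pretrained(reward_name)
reward = AutoModelForSequenceClassification.from_pretrained(reward_name)

PREFIX = "You are a helpful assistant.\n\n"  # hypothetical prefix

def generate(prompt: str) -> str:
    """Greedily generate a completion and strip the prompt tokens."""
    ids = policy_tok(prompt, return_tensors="pt")
    out = policy.generate(
        **ids,
        max_new_tokens=64,
        do_sample=False,
        pad_token_id=policy_tok.eos_token_id,
    )
    return policy_tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)

def score(prompt: str, response: str) -> float:
    """Score a (prompt, response) pair with the reward model."""
    ids = reward_tok(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return reward(**ids).logits[0].item()

instruction = "Explain why the sky is blue."
base_reply = generate(instruction)
prompted_reply = generate(PREFIX + instruction)

print("base reward:    ", score(instruction, base_reply))
print("prompted reward:", score(instruction, prompted_reply))
```

Averaging these scores over a set of held-out instructions would then give the comparison the requirement asks for.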