sanagno opened this issue 1 year ago
Hi, I am an ML student at Copenhagen University, and this would be a good opportunity for me to try out some of the theory I have learned, particularly the recent and contemporary research I was exposed to during my time at university. BH Abubakar
Feel free to join the Discord and contact me there :)
Sadly my GPU is not powerful enough, but this Python script runs and should work as a backbone for the two requirements: model-vs-reward.txt
Thanks for the effort @Abubakar115e. The idea of this issue is to experiment with different prefixes. If you cannot run these models, it would be difficult to proceed. Perhaps you can try some quantization tricks; you should be able to run decent models for inference even on a regular GPU.
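For example, loading a model in 8-bit might look something like this (a minimal sketch, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed; the checkpoint name is just a placeholder, not necessarily the one used in this issue):

```python
# Sketch of 8-bit quantized inference with Hugging Face transformers
# and bitsandbytes. The model name below is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-1.4b"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place layers on GPU/CPU as memory allows
    load_in_8bit=True,   # quantize weights to 8-bit for inference
)

inputs = tokenizer("The capital of Denmark is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```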
Yes, I know you can use both the GPU (RTX 2070 Max-Q) and RAM, but I only have one laptop and this work could take anywhere from a few hours to several days. I have corrected the code so it experiments with different prefixes, but after running it for 10 hours it still had not finished on my system. model-vs-reward.txt
Hi, I should be able to help with an AWS EC2 G5 instance with 24 GB of GPU memory. Is that enough? @sanagno, how can I find you on Discord?
That's me: Sotiris#3996 :)
One of the baselines presented in the InstructGPT paper is a "properly" prompted GPT-3 model (see footnote 6 and Section 3.5): a fixed prefix is prepended to the user-specified instruction before it is given to the model.
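As an illustration of the setup (a minimal sketch; the prefix wording and the instruction are hypothetical, not the exact ones from the paper):

```python
# Sketch of the prompted baseline: a fixed prefix is prepended to the
# user instruction before it is sent to the model. The prefix text
# below is hypothetical.
PREFIX = (
    "You are a helpful assistant. Respond to the following instruction "
    "as well as you can.\n\n"
)

def build_prompt(instruction: str) -> str:
    """Prepend the fixed prefix to a user-specified instruction."""
    return PREFIX + instruction

print(build_prompt("Explain why the sky is blue."))
```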
Requirements:
- After some fine-tuning of the prefixes, compare the base model and the prompted model based on the reward attained from the reward model (see the sketch after this list).
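A minimal sketch of that comparison, assuming a Hugging Face causal LM as the policy and a sequence-classification model as the reward model (both checkpoint names below are placeholders, as is the prefix; the real models would come from this project's own checkpoints):

```python
# Sketch: score base vs. prefixed completions with a reward model.
# Model names and prefix are placeholders.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

policy_name = "gpt2"  # placeholder policy model
reward_name = "OpenAssistant/reward-model-deberta-v3-base"  # placeholder reward model

policy_tok = AutoTokenizer.from_pretrained(policy_name)
policy = AutoModelForCausalLM.from_pretrained(policy_name)

reward_tok = AutoTokenizer.from_pretrained(reward_name)
reward = AutoModelForSequenceClassification.from_pretrained(reward_name)

PREFIX = "You are a helpful assistant.\n\n"  # hypothetical prefix

def generate(prompt: str) -> str:
    """Greedily generate a completion and strip the prompt tokens."""
    ids = policy_tok(prompt, return_tensors="pt")
    out = policy.generate(
        **ids,
        max_new_tokens=64,
        do_sample=False,
        pad_token_id=policy_tok.eos_token_id,
    )
    return policy_tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)

def score(prompt: str, response: str) -> float:
    """Score a (prompt, response) pair with the reward model."""
    ids = reward_tok(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return reward(**ids).logits[0].item()

instruction = "Explain why the sky is blue."
base_reply = generate(instruction)
prompted_reply = generate(PREFIX + instruction)

print("base reward:    ", score(instruction, base_reply))
print("prompted reward:", score(instruction, prompted_reply))
```

Averaging these scores over a set of held-out instructions would then give the comparison the requirement asks for.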