lando22 / GPT-RMT

An experiment to test Recurrent Memory Transformers in GPT

Hello, have you finished the test? #1

Open relic-yuexi opened 1 year ago

relic-yuexi commented 1 year ago

Hello, author. I'm glad to see your attempt, and I'm also curious about the method in arXiv:2304.11062. However, my own abilities are limited, so I'm hoping to find someone who can apply its techniques to large-scale conversational language models such as Vicuna. Have you succeeded in doing so?

lando22 commented 1 year ago

Hi there! I'm still plugging away at improvements, as it's a pretty tricky problem in the context of GPTs. I'll share the code soon, but the results so far have been mixed. Long story short, the RMT memory mechanism adds a lot of noise to the original GPT model's outputs.
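
In the meantime, for anyone curious, here's a rough sketch of the general idea from the paper. To be clear, this is not the repo's code, just a minimal illustration of RMT-style segment recurrence wrapped around Hugging Face GPT-2: the `RMTWrapper` name and `num_mem=8` are made up for the example, and reading the updated memory back from the prepended positions is a simplification (the paper uses separate read and write memory at the segment start and end).

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

class RMTWrapper(nn.Module):
    """Carries learnable memory tokens across input segments (RMT-style)."""

    def __init__(self, model_name="gpt2", num_mem=8):
        super().__init__()
        self.gpt = GPT2LMHeadModel.from_pretrained(model_name)
        hidden = self.gpt.config.n_embd
        # Learnable initial memory, prepended to the first segment.
        self.init_mem = nn.Parameter(torch.randn(num_mem, hidden) * 0.02)
        self.num_mem = num_mem

    def forward(self, segments):
        # segments: list of LongTensors, each (batch, seg_len).
        # num_mem + seg_len must fit within GPT-2's 1024-position limit.
        batch = segments[0].size(0)
        mem = self.init_mem.unsqueeze(0).expand(batch, -1, -1)
        all_logits = []
        for seg in segments:
            tok_emb = self.gpt.transformer.wte(seg)
            inputs = torch.cat([mem, tok_emb], dim=1)  # [memory | tokens]
            out = self.gpt(inputs_embeds=inputs, output_hidden_states=True)
            # The final hidden states at the memory positions become the
            # memory handed to the next segment (simplified read/write).
            mem = out.hidden_states[-1][:, : self.num_mem, :]
            all_logits.append(out.logits[:, self.num_mem:, :])
        return torch.cat(all_logits, dim=1)
```

Training would then fine-tune the memory tokens (and optionally the GPT weights) by backpropagating through the segment loop, as described in the paper.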

More to come! Hang tight :)