Open QuintinPope opened 1 year ago
I do not know of any "long" videos, apart from the ones in the OpenAI blog post and Karolis's analysis.
I have done a few multi-hour survival recordings, including several 12+ hour marathons in which neither death nor success would free the agent from the world.
While I lost my longest attempt due to an unfortunate Windows update corrupting the video, I am converting and uploading what I have now. (It might take a few hours; these are a little hefty, and my internet isn't top-notch.)
Sorry, I got distracted after recording by the release of the datasets/other scripts for BASALT.
Thanks so much!
Do these include the foundation / early game model? I'm curious whether some of the pathologies of the diamond getter (like running into lava) were caused by the RL training.
Here is a link to the playlist of what I have uploaded so far. I have found more recordings that didn't save properly; I will redo and upload them as soon as I can.
https://youtube.com/playlist?list=PLWEbxlPoRo0MQCkHee6E1rfknltcKZhv0
There are 12 hours of the BC early game, which does avoid lava as I recall; the RL portion did indeed remove any fear of the lava.
I'm curious about how well they act generally over a long time window. GPT-3 was much better than the metrics suggested, simply by virtue of its flexibility during direct interactions. Are there any videos I can watch to see how "generally good" these models are?
Thanks!