Open svpino opened 7 months ago
@svpino Did you have a chance to look at this?
(Yeah, I know it's all exciting with GPT-4o, but we already determined it's not as exciting as we were sold.)
Ha ha! Yeah, I haven't had any time for this. The last time I touched this codebase was when I wrote it (and when I asked Devin to fix it). I'm planning to record a video about this at some point, so I'll have to fix it, I guess.
I can only imagine. I was looking into that as well, but I wanted this toy training model to work, so it was a bit depressing when it didn't. The issue seems to be that I am not able to run it in training mode. (See my gist.) It just crashes due to some dimension error. Also, since it is not running on a GPU, it seems very slow, so I can't really tell if it is learning. Another funny thing is that it seems to always start each loop by firing the same engine, not the other 2. I'm not sure how that behavior is randomized. (I have seen it fire 2 at some point, but that seems to happen randomly.)
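On the randomization question: if the agent picks actions epsilon-greedily (an assumption, since the agent code is not shown in this thread), then an epsilon of zero means it always takes the action with the highest predicted Q-value, which would look like the same engine firing every time. A minimal sketch of that idea follows; `choose_action` and the Q-values are hypothetical, not taken from the repo.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values for 4 actions (made up for illustration).
q = [0.1, 0.7, 0.2, 0.0]

# With epsilon = 0 the choice is always the greedy one (index 1 here).
greedy = {choose_action(q, epsilon=0.0) for _ in range(100)}

# With epsilon = 1 every choice is random, so several actions show up.
rng = random.Random(0)
explored = {choose_action(q, epsilon=1.0, rng=rng) for _ in range(100)}
```

If `agent.epsilon` really is stuck at zero, the first set is the behavior you would expect to see.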
PS. Regarding CUDA, I don't roll back, only forward!
I will see if I can get an updated CUDA compiled TF version going...
@svpino If you can have a look at my gist (linked above), I can create a PR.
AFAICT, it is not quite working as intended, and I don't know TF well enough to fix it. Also, I have no idea why TRAINING mode doesn't work and complains about a tensor dimension issue.
I tried to do this. The code runs, but I got blocked by the complexity. Something is not right: it doesn't seem to learn, or it learns extremely slowly. (Or maybe that's because I'm running on CPU only?)
Issues:
- `gym` is now `gymnasium`
- `Progbar` was a real PITA too, but I managed to replace it.
- (`self.model`, and `self.target_model`) Is this really what you wanted? You only had 1 in the original article?
- `TRAINING` crashes
- `agent.epsilon` and `s=s_` in the Episode loop. (At least I see no change there, and it's zero.)
You can find my code in this Gist. It is meant to be run on Windows from PowerShell. Please feel free to include and modify it.
In PowerShell, run it with: