Migrate code to TensorFlow 2.16

eabase commented 7 months ago

I tried to do this. The code runs, but I got blocked by the complexity. Something is not right: it either doesn't learn at all or learns extremely slowly. (Or maybe that's because I'm running on CPU only?)

Issues:

TensorFlow 2 on native Windows no longer runs on CUDA; 2.10 was the last release with native Windows GPU support. (idiots!)
gym is now gymnasium
Manipulating the screen and canvas in TF with Box2D and/or pygame is a real PITA.
Fixing the broken Progbar was a real PITA too, but I managed to replace it.
The new TF2 code uses 2 models (self.model and self.target_model).
Is this really what you wanted? You only had 1 in the original article.
Enabling TRAINING crashes
There seem to be some unused variables, such as agent.epsilon and s=s_ in the Episode loop.
(At least I see no change there, and it's zero.)
I had to rewrite the model definition to conform to new TF2 criteria.
I have not been able to fix the checkpoint deprecation issue, and have no idea how that works.
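
On the two-model point above: keeping a second, periodically synced copy of the network to compute training targets is a common DQN variant, whether or not the original article used it. Here is a minimal sketch of the idea in plain Python, with no TensorFlow involved. The names `td_target`, `sync_target`, and `GAMMA`, and the value 0.99, are assumptions for illustration and are not taken from the repository or the gist.

```python
GAMMA = 0.99  # discount factor (assumed value, not from the repo)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Bellman target for a single transition.

    next_q_values holds the Q-values of the next state as predicted by the
    target model (the slowly updated copy), one entry per action.
    """
    if done:
        # No future reward after a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(online_weights, target_weights):
    """Copy the online model's weights into the target model, in place."""
    target_weights[:] = list(online_weights)
```

With Keras models, the sync step would typically be `self.target_model.set_weights(self.model.get_weights())`, run every fixed number of steps or episodes. If the target model is never synced, its predictions stay at their initial values and the targets never improve, which would look like "not learning".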

You can find my code in this Gist. It is meant to be run on Windows from PowerShell. Please feel free to include and modify it.

In PowerShell, run it with:

$env:TF_CPP_MIN_LOG_LEVEL=1; $env:TF_ENABLE_ONEDNN_OPTS='0'; python.exe -qu -X utf8=1 .\lunar-lander.py
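
The two environment variables in that command can also be set from inside the script, which would make it independent of the shell. This is a sketch under the assumption that it runs before TensorFlow is imported; `configure_tf_env` is a hypothetical helper name and is not part of the gist.

```python
import os


def configure_tf_env():
    """Set the TensorFlow environment variables used in the command above.

    Must be called before `import tensorflow`, because TensorFlow reads
    these variables when it is first loaded.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"   # hide INFO-level C++ log lines
    os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"  # turn off oneDNN custom ops


configure_tf_env()
# import tensorflow as tf  # import only after the variables are set
```

The `-qu -X utf8=1` interpreter flags still have to be passed on the command line.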

eabase commented 6 months ago

@svpino

Did you have a chance to look at this?

(Yeah, I know it's all exciting with GPT-4o, but we already determined it's not as exciting as we were sold.)

svpino commented 6 months ago

Ha ha! Yeah, I haven't had any time for this. The last time I touched this codebase was when I wrote it (and when I asked Devin to fix it.) I'm planning to record a video about this at some point, so I'll have to fix it, I guess.

eabase commented 6 months ago

I can only imagine. I was looking into that as well, but I wanted this toy training model to work, and it was a bit depressing when it didn't. The issue seems to be that I am not able to run it in training mode. (See my gist.) It just crashes due to some dimension error. Then, since it is not running on GPU, it seems very slow, so I can't really tell if it is learning. Another funny thing is that it seems to always start each loop by firing the same engine, not the other 2. I'm not sure how that behavior is randomized. (I have seen it fire 2 at some point, but that seems to happen at random.)

PS. Regarding CUDA, I don't roll back, only forward!
I will see if I can get an updated CUDA compiled TF version going...
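
On the engine that keeps firing at the start of each loop: in a typical DQN agent, the randomness comes from epsilon-greedy action selection. If `agent.epsilon` really is zero, the agent always takes the action with the highest predicted Q-value, and an untrained network tends to favor the same action for similar starting states. The sketch below assumes that is how the agent picks actions, which has not been checked against the gist. `choose_action`, `decay_epsilon`, and the decay constants are hypothetical.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of Q-values, one per action."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest Q-value (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, decay=0.995, minimum=0.01):
    """Shrink epsilon after each episode, but never below `minimum`."""
    return max(minimum, epsilon * decay)
```

During training, epsilon usually starts near 1.0 and decays per episode. With epsilon fixed at 0, `choose_action` never explores, so the occasional second engine would have to come from small changes in the predicted Q-values rather than from random exploration.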

eabase commented 6 months ago

@svpino If you can have a look at my gist (linked above) I can create a PR.

AFAICT, it is not quite working as intended, and I don't know TF well enough to fix it. Also, I have no idea why TRAINING mode doesn't work and complains about a tensor dimension issue.
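
One possible cause of the dimension error, not confirmed against the gist, is the API change between gym and gymnasium. In gymnasium, `env.reset()` returns `(observation, info)` and `env.step()` returns five values, with `terminated` and `truncated` replacing the single `done` flag. Code written for the old API would then feed a tuple into the model where it expects a state array. The helpers below are a hypothetical sketch that normalizes both styles. `unpack_reset` and `unpack_step` are not part of either library.

```python
def unpack_reset(result):
    """Return only the observation from an env.reset() result.

    gymnasium returns (observation, info); the old gym API returned the
    observation alone.
    """
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    return result


def unpack_step(result):
    """Return (observation, reward, done, info) from an env.step() result."""
    if len(result) == 5:
        obs, reward, terminated, truncated, info = result
        # The episode is over if it ended naturally or hit the time limit.
        return obs, reward, terminated or truncated, info
    # Old-style 4-tuple: already (observation, reward, done, info).
    return tuple(result)
```

In the episode loop this would read `s = unpack_reset(env.reset())` and `s_, r, done, info = unpack_step(env.step(a))`, followed by `s = s_` to advance the state.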

svpino / lunar-lander

Migrate code to TensorFlow 2.16 #2