Round number/ correct round time

NumseBacon commented 3 years ago

Hello!

So on the wandb.ai website it says my round_time is around 200, but in reality the round times are around 20-23 while it trains etc. Also, would it be possible to see how many times the AI has reset on a map/how many times it has driven it?

edigeze commented 3 years ago

Hello NumseBacon. If you want to see how many resets the AI did, that corresponds to the horizontal axis (step). If you want to see how far the car went, you can check return_train (10 000 = 1 km). 20-23 corresponds to the number of epochs, not the round_time. The round_time is the time spent on each round. Good luck with your training ;)

NumseBacon commented 3 years ago

No no, I meant 20-23 as in the AI drives from start to finish of the map in 20-23 seconds. Also, you say round_time is the time spent on each round, but for me it increases?

NumseBacon commented 3 years ago

Also I wanna ask about this: I've been training for about 10-12 hours and it keeps wiggling a lot on the track and hitting walls, as you can see in this video: https://youtu.be/2z5FD-38Pwk It has only gained a little time on the track. The 17,945 time you see on the ranking tab is mine, from a run where I didn't drive as fast as I could. I would imagine it could be faster if it could stop hitting the walls? Should I reset the data completely or just keep training?

I also noticed that when I opened OBS to record the video, I got more of the "Time-step timed out" errors, but it drove faster than when training?

yannbouteiller commented 3 years ago

Hi!

The "round time" is not the duration that the AI took to complete the track, that would rather be the "episode length" (in number of time-steps, which we default to 0.05s each). Instead, we arbitrarily call a "round" a fixed number of training iterations between which training results are printed/sent to wandb.
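To make the distinction concrete, here is a minimal sketch (not part of tmrl) that converts an episode length logged in time-steps into seconds, assuming the default 0.05s time-step mentioned above. The function and constant names are made up for illustration.

```python
# Illustrative helper, not part of tmrl: the names below are assumptions.
DEFAULT_TIME_STEP_S = 0.05  # default time-step duration mentioned above


def episode_seconds(episode_length_steps: int,
                    time_step_s: float = DEFAULT_TIME_STEP_S) -> float:
    """Convert an episode length in time-steps to wall-clock seconds."""
    return episode_length_steps * time_step_s


if __name__ == "__main__":
    # An episode of 440 time-steps lasts about 22 seconds at 0.05s per step.
    print(f"{episode_seconds(440):.1f} s")
```

So a lap of 20-23 seconds would show up as an episode length of roughly 400-460 time-steps, independently of the round time.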

To give you an idea, with the RTX 3080 that we use for training, a round is indeed about 20-30s. If your round time is 200s, the default hyperparameters are likely not fit for your GPU, and training will likely take ~10 times longer.

I also see in your video that your track has a small slope at the beginning. This might disrupt SAC with the default history of 4 LIDAR measurements equally spaced in time that I suppose you use as your observation space, because of the possible non-markovness of this part of the environment. Keep in mind that RL expects to see a similar behavior of the environment for similar observation-action pairs :)

Edit: are you using the provided pretrained network? It works for the provided example track, but I am not sure it is a good idea to start from it to train a policy on another track. You may actually be better off training from scratch when using your own track.

NumseBacon commented 3 years ago

When I first installed it I dragged the pretrained network into the weights folder, and began training my network. And I've been letting it run like that, idk if it then uses something from the pretrained network?

When you say "episode length" for the time to finish the track, do you mean episode_length_test or episode_length_train? I would imagine train, cuz that's probably the one from when it trains the network, right?

How would I reset my data/start from scratch?

Is this the network, or are you manually driving?: https://github.com/trackmania-rl/tmrl/blob/master/readme/img/tm_annimation.gif

yannbouteiller commented 3 years ago

The GIF is a placeholder at the moment, it is @edigeze driving. We should replace this with actual footage of the provided AI now that we have open-sourced the repo.

episode_length_test and episode_length_train are more or less the same thing: train is measured while the AI trains, and test exists because once in a while the AI runs a "test" episode where it takes deterministic actions instead of stochastic actions. In some environments this is known to yield a marginal improvement, but in TMRL I don't think there is any noticeable improvement of this kind; it is rather for debugging purposes.

To restart from scratch, you have two possibilities. Currently, the "clean" way is to go to tmrl/tmrl/config/config_constants.py and change the value of the RUN_NAME constant. The other solution is to go to tmrl/tmrl/data/checkpoint and delete the .pkl object (this is a saved state of your training), then go to tmrl/tmrl/data/weights/ and remove all .pkl objects (these are snapshots of your trained networks).
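For the second option, a small script can do the cleanup. This is only a sketch under the assumption that the checkpoint and weights folders sit under tmrl/tmrl/data as described above; the function name and the dry_run flag are made up for illustration, and it lists files by default instead of deleting them.

```python
from pathlib import Path


def clear_saved_training(data_dir: Path, dry_run: bool = True) -> list:
    """List the .pkl files in data_dir/checkpoint and data_dir/weights,
    and delete them when dry_run is False."""
    targets = []
    for sub in ("checkpoint", "weights"):
        folder = data_dir / sub
        if folder.is_dir():
            targets.extend(sorted(folder.glob("*.pkl")))
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets


if __name__ == "__main__":
    # Assumed location, relative to where the repo was cloned.
    for p in clear_saved_training(Path("tmrl/tmrl/data"), dry_run=True):
        print("would delete", p)
```

Run it once with dry_run=True to check what would be removed, then with dry_run=False once the list looks right.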

NumseBacon commented 3 years ago

Alright nice! Kinda sucks you have to have your computer on all the time haha

yannbouteiller commented 3 years ago

Difficult to run the game with the computer turned off, but you can stop training and restart later, it will work.

NumseBacon commented 3 years ago

@yannbouteiller Hello... I'm now getting the error: "The virtual device could not connect to ViGEmBus." AssertionError: The virtual device could not connect to ViGEmBus.

Also It is installed but idk

Just realised I pasted the wrong thing lol

yannbouteiller commented 3 years ago

Hi, this is pretty weird. This is the vgamepad library failing to connect to vigembus (nefarius solutions) for some reason. Something has changed in your configuration, I guess, because it used to work and now it doesn't? Do you have more info?

Normally when you install vgamepad (this is a dependency of tmrl), you are prompted to install vigembus (nefarius solutions). Are you sure it is installed correctly? Something you can try is to manually uninstall vigembus, then pip uninstall vgamepad, and then pip install vgamepad again.

NumseBacon commented 3 years ago

Ok so I fixed it by installing an earlier version of vigembus. Now everything "works". The commands run and everything, but the car doesn't drive. In the settings it's set to keyboard. It spams a lot of "Time-step timed out" in the worker console.

yannbouteiller commented 3 years ago

The "Time-step timed out" message means that your PC is not powerful enough to run the AI and Trackmania in parallel. What you can try is to reduce the graphics in Trackmania to the minimum, and in particular set a maximum frame rate of 30 fps. To test whether the algo works, you should use the provided map and the provided neural network; the car should then be driving quite well.

You should see roughly the same as what you see at the beginning of this video : https://www.youtube.com/watch?v=LN29DDlHp1U

(@edigeze , I just made the video public)

NumseBacon commented 3 years ago

The game is running at 150 fps :/

yannbouteiller commented 3 years ago

This is why you should reduce it to 30 fps, there is an option for that in the graphics setting of TM2020. Running the actor network consumes quite a bit of CPU, and if trackmania eats too much of it there is not enough left for inference. "Time-step timed out" means that your PC is taking too much time to take a screenshot, compute the lidar observation, and feed it to the neural network. If this happens only once in a while you won't notice, but if it happens repeatedly this will break the AI.
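To illustrate what a time-step timeout means, here is a simplified sketch of a per-step time budget check. It is not tmrl's actual implementation: run_step and compute_action are placeholder names, and compute_action stands in for the whole pipeline (screenshot, lidar computation, network inference). The 0.05s budget is the default time-step mentioned earlier in this thread.

```python
import time

TIME_STEP_S = 0.05  # default time-step duration mentioned earlier


def run_step(compute_action, budget_s: float = TIME_STEP_S):
    """Run one control step and report whether it fit in the time budget.

    compute_action is a placeholder for the whole observation + inference
    pipeline; it is not a tmrl function.
    """
    start = time.perf_counter()
    action = compute_action()
    elapsed = time.perf_counter() - start
    timed_out = elapsed > budget_s
    if timed_out:
        print(f"Time-step timed out: {elapsed:.3f}s > {budget_s:.3f}s")
    return action, elapsed, timed_out


if __name__ == "__main__":
    # Simulate a step that is too slow for the budget.
    run_step(lambda: time.sleep(0.08))
```

When the game leaves too little CPU for the pipeline, the elapsed time exceeds the budget on most steps rather than occasionally, which is the situation described above.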

NumseBacon commented 3 years ago

I capped the maximum fps at 30 and it still doesn't drive

NumseBacon commented 3 years ago

I feel like it's because I'm using an earlier version of vigembus

yannbouteiller commented 3 years ago

If so, just uninstall vigembus and vgamepad, and then reinstall vgamepad, it will automatically install the right version of vigembus.

yannbouteiller commented 3 years ago

I think the new version of Trackmania is very CPU-hungry. I tested tmrl yesterday and I had to cap Trackmania to 30 fps and put graphics to low to make it work, while I used to run this at max quality with no issue. I'm very busy these days but I'll do some further testing when I have a moment.

NumseBacon commented 3 years ago

Well, if I get the newer version of vigembus it gives me the error, so it doesn't work either

yannbouteiller commented 3 years ago

You should not install vigembus from the official website, the right version is packaged in vgamepad and is installed automatically when installing vgamepad.

NumseBacon commented 3 years ago

Yea ik, that one doesn't work for me

yannbouteiller commented 3 years ago

Really? That is unheard of, what version of vigembus were you using in this video you showed earlier? vgamepad seemed to work correctly at this point. If you have more info regarding why it doesn't work please share.

NumseBacon commented 3 years ago

116.116 from their github is the only one that's kinda working, but it doesn't drive. Earlier it was just vgamepad from modules

NumseBacon commented 3 years ago

https://youtu.be/AmWAJ6g2HAo

yannbouteiller commented 3 years ago

Oh wow, something is very wrong! You don't show the CPU stats, but if the GPU is a 3090 I expect your overall PC config to be much faster than what I use here. You should definitely not get those timestep timeouts, even when running TM2020 at full speed.

So, I see two issues here.

First, vgamepad doesn't work correctly on your PC for some reason (I think even when it seems to work actually it does nothing, or it does something unexpected because I see your mouse pointer flashing in a weird way). Can you try the following in this order:

1) pip install -U pip
2) pip uninstall vgamepad
3) Manually uninstall all versions of vigembus (Nefarius Virtual Gamepad Emulation Driver)
4) pip install vgamepad --no-cache-dir
5) You should get this pop-up: https://paste.pics/0a1687ffdc31cee1420f68f4b5cbd4f6
6) Accept the terms, install, allow the msi file to do the install, finish installation
7) Go to gamepad tester: https://gamepad-tester.com
8) Execute the following python script and check whether the buttons, trigger and joystick values are set accordingly in gamepad tester:

import vgamepad as vg
import time

gamepad = vg.VX360Gamepad()

# press a button to wake the device up
gamepad.press_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_A)
gamepad.update()
time.sleep(0.5)
gamepad.release_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_A)
gamepad.update()
time.sleep(0.5)

# press buttons and things
gamepad.press_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_A)
gamepad.press_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_LEFT_SHOULDER)
gamepad.press_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_DPAD_DOWN)
gamepad.press_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_DPAD_LEFT)
gamepad.left_trigger_float(value_float=0.5)
gamepad.right_trigger_float(value_float=0.5)
gamepad.left_joystick_float(x_value_float=0.0, y_value_float=0.2)
gamepad.right_joystick_float(x_value_float=-1.0, y_value_float=1.0)

gamepad.update()

time.sleep(1.0)

# release buttons and things
gamepad.release_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_A)
gamepad.release_button(button=vg.XUSB_BUTTON.XUSB_GAMEPAD_DPAD_LEFT)
gamepad.right_trigger_float(value_float=0.0)
gamepad.right_joystick_float(x_value_float=0.0, y_value_float=0.0)

gamepad.update()

time.sleep(1.0)

# reset gamepad to default state
gamepad.reset()

gamepad.update()

time.sleep(1.0)

Second, there appears to be something on your PC that keeps tmrl from running fast: if your CPU is reasonable for an RTX 3090 PC, you should not get any of these timestep timeouts.

Can you

1) check whether your CPU (not GPU) is indeed saturated (and why)
2) keep the TM2020 window open and run the benchmarking script: python tools\benchmark_environment.py (this should take 1-2 minutes before printing the results; you can move around with your keyboard arrows in the meantime to check whether your reward is working)

NumseBacon commented 3 years ago

well....

yannbouteiller commented 3 years ago

According to this thread this might be a Windows Update issue, but I don't have more info. I can try to update vgamepad with a more recent version of vigembus when I find some time for this, but I cannot reproduce the issue on my laptop so it is hard to see where that comes from.

yannbouteiller commented 3 years ago

What I don't understand is that, in your first video, vgamepad seemed to work fine, what changed since then?

NumseBacon commented 3 years ago

As I said I'm using another computer

yannbouteiller commented 3 years ago

Can you tell what difference is likely to cause the issue? E.g. a difference in Windows versions, or Python versions, or 32-bit vs 64-bit Python...
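A quick way to compare the two machines is to print these details on each of them. The sketch below uses only the Python standard library; the function name environment_report is made up for illustration.

```python
import platform
import struct
import sys


def environment_report() -> dict:
    """Collect the details most likely to differ between two machines."""
    return {
        "os": platform.platform(),                    # OS name, release and build
        "python_version": platform.python_version(),  # major.minor.patch
        "python_bits": struct.calcsize("P") * 8,      # pointer size: 32 or 64
        "executable": sys.executable,                 # which interpreter is running
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this on both computers and comparing the output side by side should show whether the OS edition or the Python build differs.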

NumseBacon commented 3 years ago

I don't think the Python version/bitness should make vigembus not work. The Windows version is Server 2019, but on the vigembus github it says Server 2019 is supported

yannbouteiller commented 3 years ago

I was saying this in the unlikely-but-not-impossible case that the issue would come from how C-Python bindings are done in vgamepad: the exception you get doesn't come from vigembus directly, it comes from vgamepad complaining that it failed to retrieve a valid pointer from vigembus. I don't think anyone has tried vgamepad with Windows Server before, possibly the way I compiled the vigembus-client shared library is compatible with Windows 10 only, idk...

Actually that is probably the issue.

NumseBacon commented 3 years ago

Is there any way for you to fix that or? Idk if I can get a Windows 10 machine other than my own computer

yannbouteiller commented 3 years ago

I can try to recompile the shared library for Windows Server if this option exists in Visual Studio, yes, but I am not able to do this in the next few days. In the meantime, what you can do is manually set this line to False. This should deactivate the gamepad and use the keyboard instead, but the pretrained neural network won't work because it has been trained with the gamepad, so you should delete the weights and checkpoints in the data folder so that training restarts from scratch.

NumseBacon commented 3 years ago

aight

yannbouteiller commented 3 years ago

By the way another solution for restarting training from scratch at the moment is to manually change the value of this constant

NumseBacon commented 3 years ago

Now this happens :/

yannbouteiller commented 3 years ago

This is because you did not install tmrl correctly and did not record a reward.

To install tmrl correctly you should clone the repo, go where the setup.py is and execute:

pip install -e .

(the -e option is important)

To record a reward you should follow these instructions

NumseBacon commented 3 years ago

Seems I got it working.

yannbouteiller commented 3 years ago

Hi @NumseBacon, there was no option to compile for Windows Server specifically in Visual Studio (only x64 and x86 architectures), but I have still updated the vigem-client DLL files in the vgamepad repo.

I don't expect this to change much, but who knows... If you care to try, you can clone the vgamepad repo and install it with pip install . or pip install -e . and, if that miraculously works on Windows Server, I will update the PyPI package.

NumseBacon commented 3 years ago

Sadly it didn't

yannbouteiller commented 3 years ago

This seems related to https://github.com/ViGEm/ViGEmBus/issues/85

(That would be the mandatory X360 driver missing from Server 2019, Nefarius gives a link to this driver in the referenced issue)

NumseBacon commented 3 years ago

Oh nice. I will test sometime

yannbouteiller commented 3 years ago

If you try, please tell me whether it works, so I can add this to the vgamepad repo

yannbouteiller commented 2 years ago

@NumseBacon the repo has been on standby for a while because I had other projects, but I have just found a bug that has probably hurt your training performance badly (sample time in particular). It has been lurking there since May... I will soon push the fix to master.

NumseBacon commented 2 years ago

@yannbouteiller Hey! It seems like I can use tmrl on another computer and see data etc on another computer? How do I do that? I have my extra computer right next to me

yannbouteiller commented 2 years ago

Yep, we do this all the time and it is made easy in the new version (in fact I think the new version works only in this case; I have to fix that, and will probably do so today). If you are on a local network, you can change the config.json file located in your home folder (TmrlData/config) for this. You just set the public IP of the server to the local IP of the computer where the server is, and set localhost server and/or localhost worker to false (you can safely set them both to false).
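As a rough sketch of that edit, the helper below updates a configuration dictionary in memory. The exact key names in config.json are not given here, so the ones used in the example ("public_ip_server", "localhost_server", "localhost_worker") are placeholders, and the sketch also assumes these settings sit at the top level of the file. Check your own config.json for the real names and layout.

```python
import json


def point_to_remote_server(config: dict, server_ip: str,
                           ip_key: str, localhost_keys: list) -> dict:
    """Return a copy of config with ip_key set to server_ip and every key
    listed in localhost_keys set to False."""
    updated = dict(config)
    updated[ip_key] = server_ip
    for key in localhost_keys:
        updated[key] = False
    return updated


if __name__ == "__main__":
    # Placeholder keys and values, for illustration only.
    example = {
        "public_ip_server": "0.0.0.0",
        "localhost_server": True,
        "localhost_worker": True,
    }
    new_config = point_to_remote_server(
        example,
        server_ip="192.168.0.10",  # local IP of the machine running the server
        ip_key="public_ip_server",
        localhost_keys=["localhost_server", "localhost_worker"],
    )
    print(json.dumps(new_config, indent=2))
```

The same change can of course be made by hand in a text editor; the point is only that the server address and the two localhost flags are the values to touch.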

NumseBacon commented 2 years ago

Let's call my current computer pc1 and the one I want to use tmrl on pc2. Is the server config on pc1 or pc2? And then is it possible to close pc1 and just keep pc2 running?

yannbouteiller commented 2 years ago

For seeing the data, we use wandb. You can see the data online on the public wandb project that we provide, or you can use your own wandb account by modifying the wandb credentials in config.json

NumseBacon commented 2 years ago

and the rest?
