Closed: koi0823 closed this issue 7 months ago.
Oh dang, have I introduced a bug in 0.6.3?
The fact that it only captured two points is what causes the issue. Why did you close the issue; is this somehow fixed?
If not, are you using Windows or Linux, please? And are you using version 0.6.3 of tmrl?
The problem above is solved. I noticed the command runs better from CMD than from VS Code, so I ran it from CMD first.
I'm using Windows, and I believe there is a parameter issue, since if I keep running, the resolution changes to 256 x 128; I'm not sure why.
I did fix the height and width.
So this is my version: yes, it is 0.6.3.
I tested 0.6.3 for both --record-reward and --check-environment on Windows 11, and both worked as intended on my machine.
Your output for --record-reward is extremely strange; it looks like the car instantly teleported to the finish line. The "initial number of captured positions" should never be 2.
(For the yellow warnings returned by --check-environment, you can safely ignore them. We should fix this at some point, but this is minor and somehow only seems to happen in --check-environment, for reasons that I don't understand.)
I'm using the Full environment to run it, so is it because the road isn't even that I can't record the reward and check the environment?
I ran it with the map Spring 2024 - 01.
Maybe it is a dent in the road; is that why the lidar cannot scan?
Let me try another map. Is there any map you recommend?
I'm doing my FYP for an AI degree, so I'm trying to get a good result that I can present.
No you should be fine, this has nothing to do with the lidar. I recorded a reward on the 1st Spring 2024 track myself and it worked properly.
Something seems wrong with your OpenPlanet installation, as if you were receiving only 2 positions while recording the reward for some reason.
Maybe some external program is sending trash on port 9000?
To record the reward, you are supposed to set the car at the beginning of the track, press e, and then drive to the finish line. When you cross the finish line, you should see a message telling you that it has captured a few thousand initial positions (your problem is that it only captures 2 positions for some weird reason).
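In short, the whole procedure is just this (the command is the one you already used; e is the key press mentioned above):

```
python -m tmrl --record-reward
# in-game: place the car at the start, press e once, then drive to the finish line
```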
Yeah, but I have to switch to CMD.
Could you please tell me where the file or data goes after the reward recording, so I can verify and see?
Could I also ask what record-reward and check-environment are for? Because when I train, it's getting worse.
Where is the file located so that I can check?
Could you also tell me about loss_actor? It has been getting worse and worse, and I'm not sure whether the training is improving.
INFO:root:=== epoch 42/10000 = round 76/100 ====================================
INFO:root:memory_len 293223 round_time 2.991769 idle_time 0.0 loss_actor -0.67442 loss_critic 0.524971 return_test 0.0 return_train 47.25 episode_length_test 0.0 episode_length_train 127.0 sampling_duration 0.006929 training_step_duration 0.008019
I'm training it in the Full environment. Would it be preferable to use the Full environment or only lidar?
I'm new to this field, so I'd like to know how to train over multiple sessions. I'm not sure how long to train for, and I'm afraid I can't make it in time for my FYP, which is due around September 2024.
LIDAR is faster and easier to train, but it only works on plain grey roads with black borders, like the tmrl-train track. Full is more general and works on any track. It is however harder to find good hyperparameters with Full, and training takes a long time / requires a high-end GPU.
Everything is located in the TmrlData folder, including the reward. However, this is a pickle file, so exploring this file will not help you debug, unless you unpickle and explore its content in a python script. But from your logs, what you would see if you were doing that would probably be a list of 3D points forming a straight line between your 2 initial captured positions.
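If you do want to peek inside, something like this works; note that the exact file name and location under TmrlData are assumptions here, so check your folder:

```python
import pickle
from pathlib import Path

# hypothetical default location of the recorded reward
reward_path = Path.home() / "TmrlData" / "reward" / "reward.pkl"

with open(reward_path, "rb") as f:
    positions = pickle.load(f)

print(len(positions))  # a healthy recording should contain a few thousand positions
print(positions[:5])   # each entry should be a 3D point along the track
```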
What you can do for debugging is delete the TmrlData folder entirely, execute python -m tmrl --install (this will recreate the TmrlData folder in its default state), and try the pre-trained AI on the tmrl-test track using python -m tmrl --test. If the AI works properly, OpenPlanet is sane and something is wrong in the way you recorded a custom reward.
(Note that, for the default AI to complete the tmrl-test track, the camera needs to be in the exact configuration shown in the getting-started page, which may not be your default camera configuration.)
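Step by step, that reset procedure is just (same commands as above, nothing new):

```
# delete the TmrlData folder entirely (back it up first if you want to keep anything),
# then recreate it in its default state:
python -m tmrl --install
# and check the pre-trained AI on the tmrl-test track:
python -m tmrl --test
```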
Regarding the third point you mentioned, I believe it should be alright, as I ran it well and deleted it once beforehand. I used VS Code to execute python -m tmrl --test and it works well, but the reward recording appears to be stuck, possibly because I kept pressing E, so it isn't recording.
On the other hand, when I use CMD to test python -m tmrl --record-reward, it is all fine.
I'm running with a 3080 GPU, and my CPU is an i5-13600K.
If you can add my Discord, I could show you sometime if you're free, since I'm new to this and need some guidance from you.
Discord: koihaha#5605
I don't know whether it's okay or not, but it is training.
Oh yes, you don't want to keep pushing e. Just press it once at the start of the track, then drive normally to the end, and when you cross the finish line the script should automatically compute the reward function from the positions it captured in between.
Are you using the Full environment? If you are, it should automatically rescale the trackmania window to something smaller (unless you manually changed the corresponding parameters in config.json).
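(For reference, I believe the relevant entries in config.json look like the following; the key names here are from memory and should be double-checked against your file, but the 256 x 128 resolution you mentioned earlier matches these defaults:)

```json
"WINDOW_WIDTH": 256,
"WINDOW_HEIGHT": 128
```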
If instead you are using the Lidar environment, you need to set the camera in 1st person view so that you don't see the car. You can use python -m tmrl --check-environment and drive around to see whether the observations and rewards make sense before starting to train.
Can I ask why my memory_len is 1000000 and won't increase anymore?
INFO:root:Memory updated with steps:200, batch size:256, memory size:100000000. This is the error I get when I add two more zeros to the memory size.
Also, I want to change my tmrl data to another file path. How do I do that?
We could easily add an option for changing the TmrlData path, but this is not done atm. If you want to do this, you will have to clone the repo and manually change the value of TMRL_FOLDER here.
You can install your local version of the repo by cd-ing to where the setup.py file is and doing pip install -e .
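Something like this, assuming the standard GitHub location of the repo:

```
git clone https://github.com/trackmania-rl/tmrl.git
cd tmrl
# edit the TMRL_FOLDER constant mentioned above, then install in editable mode:
pip install -e .
```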
Hey, I have a question about time_step_timeout_factor: I set it to 80, but it stops at 48 seconds.
Wow, don't do this. The timestep timeout factor should not exceed 2.0, otherwise it becomes entirely meaningless.
Closing as this is not a tmrl issue. Please open a thread in the discussions section for help and questions.
You mean the timestep timeout factor should not exceed 2.0? It keeps respawning over and over, so I asked ChatGPT and got this answer.
@yannbouteiller Then how should I set my time in seconds? It's weird: it keeps respawning and then doesn't stop.
Respawns are not related to the timeout factor, they happen because the agent is failing to collect reward.
It probably fails to collect reward because your environment has an issue. You need to use python -m tmrl --check-environment to find out what.
@yannbouteiller Thanks bro, I will take a look.
A very quick question: can I change my memory_size to something like 2000000?
I checked my environment and nothing is wrong, but the time stops at 50 seconds.
https://youtu.be/2NkFNORkdD0?si=S3krB2NBnP2FN1P2 At 2:10 in the video is where I am, and it is 50 seconds.
@yannbouteiller Check this out. I think my environment is fine, but I don't know why it keeps getting stuck at 50 seconds.
I didn't change anything in the config.
The 50-second limit is expected; it is the default time-limit in the example TrackMania pipeline. Your environment looks sane from your screenshots.
For the default time-limit in the example TrackMania pipeline, how can I change it, or is it fixed?
You can change it in config.json by changing the values of both the "ep_max_len" and the "RW_MAX_SAMPLES_PER_EPISODE" entries.
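For example, something like this; this is a sketch rather than a verbatim excerpt (the exact nesting of these keys in config.json may differ), and assuming the default 0.05 s time step, 1000 samples is the 50-second limit you saw, so 2000 would give you about 100 seconds:

```json
"ep_max_len": 2000,
"RW_MAX_SAMPLES_PER_EPISODE": 2000
```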
It's good that the time has changed, but how can I make the parameters better? Because the car continues to tremble.
1. Is it my record_reward problem, or not enough training?
2. Is there enough road space for the scan?
The jitter is inherent to how Soft Actor-Critic trains policies. If you want to get rid of it, one solution is to penalize large changes in the steering in the reward function. This would require you to learn Python programming, clone the repository and adapt the environment's code.
Otherwise you can play with the SAC hyperparameters in config.json, in particular the "alpha" term, which is responsible for injecting entropy into the policy.
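For illustration, here is a minimal sketch of the steering-penalty idea as a gymnasium wrapper. This is not tmrl code; in particular, which action component is the steering command is an assumption you would need to check against the actual environment:

```python
import gymnasium as gym


class SmoothSteeringWrapper(gym.Wrapper):
    """Subtract a penalty proportional to the change in steering between steps."""

    def __init__(self, env, penalty_coef=0.1, steer_index=-1):
        super().__init__(env)
        self.penalty_coef = penalty_coef
        self.steer_index = steer_index  # which action component is steering (assumed)
        self.prev_steer = 0.0

    def reset(self, **kwargs):
        self.prev_steer = 0.0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        steer = float(action[self.steer_index])
        # penalize large steering changes to discourage jitter
        reward -= self.penalty_coef * abs(steer - self.prev_steer)
        self.prev_steer = steer
        return obs, reward, terminated, truncated, info
```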
You mean I need to increase the value? To something like 0.05, or less?
If you want to see less jitter, you should decrease it, I believe, but this will also harm exploration.
Can I ask what loss_actor is?
You need to read the Soft Actor-Critic paper to understand this.
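For reference (and up to implementation details of this codebase), the loss_actor logged during training corresponds to the SAC policy objective from that paper:

$$J_\pi(\phi) = \mathbb{E}_{s_t \sim \mathcal{D},\; a_t \sim \pi_\phi}\big[\alpha \log \pi_\phi(a_t \mid s_t) - Q_\theta(s_t, a_t)\big]$$

so a more negative loss_actor roughly means the critic assigns higher value to the actions the policy selects, relative to the entropy bonus.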
I have been trying to record a reward on a track.
Later, when I attempted to run check-environment, all of the errors disappeared. I'm not sure whether this is a bug or what, but it kept running at the respawn location.