Open Kim-mins opened 2 months ago
Hi, @Kim-mins!
Thank you for pointing out this issue. During the execution, some initial seeds may indeed trigger the error or result in nan values, which could be related to a calculation issue in the torch_geometric library. We are actively working on identifying and fixing the problem.
In the meantime, I suggest skipping the seeds that trigger these errors. Please feel free to reach out if you make any further discoveries or need additional assistance.
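A minimal sketch of that workaround, assuming each seed can be scored through a single callable. The names `usable_seeds` and `evaluate` are hypothetical and are not part of ScenarioFuzz; `evaluate` stands in for whatever call invokes SEM on a seed.

```python
import math


def usable_seeds(seeds, evaluate):
    """Return the seeds for which `evaluate` runs cleanly and yields a non-nan score.

    `evaluate` is a hypothetical stand-in for the SEM call; it is assumed to
    take one seed and return a float.
    """
    kept = []
    for seed in seeds:
        try:
            score = evaluate(seed)
        except Exception as exc:  # e.g. a dimension error raised inside the model
            print(f"skipping {seed}: {exc}")
            continue
        if math.isnan(score):
            print(f"skipping {seed}: score is nan")
            continue
        kept.append(seed)
    return kept
```

Filtering the seed list once before a run keeps a failing seed from stalling every cycle that would otherwise pick it.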
Thanks again!
Thank you for the response @AtongWang!
Good to hear that. I hope those errors are resolved soon!
I also tried your suggestion, and I have the following two questions:
Q1.
When using SEM improve-v3 on Town01, it seems that all 5 seeds either hit a dimension error or output nan, as above.
So I tried improve-v2, and the only working seed on Town01 is "2". Is it OK to run ScenarioFuzz with this single seed?
(Every seed is from scenario_lib in the repository.)
Q2.
I would also like to ask whether the current implementation supports online learning.
When I read your paper, I thought the algorithm works with online learning (for SEM), but I could not find any code for online learning in fuzzer_eval.py. Is my understanding correct?
Thank you!
Hi @Kim-mins!
Thank you for your feedback!
For Q1, you're correct. This issue does exist, and it raises another point: during my testing, errors were more biased towards Town02-05, so it's possible that the usable seeds for Town01 are indeed fewer. In my SEM training, the data distribution from Town01 was minimal, which could be causing certain computational errors when invoking my SEM. This is one possibility. I recommend focusing on other maps for now. Alternatively, you could generate your own data for Town01 using fuzzer.py and train your own SEM.
As for Q2, we haven't explicitly written code for online training. It's incorporated into the method flow. Every time the test data accumulates to a multiple of 1K, we package that data and perform an update training for SEM. The SEM gets updated accordingly, which is why you'll see versions like improve-v0, v1, v2, etc.
Thank you for your patience!
Thank you for the kind and detailed response @AtongWang!
So maybe you mean the testing process pauses when you get 1K of data, and you manually train SEM with the collected data, and the SEM is used for the testing and the next 1K of data. Is my understanding correct?
Also, I wonder whether the error could be resolved if I train SEM myself. Maybe you are also running into the error, so I can't be sure for now.
Thanks a lot!
Hi @Kim-mins,
Yes, your understanding is correct. The testing process pauses when 1K data points are accumulated, and then SEM is manually trained with the collected data, which is used for testing the next batch.
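The pause-and-retrain cycle described above can be sketched roughly as follows. This is only an illustration: in the actual workflow the retraining step is done manually, and the function name, the `train` callback, the choice to pass all collected data (rather than only the newest batch), and the numbering of the `improve-vN` labels are assumptions made for this sketch.

```python
def run_with_periodic_retraining(results, train, batch_size=1000):
    """Collect test results and retrain each time the total hits a multiple of `batch_size`.

    `results` is any iterable of test records; `train` is a hypothetical
    callback taking (data, version_name). Returns the version names produced.
    """
    collected = []
    versions = []
    for record in results:
        collected.append(record)
        if len(collected) % batch_size == 0:
            # Testing pauses here while the model is updated (assumed naming scheme).
            version = f"improve-v{len(versions)}"
            train(list(collected), version)
            versions.append(version)
    return versions
```

Records that arrive after the last full batch stay in the collection but do not trigger another update until the next multiple of `batch_size` is reached.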
As for the error, it’s worth trying to train SEM yourself—it might help resolve the issue. We're indeed working on addressing this problem, and I appreciate your understanding and efforts to help improve it. Let's work together and keep each other updated on any progress!
Thanks again for your support!
Hi, @AtongWang!
I read your paper (thank you for the nice work!) and I'm trying to run ScenarioFuzz on my local machine. However, I'm having trouble running the code myself. Could you please help me?
Details
Here are some details: When I tried to run the code with the seeds given in the repository on Town01 (of CARLA), I encountered the error below, and it seems the error comes from the torch_geometric library:
Error message: (from the seed ==========USING SCENARIO SEED:Town01 - t_intersection - 0==========)
Also, sometimes I get nan from SEM, which makes the current cycle skip, so I cannot run the fuzzer at all. (nan is from the seed ==========USING SCENARIO SEED:Town01 - t_intersection - 2==========)
Environments
I've been running the code with Python 3.7 and CUDA 11.6, and set up every library (e.g., torch, torch_geometric, carla python api, ...) following requirements.txt, but I could not resolve it.
Arguments
Here are my arguments for running ScenarioFuzz:
I've also tried every kind of --eval-model and device, but the error still remains.
Thank you in advance!