bradyz / 2020_CARLA_challenge

"Learning by Cheating" (CoRL 2019) submission for the 2020 CARLA Challenge

Not able to train STAGE_1 #40

Closed petrovicu closed 3 years ago

petrovicu commented 3 years ago

Hi,

I checked out the master branch of the project and ran the provided pre-trained models, which worked as described. I am using CARLA version 0.9.10.1.

Then I downloaded the provided dataset to run training and got much worse results. I used batch size 32 for both stages, tried both values for the command coefficient (0.1 and 0.01), lr=0.0001, temp=10, sample_by=even, hack=True, and 50 (stage1) + 90 (stage2) epochs. Since I am using the latest CARLA version and the master branch has been updated for it, I assumed the problem occurred because the provided dataset was collected with an older version (for example, with different classes in the semantic map). Is this correct?

With this in mind, I used the provided autopilot to collect the same amount of data from the latest version. Nevertheless, the training results were still bad. Since the whole process is time-consuming, I wrote an evaluation script for stage1 to check whether it works properly. It worked great with the checkpoints you provided (epoch=34.ckpt for both cc values), but not with mine. It looks like stage1 already introduces a problem, which then causes stage2 to perform poorly. BTW, I also ran the stage1 evaluation script on my stage1 checkpoints trained with your provided dataset, and they also performed poorly.

Do you have any idea why this is happening?

Regards

bradyz commented 3 years ago

can you link me to some of your wandb runs from training? i want to see some of the visualizations of the model's predictions

petrovicu commented 3 years ago

Thanks for the quick response!

Stage_1 training with the original data you provided: URL link 1

Stage_1 training with data collected from CARLA 0.9.10.1: URL link 2

petrovicu commented 3 years ago

Hi @bradyz,

Your hint to check the visualizations during training was a good one. It looks like the target point is wrong; take a look at the positions of the white dot (this is my train_image from wandb): image

I also noticed that the GPS sensor values from CARLA 0.9.10.1 differ from those in previous versions. For the same map and the same route (route_08.xml), I got the following in the two CARLA versions:

# CARLA 0.9.9
gps = [48.99706601, 8.0028032]
...
mean = np.array([49.0, 8.0])
scale = np.array([111324.60662786, 73032.1570362])
gps_after_normalization_and_scaling = (gps - mean) * scale
...
gps_after_normalization_and_scaling = [-326.62542881, 204.72399902]

# CARLA 0.9.10.1
gps = [-0.0029339, 0.00183903]
...
mean = np.array([49.0, 8.0])
scale = np.array([111324.60662786, 73032.1570362])
gps_after_normalization_and_scaling = (gps - mean) * scale
...
gps_after_normalization_and_scaling = [-5455232.34057393, -584122.94774424]

It looks like the new GPS data are already normalized (but not quite as expected), so after removing the mean subtraction I get:

# CARLA 0.9.10.1
gps = [-0.0029339, 0.00183903]
scale = np.array([111324.60662786, 73032.1570362])
gps_after_scaling = gps * scale
# and I got:
gps_after_scaling = [-326.61526339, 134.30832775]
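
The three computations above can be reproduced with a small standalone sketch. It uses only the constants and sample readings quoted in this comment; the helper name `to_local` is hypothetical and is not taken from the repository.

```python
# Constants copied from the thread; to_local is a hypothetical helper name.
MEAN = (49.0, 8.0)
SCALE = (111324.60662786, 73032.1570362)


def to_local(gps, mean=MEAN, scale=SCALE):
    """Return (gps - mean) * scale, element-wise."""
    return tuple((g - m) * s for g, m, s in zip(gps, mean, scale))


gps_099 = (48.99706601, 8.0028032)    # reading quoted for CARLA 0.9.9
gps_0910 = (-0.0029339, 0.00183903)   # reading quoted for CARLA 0.9.10.1

# Original pipeline on the 0.9.9 reading.
old = to_local(gps_099)

# Same pipeline on the 0.9.10.1 reading: the mean is subtracted from a
# value that is already near zero, which produces the huge offsets.
double_subtracted = to_local(gps_0910)

# 0.9.10.1 reading with the mean subtraction removed.
no_mean = to_local(gps_0910, mean=(0.0, 0.0))

print(old)
print(double_subtracted)
print(no_mean)
```

With the mean removed, the first coordinate agrees with the 0.9.9 result to within about a centimetre, while the second still differs by roughly 70, which suggests that the second scale factor is also off for the new readings.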

And btw, how did you get these exact values for mean and scale?

self.mean = np.array([49.0, 8.0])
self.scale = np.array([111324.60662786, 73032.1570362])

petrovicu commented 3 years ago

Hi @bradyz ,

The problem was that the gnss values from CARLA 0.9.10.1 are already normalized using OpenDrive geo-reference values (49.0, 8.0), so there is no need to do it again on your side. As a result, the scale factor should be: scale = np.array([111324.60662786, 111324.60662786]).
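
The proposed scale can be checked against the numbers quoted earlier in the thread with a minimal sketch. It assumes the scale factors are in metres per degree. The cosine comparison is one plausible reading of where the original longitude scale comes from, not something confirmed in the thread, and `to_local_0910` is a hypothetical helper name.

```python
import math

LAT_SCALE = 111324.60662786      # from the thread
OLD_LON_SCALE = 73032.1570362    # original longitude scale, from the thread

# Assumption (not confirmed in the thread): the original longitude scale is
# roughly the latitude scale times cos(49 degrees), the reference latitude.
ratio = OLD_LON_SCALE / LAT_SCALE
cos_ref = math.cos(math.radians(49.0))
print(ratio, cos_ref)


def to_local_0910(gps, scale=(LAT_SCALE, LAT_SCALE)):
    """Scale a CARLA 0.9.10.1 reading without subtracting any mean."""
    return tuple(g * s for g, s in zip(gps, scale))


# Reading quoted earlier in the thread for CARLA 0.9.10.1.
x, y = to_local_0910((-0.0029339, 0.00183903))
print(x, y)
```

With the same scale on both axes, the 0.9.10.1 reading lands within about a centimetre of the 0.9.9 result quoted above ([-326.62542881, 204.72399902]), which is consistent with the fix described in this comment.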

You can close this one.

bradyz commented 3 years ago

sorry for the slow response! thanks for figuring this one out - I'll need to make sure this bit isn't as hacky