yejimun / PaS_CrowdNav

Occlusion-Aware Crowd Navigation Using People as Sensors: ICRA2023
MIT License

Question about your loss function #4

Closed CAI23sbP closed 1 month ago

CAI23sbP commented 1 month ago

Long time no see, @yejimun. How have you been?

I have a question about your project. As you know, a Variational Autoencoder (VAE) typically uses a reconstruction loss, which can be either Mean Squared Error (MSE) or Cross Entropy. In my opinion, Cross Entropy seems more appropriate here: MSE corresponds to assuming a Gaussian distribution over each cell, while Cross Entropy corresponds to assuming a Bernoulli distribution, and the Occupancy Grid Map has values like [0, 0.5, 1].
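For reference, the correspondence between the two losses and their likelihood assumptions can be written out as follows. This is the standard textbook derivation, not something taken from the repository; the notation is an assumption for illustration, with $x_i$ a target cell, $\hat{x}_i$ the decoded cell, $N$ the number of cells, and $\sigma$ a fixed standard deviation.

```latex
% Gaussian likelihood with fixed \sigma:
% the negative log-likelihood is the squared error up to a scale and a constant
-\log p(x \mid \hat{x})
  = \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2
  + \frac{N}{2} \log\left(2\pi\sigma^2\right)

% Bernoulli likelihood:
% the negative log-likelihood is the binary cross-entropy
-\log p(x \mid \hat{x})
  = -\sum_{i=1}^{N} \left[ x_i \log \hat{x}_i + (1 - x_i) \log(1 - \hat{x}_i) \right]
```

Strictly speaking, the Bernoulli form is a likelihood only for $x_i \in \{0, 1\}$. With a target of 0.5 it is still a valid cross-entropy against a soft label, minimized at $\hat{x}_i = 0.5$, but its minimum value there is $\log 2$ rather than 0.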

However, you used MSE as the reconstruction loss. Could you explain why you made that choice?

yejimun commented 1 month ago

I'm doing great, @CAI23sbP :)

Our sensor occupancy grid map (sensor grid) indeed consists of values like [0, 0.5, 1], but the estimated OGM (decoded) generated by the model takes continuous values between 0 and 1. While Cross Entropy could be used under the assumption of independent Bernoulli-distributed cells, our focus was on capturing the overall structural similarity between the ground truth OGM (label grid) and the estimated OGM (decoded), rather than strictly evaluating cell-wise accuracy. By using Mean Squared Error (MSE), we prioritize minimizing global structural differences, which aligns better with our objective of preserving spatial patterns across the entire grid.
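To make the comparison concrete, here is a minimal sketch that evaluates both losses on a tiny grid with targets in [0, 0.5, 1] and a continuous decoded output. It is not the PaS_CrowdNav code: the function names `mse_loss` and `bce_loss`, the clamping constant `eps`, and the example grids are all hypothetical and chosen only for illustration.

```python
import math

def mse_loss(decoded, label):
    """Mean of squared cell-wise differences over the whole grid."""
    cells = [(d - t) ** 2
             for row_d, row_t in zip(decoded, label)
             for d, t in zip(row_d, row_t)]
    return sum(cells) / len(cells)

def bce_loss(decoded, label, eps=1e-7):
    """Mean binary cross-entropy; predictions are clamped away from 0 and 1."""
    total, count = 0.0, 0
    for row_d, row_t in zip(decoded, label):
        for d, t in zip(row_d, row_t):
            p = min(max(d, eps), 1.0 - eps)  # avoid log(0)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count

# Hypothetical 2x3 label grid (free = 0, unknown = 0.5, occupied = 1)
label = [[0.0, 0.5, 1.0],
         [0.0, 0.5, 1.0]]
# Hypothetical decoded grid with continuous values in (0, 1)
decoded = [[0.1, 0.5, 0.9],
           [0.2, 0.4, 0.7]]

print("MSE:", mse_loss(decoded, label))
print("BCE:", bce_loss(decoded, label))

# For a perfect reconstruction MSE is exactly 0, while BCE stays above 0
# because every cell with target 0.5 contributes log(2).
print("MSE (perfect):", mse_loss(label, label))
print("BCE (perfect):", bce_loss(label, label))
```

In this sketch both losses are averaged over cells, so the difference lies in how each one penalizes a given per-cell error. One practical consequence of the 0.5 targets is that the BCE value has a nonzero floor, whereas the MSE value reaches 0 when the decoded grid matches the label grid.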

CAI23sbP commented 1 month ago

@yejimun Thank you for your reply!