agrimgupta92 / sgan

Code for "Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks", Gupta et al, CVPR 2018
MIT License

Unable to reproduce the results with a new model #99

Open LogicStuff opened 3 years ago

LogicStuff commented 3 years ago

I am not able to train a similarly performing model after changing all MLP activations to LeakyReLU (a change I made because of the previously reported issue with the loss not changing).
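For context, the point of the swap is that ReLU zeroes out negative pre-activations (and their gradients), which can freeze the loss, while LeakyReLU keeps a small slope. A minimal, dependency-free sketch of the two activations (the repo's MLP builder is assumed to expose an activation choice; these standalone functions are just for illustration):

```python
def relu(x):
    # Negative inputs are clamped to zero, so their gradient is zero too --
    # units stuck in the negative regime stop learning ("dead" units).
    return max(0.0, x)

def leaky_relu(x, negative_slope=0.2):
    # Negative inputs keep a small nonzero slope, so gradients still flow.
    return x if x >= 0.0 else negative_slope * x
```

For example, `relu(-1.0)` returns `0.0`, while `leaky_relu(-1.0)` returns `-0.2`, so a unit with negative input still contributes a gradient.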

At first, I used the hyperparameters from scripts/run_traj.sh and tried both pooling modules (although the default 2 m neighborhood is probably not sensible for social pooling). On both eth and zara1 I reached a validation ADE of ~0.7 and FDE of ~1.5 around epoch 70, shortly before the discriminator overpowered the generator and the whole model diverged.


While the best-model selection mechanism looks robust to this convergence issue, I would have expected more thorough monitoring of the loss trends. It is also pointless to keep training the model after a substantial divergence.

I have since experimented with many hyperparameter settings; a larger batch size (128 sequences per batch) seems to stabilize training the most, but I still cannot get below the evaluation metric values mentioned above. Even if I manage to train the model for 200-500 epochs with stable GAN losses (and diminishing D_loss_real), the predictions suffer greatly from some kind of directional bias: the trajectories of all pedestrians try to turn to the same heading, which remains constant across different inputs.

I have stopped worrying about the ADE and FDE metrics under the setting of N > 1, because of the issues discussed here. Unfortunately, with N=1 the metrics are too noisy... Perhaps I will just take the average instead of the minimum in evaluate_helper of scripts/evaluate_model.py and also focus on collisions.
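To make the proposed change concrete, here is a hedged sketch of the two aggregation modes over the K generated samples of one scene. The function and variable names are mine, not the repo's; the stock evaluate_helper keeps the minimum per scene (best-of-N), which rewards a single lucky sample, whereas the mean penalizes a spread-out sample set:

```python
def aggregate_ade(ade_per_sample, mode="min"):
    # ade_per_sample: one ADE value per generator sample for a single scene.
    if mode == "min":
        # Best-of-N, the evaluation protocol used in the paper.
        return min(ade_per_sample)
    # Mean-of-N: every sample counts, so the metric is less noisy but higher.
    return sum(ade_per_sample) / len(ade_per_sample)
```

For instance, `aggregate_ade([0.5, 1.0, 1.5], mode="min")` gives `0.5`, while `mode="mean"` gives `1.0` for the same samples.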

Update: I have also tried the (correct) hyperparameters extracted from the pretrained models via scripts/print_args.py. The vast majority of independent training runs also diverge within 50 epochs. Here, I have noticed that weakening the discriminator with --d_type local helps stabilize the losses. Since the discriminator then does not capture social interactions in any way, I am now also trying higher --g_steps/--d_steps ratios.
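As I understand the flags, --d_steps and --g_steps control how many discriminator and generator updates are taken per training iteration, so raising the g/d ratio gives the generator more updates between discriminator steps. A minimal sketch of that schedule (step_d and step_g are hypothetical stand-ins for the repo's per-network update functions):

```python
def train_iteration(batch, d_steps, g_steps, step_d, step_g):
    # Alternating GAN updates: d_steps discriminator steps,
    # then g_steps generator steps, per iteration.
    for _ in range(d_steps):
        step_d(batch)
    for _ in range(g_steps):
        step_g(batch)
```

With d_steps=1 and g_steps=2, each iteration performs one discriminator update followed by two generator updates, which slows the discriminator's relative progress.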

liuyu9661 commented 4 months ago

Hi, I have a similar problem. I tried to reproduce the results by just running train.py with no changes to the model, but the ADE and FDE are very large. I will check my log file to see what happened.