HarshayuGirase / Human-Path-Prediction

State-of-the-art methods for human trajectory forecasting. Contains code for papers published at ECCV 2020 and ICCV 2021.
MIT License

YNet: More details about the final training on ETH/UCY Dataset #35

Closed (HRHLALALA closed this 2 years ago)

HRHLALALA commented 2 years ago

Hi, could you provide more experimental details about the final training of Y-Net on the ETH/UCY dataset that produced the scores reported in the paper? As in #29, I cannot reproduce the paper's results either. Specifically, I have the following questions:

Or can you provide the pretrained weights and the pre-processed datasets?


HarshayuGirase commented 2 years ago


Hope this helps!

HRHLALALA commented 2 years ago

Hi Harshayu, thank you very much for your response. It is really helpful! For the first question, may I confirm whether you deleted the following code for the final training?

# train.py

# TODO Delete
if dataset_name == 'eth':
    counter += batch_size
    # Break after certain number of batches to approximate evaluation, else one epoch takes really long
    if counter > 30: #TODO Delete
        break

HarshayuGirase commented 2 years ago

Yes, I believe this was deleted for the final experiments.
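
To make the effect of that guard concrete, here is a minimal sketch of the loop logic. `batches_run_per_epoch` is a hypothetical helper, not code from train.py, and it assumes the counter starts at zero in each epoch and that the batch on which the threshold is crossed still counts as run.

```python
def batches_run_per_epoch(num_batches, batch_size, limit=30, early_break=True):
    """Count how many batches one epoch runs, with or without the early-break guard.

    Hypothetical illustration of the snippet quoted above, not the repository's code.
    """
    counter = 0
    batches_run = 0
    for _ in range(num_batches):
        batches_run += 1
        if early_break:
            counter += batch_size
            # Same condition as the quoted guard: stop once the counter passes the limit.
            if counter > limit:
                break
    return batches_run


print(batches_run_per_epoch(100, batch_size=8))                     # 4
print(batches_run_per_epoch(100, batch_size=16))                    # 2
print(batches_run_per_epoch(100, batch_size=8, early_break=False))  # 100
```

With the guard in place an epoch stops after only a handful of batches, so removing it means every epoch covers the full training set and takes correspondingly longer.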

HRHLALALA commented 2 years ago


> Hope this helps!

Thanks for your updated reply! I have replaced all Convs with deformable convolutions, but training is really slow (more than one hour per epoch). Note that I train the model on an RTX 3090 with a batch size of 8. Was it the same for you during the final training? I just want to confirm that all my configurations match yours.

Here is my implementation of DeformConv2d using torchvision.ops.DeformConv2d:

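from torchvision.ops import DeformConv2d as _DeformConv2d
import torch.nn as nn


class DeformConv2d(_DeformConv2d):
    """Deformable conv that predicts its own offsets from the input."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Predicts 2 offsets (dy, dx) per kernel position (assumes offset_groups == 1).
        # in_channels, stride, padding and dilation are missing from the posted snippet;
        # they are assumed here so the offset map matches the layer's output size.
        self.offset_conv = nn.Conv2d(
            in_channels=self.in_channels,
            out_channels=2 * self.kernel_size[0] * self.kernel_size[1],
            kernel_size=self.kernel_size,
            stride=self.stride,
            padding=self.padding,
            dilation=self.dilation,
            bias=self.bias is not None,
        )

    def forward(self, x, mask=None):
        offset = self.offset_conv(x)
        return super().forward(x, offset, mask)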
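
As a sanity check on the shapes involved, the sketch below computes, in plain Python, the offset tensor shape that a deformable convolution expects for a given input. `conv_out_size` and `expected_offset_shape` are hypothetical helpers written for illustration. They use the standard convolution output-size rule and the `2 * kH * kW` offset channel count from the snippet above, assuming a single offset group.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def expected_offset_shape(batch, height, width, kernel_size,
                          stride=1, padding=0, dilation=1):
    """Offset tensor shape (N, 2*kH*kW, H_out, W_out), assuming one offset group."""
    k_h, k_w = kernel_size
    h_out = conv_out_size(height, k_h, stride, padding, dilation)
    w_out = conv_out_size(width, k_w, stride, padding, dilation)
    return (batch, 2 * k_h * k_w, h_out, w_out)


print(expected_offset_shape(8, 64, 64, (3, 3), padding=1))            # (8, 18, 64, 64)
print(expected_offset_shape(8, 64, 64, (3, 3), stride=2, padding=1))  # (8, 18, 32, 32)
```

The convolution that predicts the offsets has to use the same stride, padding, and dilation as the deformable layer, otherwise its output will not have this spatial size.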

HarshayuGirase commented 2 years ago


We used a 16GB V100 for training. I was able to find a copy of a training log (not sure if this is the final model we used since it doesn't have deformable conv parameters, will try to double check on this) but hopefully this should help:

{'ade_loss_lambda': 1, 'batch_size': 16, 'centroid': 'unweighted', 'decoder_channels': [64, 64, 64, 32, 32], 'encoder_channels': [32, 32, 64, 64, 64], 'est_samples': 500, 'kernlen': 31, 'learning_rate': 0.0005, 'loss_scale': 1000, 'name': 'oracle_medium2', 'nsig': 16.0, 'num_epochs': 200, 'obs_len': 8, 'pred_len': 12, 'rel_thresh': 0.0001, 'resize': 0.5, 'scene': 'zara1', 'skip_samples': 0, 'softargmax': 1, 'temp': 0.5, 'total_len': 20, 'viz_epoch': 10}
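
For anyone comparing their own run against this log, a small sketch like the following lists the settings that differ. `diff_params` is a hypothetical helper, `logged` copies a few entries from the log above, and `mine` is a made-up example configuration in which only the batch size of 8 comes from this thread.

```python
def diff_params(reference, candidate):
    """Return {key: (reference_value, candidate_value)} for every key that differs.

    A key missing from one side is reported with None on that side.
    """
    diffs = {}
    for key in sorted(set(reference) | set(candidate)):
        ref_val = reference.get(key)
        cand_val = candidate.get(key)
        if ref_val != cand_val:
            diffs[key] = (ref_val, cand_val)
    return diffs


# A few entries copied from the training log above.
logged = {'batch_size': 16, 'learning_rate': 0.0005, 'num_epochs': 200, 'resize': 0.5}
# Hypothetical local configuration; only batch_size=8 is taken from this thread.
mine = {'batch_size': 8, 'learning_rate': 0.0005, 'num_epochs': 200, 'resize': 0.5}

print(diff_params(logged, mine))  # {'batch_size': (16, 8)}
```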

NociTUM commented 1 year ago


Were you able to reproduce the results in the end? If so, would you mind sharing your notebook or training file?

I have already done several training runs on the whole dataset but cannot reproduce the ADE/FDE values reported for the ETH/UCY dataset.

Thanks in advance!

HRHLALALA commented 1 year ago

Unfortunately, the process is quite stochastic and time-consuming, and I could not reproduce the performance reported in the paper with a limited number of epochs. It may be possible after training for a long time (e.g. a few weeks) with well-tuned hyperparameters. There is a reproduction report: https://openreview.net/pdf?id=HV2zgpM7n0F.

NociTUM commented 1 year ago

Thank you very much for the insight, the report is quite helpful! One last question: Did you use your DeformConv2d implementation at the end to produce those numbers or did you stick to the original implementation? Along with that, do you still have the .pt-file with the network weights by any chance?

HRHLALALA commented 1 year ago

> Thank you very much for the insight, the report is quite helpful! One last question: Did you use your DeformConv2d implementation at the end to produce those numbers or did you stick to the original implementation? Along with that, do you still have the .pt-file with the network weights by any chance?

Sorry for the late reply. Yes, but DeformConv2d results in longer training. Unfortunately, I could not reproduce the ETH/UCY results on any of my machines.

I have also tried to experiment with GoalSAR for some clues. Please see the issue https://github.com/luigifilippochiara/Goal-SAR/issues/2. This model actually uses the same structure as YNet except that the waypoints sampling is implemented using transformers and the data augmentations are more comprehensive. Hope this helps.