InhwanBae / LMTrajectory

Official Code for "Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction (CVPR 2024)"
https://ihbae.com/publication/lmtrajectory/

ETH Performance in ETH/UCY #5

Open sskim0126 opened 2 weeks ago

sskim0126 commented 2 weeks ago

Hello, first of all, thank you for your code and work.

I have run the evaluation using your datasets and LMTraj-SUP weights, and the performance results are as follows:

| Scene | minADE | minFDE |
| --- | --- | --- |
| ETH | 0.5109 | 0.8710 |
| Hotel | 0.1212 | 0.1584 |
| Univ | evaluation takes too long | |
| Zara1 | 0.2019 | 0.3227 |
| Zara2 | 0.1763 | 0.2763 |

The performance for the ETH dataset is worse than the ADE/FDE (0.4087 / 0.5011) presented on GitHub. Could you please help me understand why this might be the case?
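
For context, the minADE/minFDE values above follow the standard best-of-K displacement definitions over K sampled future trajectories. Below is a minimal NumPy sketch of that computation; it is a generic illustration, not the repository's eval_accelerator code, and the array shapes are assumptions:

```python
import numpy as np

def min_ade_fde(pred, gt):
    """Best-of-K displacement errors.

    pred: (K, T, 2) array of K sampled future trajectories over T timesteps.
    gt:   (T, 2) array with the ground-truth future trajectory.
    Returns (minADE, minFDE) over the K samples.
    """
    # Per-sample, per-timestep Euclidean distance to the ground truth.
    dist = np.linalg.norm(pred - gt[None], axis=-1)   # (K, T)
    ade = dist.mean(axis=-1)                          # average error over timesteps
    fde = dist[:, -1]                                 # error at the final timestep
    return ade.min(), fde.min()

# Toy usage: 20 samples, 12 future frames (the usual ETH/UCY setting).
rng = np.random.default_rng(0)
gt = rng.normal(size=(12, 2))
pred = gt[None] + 0.1 * rng.normal(size=(20, 12, 2))
print(min_ade_fde(pred, gt))
```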

InhwanBae commented 2 weeks ago

Hi @sskim0126,

Thank you for your interest in our work! Unfortunately, I was not able to reproduce the issue on my side: after re-downloading and running all the files from GitHub, I obtained the results published in the paper. Could you share some details about your evaluation setup? Since the performance on the other scenes matches ours almost exactly, your code setup itself seems fine.

For your reference, the performance metrics reported on GitHub were obtained with ./model/eval_accelerator_old.py and "per_device_inference_batch_size": 1024. After publication, we completely rewrote the evaluation code as ./model/eval_accelerator.py, making it up to 5 times faster. This rewritten version may introduce slight variations in the numbers, but the overall differences should be minimal.
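
As a concrete illustration of that setting, a hypothetical check like the one below compares a local config against the batch size used for the reported numbers; the file name inference_config.json and the surrounding structure are assumptions for illustration, only the key name and the value 1024 come from this thread:

```python
import json

# Batch size used when producing the numbers reported on GitHub (from this thread).
REPORTED_BATCH_SIZE = 1024

# Hypothetical local config file; the actual file name/layout may differ.
with open("inference_config.json") as f:
    cfg = json.load(f)

local = cfg.get("per_device_inference_batch_size")
if local != REPORTED_BATCH_SIZE:
    print(f"Note: local per_device_inference_batch_size={local}, "
          f"but the GitHub numbers were produced with {REPORTED_BATCH_SIZE}.")
```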

InhwanBae commented 2 weeks ago

Hi @sskim0126,

I hope all is well! I was wondering whether you have had a chance to check if the issue has been resolved. Could you please share details of your evaluation system, including the type and number of GPUs, as well as the Python and CUDA versions you are using? Also, the bash output for both the deterministic and the stochastic evaluation would greatly help me replicate the problem. (The deterministic evaluation finishes within a few seconds!) Thank you!
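
For convenience, a short snippet like the following prints most of that environment information at once (a generic sketch assuming a PyTorch environment; it is not part of the repository):

```python
import platform
import torch

# Quick environment report for the issue: Python, PyTorch, CUDA, and GPU details.
print("Python:", platform.python_version())
print("PyTorch:", torch.__version__)
print("CUDA (built with):", torch.version.cuda)
print("GPUs:", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(f"  [{i}]", torch.cuda.get_device_name(i))
```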