Questions About Generating the .h5 File

yuxinl915 commented 2 years ago

Hi! Thanks so much to the author for the great effort! However, when I generated a .h5 file by running test_tnt.py and uploaded it to https://evalai.cloudcv.org/, the submission failed with the stderr below. Has anyone run into a similar issue? :-(

Traceback (most recent call last):
  File "/code/scripts/workers/submission_worker.py", line 491, in run_submission
    submission_metadata=submission_serializer.data,
  File "/tmp/tmpalmlcho0/compute/challenge_data/challenge_454/main.py", line 109, in evaluate
    outputs[k] = eval_forecasting.compute_forecasting_metrics(output,gt,city_dict,k,30,2,prob)
  File "/code/argoverse-api/argoverse/evaluation/eval_forecasting.py", line 215, in compute_forecasting_metrics
    forecasted_probabilities,
  File "/code/argoverse-api/argoverse/evaluation/eval_forecasting.py", line 94, in get_displacement_errors_and_miss_rate
    max_num_traj = min(max_guesses, len(forecasted_trajectories[k]))
KeyError: 11800

I only used the pre-trained TNT model and ran it on interm_data_small, both provided by the author. I would really appreciate any help!

Hello! I'm a beginner and have a few questions. I'm setting up my environment using the data in the interm_data_small you provided and the pre-trained TNT model (the one in the best_TNT.pth you provided). Everything ran without errors in my local terminal and I successfully obtained argoverse_forecasting_baseline.h5, but after uploading it to https://evalai.cloudcv.org/ the submission was marked as failed, and the stderr contained the traceback shown above.

Could you tell me where the problem is and give me some advice? Thank you!

Henry1iu commented 2 years ago

Hi,

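Thank you. I'm hardly an expert, but feel free to reach out with any questions.

There are two causes: 1) Remember to upload results produced from the test set to their server. 2) The small dataset I shared is incomplete and is intended for debugging only. If the samples in the .h5 file are incomplete, a similar error will occur.

If you want to test the pre-trained model I provided, you need to generate the complete test set yourself.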

Best Regards, Jianbang
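
The KeyError in the traceback is raised when the evaluator looks up a sequence ID that the submission does not contain. Below is a minimal sketch of a local completeness check to run before uploading. It assumes the forecasts are held in a dict keyed by integer sequence ID and that the test split stores one `<seq_id>.csv` file per sequence; the function names and the directory layout are illustrative assumptions, not part of this repository.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Set


def expected_sequence_ids(csv_dir: str) -> Set[int]:
    """Collect sequence IDs from files named '<seq_id>.csv' (assumed layout)."""
    return {int(p.stem) for p in Path(csv_dir).glob("*.csv") if p.stem.isdigit()}


def missing_sequence_ids(forecasts: Dict[int, object], expected: Iterable[int]) -> List[int]:
    """Return the expected IDs that have no forecast, in ascending order."""
    return sorted(set(expected) - set(forecasts))


if __name__ == "__main__":
    # Toy data: the evaluator expects three sequences, the submission has two.
    forecasts = {11798: "traj", 11799: "traj"}
    print(missing_sequence_ids(forecasts, [11798, 11799, 11800]))  # [11800]
```

An empty result means every expected sequence has a forecast; any ID it returns would trigger the same kind of KeyError on the server.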

yuxinl915 commented 2 years ago

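Thanks for the instant reply! I have one follow-up question about using the complete dataset. For now I only want to use the complete test set: I don't have a good GPU, so a full training run would take too long. I'd like to run the pre-trained TNT directly on the complete test set, but preprocessing.bash doesn't seem to have an option for generating test_intermediate on its own. Would generating only the test set be a lot of trouble (i.e., would it need large code changes that could easily introduce bugs)? If so, I'll just preprocess the whole dataset.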


Henry1iu commented 2 years ago

Hi,

You can modify the code directly: https://github.com/Henry1iu/TNT-Trajectory-Prediction/blob/d96d01c745279d56d43e128fff95469e0eeced5e/core/util/preprocessor/argoverse_preprocess_v2.py#L427

The dataloader can be modified in a similar way so that only the test set is generated.

yuxinl915 commented 2 years ago

OK, got it, thanks! I'll give it a try.

yuxinl915 commented 2 years ago

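Hello! The approach you suggested does work. But I've now run into a new problem: after "Transforming the data to GraphData" finishes, the process suddenly prints "Killed". A folder with the corresponding timestamp is created under test_result, but it is empty. My guess is that something interrupted the process, but I got no error message and couldn't find any related posts on Google, so I'd like to ask for your help again. The output after running python test_tnt.py in the terminal is below (I checked -rm and the other relevant command-line arguments and found nothing wrong).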

Processing...
Loading Raw Data...: 100%|██████████████| 78143/78143 [00:38<00:00, 2023.63it/s]
[Argoverse]: The maximum of valid length is 305.
[Argoverse]: The maximum of no. of candidates is 4002.
Transforming the data to GraphData...: 100%|█| 78143/78143 [13:59<00:00, 93.10it
Killed


yuxinl915 commented 2 years ago

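Hello! I dug into this a bit more and suspect a preprocessing problem, because my intem_data/test_intermediate folder contains only a single data.pt. I'm not sure whether the problem I just reported is caused by something in preprocessing, so I'd like to ask you about it. Also, I don't quite understand your earlier suggestion to modify the dataloader in a similar way (I'm not sure whether my change to the dataloader is what went wrong). Could you explain it in more detail? Thank you!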

Henry1iu commented 2 years ago

Hi,

How much memory (RAM) does your machine have?

Henry1iu commented 2 years ago

The line of code I pointed to iterates over the three splits. You can delete "train" and "val" and process only "test".
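
A minimal sketch of that change, assuming the referenced line is a loop over the split names. The loop body and the `preprocess_split` callable are hypothetical stand-ins, not the repository's actual code.

```python
from typing import Callable, List, Sequence

ALL_SPLITS = ("train", "val", "test")


def run_preprocessing(
    preprocess_split: Callable[[str], None],
    splits: Sequence[str] = ALL_SPLITS,
) -> List[str]:
    """Run the (hypothetical) per-split step for the requested splits only."""
    done = []
    for split in splits:
        if split not in ALL_SPLITS:
            raise ValueError(f"unknown split: {split!r}")
        preprocess_split(split)
        done.append(split)
    return done


if __name__ == "__main__":
    # Restrict the loop to the test split instead of iterating all three.
    run_preprocessing(lambda s: print(f"processing {s}"), splits=["test"])
```

Passing the split list as a parameter has the same effect as deleting "train" and "val" from the loop, while leaving the full preprocessing path available.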

yuxinl915 commented 2 years ago

Available memory is 46G.

yuxinl915 commented 2 years ago

Yes, I understood that part and changed the code you pointed to. But you also wrote, "The dataloader can be modified in a similar way so that only the test set is generated," and I'm not sure which part of the code the dataloader refers to.

Henry1iu commented 2 years ago

I suspect the process ran out of memory. You can expand the available memory by adding swap; I recommend bringing memory + swap to 64G in total.
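
A process that ends with a bare "Killed" and no Python traceback is consistent with the Linux out-of-memory killer. Below is a rough sketch for checking whether RAM plus swap reaches the suggested total, assuming a Linux machine that exposes `/proc/meminfo`. The helper names are illustrative and not part of this repository.

```python
import os


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: leading integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def ram_plus_swap_gib(info: dict) -> float:
    """Total RAM plus total swap, converted from kB to GiB."""
    return (info.get("MemTotal", 0) + info.get("SwapTotal", 0)) / 1024**2


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only; skipped silently elsewhere
    if os.path.exists(path):
        with open(path) as f:
            total = ram_plus_swap_gib(parse_meminfo(f.read()))
        print(f"RAM + swap: {total:.1f} GiB (suggested total: about 64G)")
```

If the reported total is well below the suggested value, adding swap before rerunning test_tnt.py is the first thing to try.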

Henry1iu commented 2 years ago

See the code: https://github.com/Henry1iu/TNT-Trajectory-Prediction/blob/9180637f81304c3d9aaf28ea88e572d779618e36/core/dataloader/argoverse_loader_v2.py#L389

yuxinl915 commented 2 years ago

Thanks! I'll go try it now.

Henry1iu / TNT-Trajectory-Prediction

Questions About Generating the .h5 File #30