Question about how long the STAN model takes to run one epoch

LibCity / Bigscity-LibCity

LibCity: An Open Library for Urban Spatial-temporal Data Mining

https://libcity.ai/

Apache License 2.0


Question about how long the STAN model takes to run one epoch #350

Closed: KeFttan closed this issue 1 year ago

KeFttan commented 1 year ago

When I ran STAN for location prediction on a 4090, one epoch initially took about an hour, and changing batch_size did not shorten it. How long does STAN usually take to train (typically 100 epochs?)

WenMellors commented 1 year ago

Hello, this is normal. For trajectory prediction tasks, the model usually converges within 10-30 epochs (early stopping).

KeFttan commented 1 year ago

OK, thank you very much.

KeFttan commented 1 year ago

One more question: I see that topk is set to 5. If I also want to try top-10 and top-20, do I need to retrain from scratch for each of them?

WenMellors commented 1 year ago

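No. You can add the command-line arguments --train false --exp_id followed by the id of your earlier training run to load the previously trained model. To find that id, check which exp_id the model saved under the cache folder belongs to. The implementation details are at: https://github.com/LibCity/Bigscity-LibCity/blob/master/libcity/pipeline/pipeline.py#L51-L61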

Also, topk can now be set to a list such as [5, 10, 20] to evaluate several cutoffs in one run.
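A rough sketch of the evaluate-only run described in this reply. Only --train false, --exp_id, and the topk list come from the reply itself. The run_model.py entry point, the --task, --model, and --dataset flags, the dataset name, and the experiment id 12345 are assumptions for illustration, and how the topk setting reaches the evaluator (config file or otherwise) is left out.

```python
import json
import shlex


def build_eval_command(task, model, dataset, exp_id):
    """Build an evaluate-only command that reuses a cached model."""
    return [
        "python", "run_model.py",   # assumed entry point
        "--task", task,             # assumed flag
        "--model", model,           # assumed flag
        "--dataset", dataset,       # assumed flag
        "--train", "false",         # from the reply: skip training
        "--exp_id", str(exp_id),    # from the reply: id of the earlier run
    ]


# Evaluator setting from the reply: several cutoffs in one evaluation.
evaluator_config = {"topk": [5, 10, 20]}

# The dataset name and exp_id are placeholders, not values from this thread.
cmd = build_eval_command("traj_loc_pred", "STAN", "foursquare_tky", 12345)
print(shlex.join(cmd))
print(json.dumps(evaluator_config))
```

The exp_id has to be replaced with the id found under the cache folder for the run whose model should be loaded.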

KeFttan commented 1 year ago

Very helpful, thank you very much.

KeFttan commented 1 year ago

Hello, I set topk to [1,5,10,20] and added Precision to metrics: ["Precision", "Recall", "F1", "MRR", "MAP", "NDCG"]. When running LSTPM for next-location prediction, the results were: [screenshot: bdfbc7446f945b0e1eb408c5d2a5934]

I don't understand why Precision keeps getting smaller, or why MAP and MRR are always equal. Is something misconfigured?

WenMellors commented 1 year ago

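Hello, this is expected. The underlying reason is that in next-hop trajectory prediction (a POI recommendation task), the user goes to only one POI at the next time step. This differs from general recommendation, where a set of items is pushed to the user and the user may actually click several of them. So Precision is generally not reported; Recall and NDCG are looked at more often. For details, see the formulas for the evaluation metrics in the documentation.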
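To make the single-ground-truth argument concrete, here is a minimal sketch using the textbook per-sample definitions of the metrics. It is not LibCity's evaluator code, and the ranked list and POI names are made up. With exactly one relevant POI, Precision@k can be at most 1/k, so it shrinks as k grows, and average precision reduces to the reciprocal rank, which is why MAP equals MRR.

```python
def metrics_at_k(ranked, target, k):
    """Top-k metrics for one sample with exactly one ground-truth item."""
    top = ranked[:k]
    hit = target in top
    rank = top.index(target) + 1 if hit else None
    precision = (1.0 if hit else 0.0) / k   # at most one hit among k slots
    recall = 1.0 if hit else 0.0            # only one relevant item exists
    mrr = 1.0 / rank if hit else 0.0
    # AP = sum of precision at each relevant position / number of relevant items.
    # With a single relevant item at `rank`, precision there is 1 / rank.
    ap = (1.0 / rank) / 1 if hit else 0.0
    return {"Precision": precision, "Recall": recall, "MRR": mrr, "MAP": ap}


# Hypothetical ranking of 20 candidate POIs; the true next POI is ranked 3rd.
ranked = [f"poi_{i}" for i in range(20)]
for k in (1, 5, 10, 20):
    print(k, metrics_at_k(ranked, "poi_2", k))
```

Averaging over samples preserves both effects: MAP and MRR stay identical, and mean Precision@k is mean Recall@k divided by k.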

KeFttan commented 1 year ago

OK, thank you for your reply.

KeFttan commented 1 year ago

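Hello, today I tried running the LSTPM model for next-hop prediction and found that one epoch takes close to an hour. Is Loss: 228.11140 too large? I'm not sure whether that is reasonable.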

WenMellors commented 1 year ago

The time is reasonable. As for the loss, I don't really remember.

KeFttan commented 1 year ago

OK, thanks for the reply.

KeFttan commented 1 year ago

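Hello, one question: if training is interrupted partway and I rerun with exp_id specified, does training resume from where it stopped, or does it have to start over?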

WenMellors commented 1 year ago

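exp_id is meant to be used together with the train parameter: once a previous run has finished, it loads the model trained in that run and evaluates it directly. So in your case training would still start over.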

KeFttan commented 1 year ago

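Hello, while reading the original papers of some models, I noticed that some of them filter out rarely visited POIs or users. For example, LSTPM removes POIs visited fewer than 10 times and users with fewer than 5 trajectories. When you compare these methods together, do you apply the same filtering to all of them?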

WenMellors commented 1 year ago

Yes. We also implemented this filtering logic in the code; it is controlled by parameters such as min_checkins and min_sessions. https://github.com/LibCity/Bigscity-LibCity/blob/f8e5e49a162cb4c5660f655b43d9c0a919a21193/libcity/data/dataset/trajectory_dataset.py#L81-L115

The documentation describes these parameters: https://bigscity-libcity-docs.readthedocs.io/zh_CN/latest/user_guide/data/args_for_data.html#id3
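The kind of filtering discussed here can be sketched as follows. This is not the code behind the link above. It assumes, following the LSTPM example, that min_checkins is the minimum number of visits a POI needs and min_sessions is the minimum number of trajectories a user needs; the input layout and the function name are hypothetical.

```python
from collections import Counter


def filter_checkins(sessions_by_user, min_checkins, min_sessions):
    """sessions_by_user: {user: [[poi, poi, ...], ...]} (hypothetical layout)."""
    # Count how often each POI is visited across all users.
    visits = Counter(
        poi
        for sessions in sessions_by_user.values()
        for session in sessions
        for poi in session
    )
    kept = {}
    for user, sessions in sessions_by_user.items():
        # Drop rarely visited POIs, then drop sessions that became empty.
        cleaned = [
            [poi for poi in session if visits[poi] >= min_checkins]
            for session in sessions
        ]
        cleaned = [session for session in cleaned if session]
        # Keep only users who still have enough sessions.
        if len(cleaned) >= min_sessions:
            kept[user] = cleaned
    return kept


# Toy data: POIs "c" and "d" are visited once; user "u2" has one session.
data = {
    "u1": [["a", "b"], ["a", "c"], ["a"]],
    "u2": [["b", "d"]],
}
filtered = filter_checkins(data, min_checkins=2, min_sessions=2)
print(filtered)
```

Applying one such filter before every model is trained is what keeps the comparison between methods on equal footing.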

KeFttan commented 1 year ago

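Hello, thanks for the earlier replies. I have a new question. During reproduction the hyperparameters are fixed, and the default data split is 7-2-1. Doesn't that make the validation set meaningless? It amounts to training on 70% and testing on 10%, while the 20% validation set meant for tuning has no effect.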

WenMellors commented 1 year ago

But users can adjust the hyperparameters themselves.

KeFttan commented 1 year ago

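Right. So if I run with the default split (7 1 2), will it train on the 7, test on the 2, and output the metrics directly, leaving the 1 for validation unused? (Is the 10% validation set only used once hyper_tune is set to true?)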

WenMellors commented 1 year ago

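No. The validation set is used to pick the best epoch under the given experiment settings. The test set measures the model's performance at that best epoch.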

WenMellors commented 1 year ago

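In each epoch, the model is first trained on the training set and then evaluated on the validation set. The scheduler adjusts things like the optimizer's learning rate based on validation performance, and finally the epoch that performs best on the validation set is saved as the model and evaluated on the test set.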
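The selection step described here can be sketched with made-up numbers. This is a simplified illustration, not LibCity's executor: the validation losses are hypothetical, the patience-based stopping rule is an assumption standing in for the early stopping mentioned earlier in the thread, and learning-rate scheduling is omitted.

```python
def select_best_epoch(val_losses, patience):
    """Return (best_epoch, epochs_run) under early stopping on validation loss."""
    best_epoch, best_loss, waited = None, float("inf"), 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses):
        epochs_run = epoch + 1
        if loss < best_loss:
            # New best on the validation set: this is where a checkpoint is saved.
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs: stop early
    return best_epoch, epochs_run


# Hypothetical validation loss per epoch (epochs are 0-indexed).
val_losses = [2.0, 1.5, 1.2, 1.3, 1.25, 1.4, 1.1]
best, ran = select_best_epoch(val_losses, patience=3)
print(best, ran)
```

Here training stops after 6 epochs and the checkpoint from epoch 2 is the one evaluated on the test set, so the validation split is used even when no hyperparameters are tuned.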

KeFttan commented 1 year ago

So that's what it means! Thank you for your reply!