kuandeng / LightGCN


low speed again after setup #10

Open ucasiggcas opened 4 years ago

ucasiggcas commented 4 years ago

Hi, before I ran the command $ python setup.py build_ext --inplace, each epoch took about 2 minutes on amazon-book, as follows:

Epoch 1 [134.4s]: train==[0.46859=0.46839 + 0.00020]
Epoch 2 [113.2s]: train==[0.25395=0.25336 + 0.00058]
Epoch 3 [113.6s]: train==[0.21310=0.21233 + 0.00076]
Epoch 4 [113.6s]: train==[0.19080=0.18992 + 0.00088]
Epoch 5 [113.1s]: train==[0.17508=0.17410 + 0.00098]
Epoch 6 [113.4s]: train==[0.16173=0.16066 + 0.00108]
Epoch 7 [114.5s]: train==[0.15014=0.14898 + 0.00117]
Epoch 8 [113.6s]: train==[0.13942=0.13816 + 0.00126]
Epoch 9 [112.6s]: train==[0.12980=0.12846 + 0.00135]
Epoch 10 [113.5s]: train==[0.12218=0.12074 + 0.00144]
Epoch 11 [113.2s]: train==[0.11550=0.11397 + 0.00153]
Epoch 12 [113.4s]: train==[0.10952=0.10791 + 0.00161]

But after I run the command, the epoch times are the same:

Epoch 1 [133.9s]: train==[0.46936=0.46916 + 0.00020]
Epoch 2 [113.8s]: train==[0.25707=0.25650 + 0.00058]
Epoch 3 [113.5s]: train==[0.21627=0.21551 + 0.00075]

Package versions:

Cython               0.29.15
scikit-learn         0.22.1             
scipy                1.3.1 
tensorflow           1.15.0 
numpy                1.17.3  

So, is this a version problem?

Thanks.

hexiangnan commented 4 years ago

I don't understand what your question is. Your results look normal.

On September 5, 2020, at 3:04 PM, VideoRecSys notifications@github.com wrote:


ucasiggcas commented 4 years ago

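What I mean is that the speed is the same whether or not I compile; it does not get any faster. [image] It looks like my epoch time is about twice yours. [image]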

ucasiggcas commented 4 years ago

Hi, thank you for replying, Professor He. I am honored.

ucasiggcas commented 4 years ago

May I ask whether a minimum number of epochs is enforced? Training for only 10 epochs fails with the error below:

Epoch 1 [135.3s]: train==[0.44135=0.44111 + 0.00024]
Epoch 2 [117.2s]: train==[0.24107=0.24042 + 0.00064]
Epoch 3 [117.7s]: train==[0.20468=0.20386 + 0.00082]
Epoch 4 [118.1s]: train==[0.18239=0.18144 + 0.00095]
Epoch 5 [116.4s]: train==[0.16441=0.16335 + 0.00107]
Epoch 6 [116.1s]: train==[0.15041=0.14923 + 0.00118]
Epoch 7 [116.6s]: train==[0.13803=0.13674 + 0.00129]
Epoch 8 [116.8s]: train==[0.12793=0.12654 + 0.00139]
Epoch 9 [118.0s]: train==[0.11918=0.11768 + 0.00150]
Epoch 10 [117.0s]: train==[0.11208=0.11048 + 0.00160]
Traceback (most recent call last):
  File "LightGCN.py", line 700, in <module>
    best_rec_0 = max(recs[:, 0])
IndexError: too many indices for array

kuandeng commented 4 years ago

You can check whether the compilation succeeded. If it did, the program prints "eval_score_matrix_foldout with cpp".
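For readers unsure what that message implies, here is a minimal sketch of the usual "compiled extension with pure-Python fallback" pattern. It is an illustration, not the repository's code: the function `load_eval_backend`, the module name `hypothetical_compiled_evaluator`, and the fallback message ending in "with python" are all assumptions.

```python
import importlib


def load_eval_backend(module_name):
    """Try to import a compiled extension; fall back to pure Python if it is missing."""
    try:
        module = importlib.import_module(module_name)
        backend = "cpp"
    except ImportError:
        # The extension was not built (or not found on the import path).
        module = None
        backend = "python"
    print("eval_score_matrix_foldout with " + backend)
    return backend, module


# Hypothetical module name: since it does not exist, the fallback branch runs.
backend, _ = load_eval_backend("hypothetical_compiled_evaluator")
```

If the "cpp" message already appears in your log, the build is being picked up, and any remaining slowness comes from somewhere else.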

kuandeng commented 4 years ago

By default, the results are evaluated once every 20 epochs, so training for fewer than 20 epochs raises the error above.
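A small sketch of why this produces the IndexError in the traceback, assuming the metrics are appended to a list only on evaluated epochs and then converted to a NumPy array. Apart from `recs`, the names (`collect_recalls`, `best_recall`, `eval_every`) are hypothetical, and the logged values are dummies rather than real metrics.

```python
import numpy as np


def collect_recalls(num_epochs, eval_every=20):
    """Simulate the logging: one row of metrics per evaluated epoch."""
    recall_log = []
    for epoch in range(1, num_epochs + 1):
        if epoch % eval_every != 0:
            continue  # no evaluation on this epoch
        recall_log.append([0.01 * epoch])  # dummy value, not a real metric
    return recall_log


def best_recall(recall_log):
    """Return the best first-column recall, or None if nothing was evaluated."""
    recs = np.array(recall_log)
    if recs.ndim != 2:
        return None  # empty log becomes a 1-D array, so recs[:, 0] would fail
    return float(recs[:, 0].max())


# With 10 epochs nothing is logged, and 2-D indexing on the empty array fails.
try:
    np.array(collect_recalls(10))[:, 0]
    failed = False
except IndexError:
    failed = True
```

Training for at least one full evaluation interval, or guarding the final summary as in `best_recall`, avoids the crash.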

ucasiggcas commented 4 years ago

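"You can check whether the compilation succeeded. If it did, the program prints eval_score_matrix_foldout with cpp."

Yes, that line is printed.

"By default, the results are evaluated once every 20 epochs, so training for fewer than 20 epochs raises the error above."

Found it.

Thanks a lot.

Two more questions: 1. Is it correct that tf.summary does not affect the model parameters (the weights and so on)? 2. Are the training users and the test users the same? How can I do incremental training? Thanks.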

ucasiggcas commented 4 years ago

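How should I split the training and test sets? Does the test set have to contain many items? Are the items in it ordered by time? Can the test set consist of a single item? Does that affect the actual recommendation results? (It certainly affects the evaluation metrics on the test set.)

1. Using the last 5 clicked items as the test set gives the following results: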

Epoch 1 [14.0s]: train==[0.26441=0.26357 + 0.00084]
Epoch 2 [10.0s]: train==[0.04577=0.04388 + 0.00189]
Epoch 3 [10.3s]: train==[0.02021=0.01797 + 0.00224]
Epoch 4 [10.2s]: train==[0.01369=0.01119 + 0.00251]
Epoch 5 [10.1s]: train==[0.01100=0.00830 + 0.00269]
users_to_test 1: [    0     1     2 ... 21696 21697 21698] 21699
Epoch 6: train==[0.00966=0.00682 + 0.00284 + 0.00000], recall=[0.65679], precision=[0.13468], ndcg=[0.49974]
users_to_test 2: [    0     1     2 ... 21696 21697 21698] 21699
Epoch 6 [26.3s + 2.5s]: test==[0.55234=0.54956 + 0.00278 + 0.00000], recall=[0.15073], precision=[0.01507], ndcg=[0.07697]
use Time=[88.9], recall=[0.15073], precision=[0.01507], ndcg=[0.07697]

2. Using only the last item as the test set, with the same training parameters, gives the following:

Epoch 1 [18.3s]: train==[0.25948=0.25852 + 0.00096]
Epoch 2 [13.6s]: train==[0.04445=0.04241 + 0.00204]
Epoch 3 [13.6s]: train==[0.02078=0.01810 + 0.00268]
Epoch 4 [13.5s]: train==[0.01467=0.01161 + 0.00306]
Epoch 5 [13.7s]: train==[0.01136=0.00800 + 0.00335]
users_to_test 1: [    0     1     2 ... 21696 21697 21698] 21699
Epoch 6: train==[0.01030=0.00673 + 0.00357 + 0.00000], recall=[0.59860], precision=[0.17298], ndcg=[0.48761]
users_to_test 2: [    0     1     2 ... 21696 21697 21698] 21699
Epoch 6 [33.5s + 2.7s]: test==[0.49819=0.49462 + 0.00357 + 0.00000], recall=[0.17457], precision=[0.00349], ndcg=[0.05626]
use Time=[114.4], recall=[0.17457], precision=[0.00349], ndcg=[0.05626]
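The two experiments above amount to a "leave last k out" split. Here is a minimal sketch of one, assuming each user's interactions are already sorted by time. The names `leave_last_k_split` and `implied_cutoff` are hypothetical and not part of the LightGCN code. The second helper uses the relation precision@K = hits/K and recall@K = hits/n_test, which holds when every test user has exactly n_test held-out items.

```python
def leave_last_k_split(user_items, k):
    """Hold out each user's last k interactions (k >= 1, lists assumed time-ordered)."""
    train, test = {}, {}
    for user, items in user_items.items():
        if len(items) <= k:
            # History too short to hold anything out: keep it all for training.
            train[user] = list(items)
            continue
        train[user] = list(items[:-k])
        test[user] = list(items[-k:])
    return train, test


def implied_cutoff(recall, precision, n_test):
    """Cutoff K implied by precision@K = hits/K and recall@K = hits/n_test."""
    return recall * n_test / precision


# Toy history with hypothetical item ids.
history = {0: [10, 11, 12, 13], 1: [20]}
train, test = leave_last_k_split(history, 1)

# Test-set metrics reported in the two runs above.
k_run1 = implied_cutoff(0.15073, 0.01507, 5)
k_run2 = implied_cutoff(0.17457, 0.00349, 1)
```

Under that assumption, both runs are consistent with a cutoff of about K = 50. This also explains the drop in precision: with a single held-out item, precision@K can never exceed 1/K, so the two precision values are not directly comparable.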

ucasiggcas commented 4 years ago

How can I get the list of recommended items for each user?
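One generic way to do this, sketched under the assumption that the final user and item embeddings have been exported as NumPy arrays. Scores are inner products, items seen in training are masked out, and the top k remaining items are returned. The function `recommend_top_k` and its arguments are hypothetical names, not an API of this repository.

```python
import numpy as np


def recommend_top_k(user_emb, item_emb, train_items, k):
    """Return a (num_users, k) array of item ids, best first (requires k <= num_items)."""
    scores = user_emb @ item_emb.T  # inner-product score for every user-item pair
    for user, items in train_items.items():
        scores[user, list(items)] = -np.inf  # never recommend already-seen items
    # Take the k best items per user, then sort just those k by score.
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    order = np.argsort(-np.take_along_axis(scores, top, axis=1), axis=1)
    return np.take_along_axis(top, order, axis=1)


# Toy embeddings: 2 users, 3 items, 2 dimensions (made-up values).
user_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
item_emb = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
top_k = recommend_top_k(user_emb, item_emb, {0: [0]}, k=2)
```

For large catalogues, scoring users in batches keeps the dense score matrix from exhausting memory.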

hexiangnan commented 4 years ago

You can check the NeuRec implementation, which is much faster: https://github.com/wubinzzu/NeuRec

On Fri, Oct 9, 2020 at 3:09 PM VideoRec notifications@github.com wrote:

How can I get the list of recommended items for each user?


-- Xiangnan HE (Dr) :: Professor :: Vice Dean, School of Data Science :: University of Science and Technology of China (USTC) :: 443 Huangshan Road, Hefei, China 230027 :: (0551) 63607236 (DID) :: hexn@ustc.edu.cn (E) :: http://staff.ustc.edu.cn/~hexn/ (W)

RS-Bugmaker commented 3 years ago

Hello, when I run the LightGCN code, training always runs on the CPU and the GPU is never used. What is the reason?