blindsubmission1 / PEAGNN

Official code for the paper "Leveraging the Metapath and Entity Aware Subgraphs for Recommendation"
GNU General Public License v3.0
40 stars 11 forks

Weird issue about the rerun #2

Open Chuan1997 opened 3 years ago

Chuan1997 commented 3 years ago

I ran this command on the latest-small MovieLens dataset twice:

```
python peagcn_solver_bpr.py --dataset=Movielens --dataset_name=latest-small --num_core=10 --num_feat_core=10 --sampling_strategy=unseen --entity_aware=false --dropout=0 --emb_dim=64 --repr_dim=16 --hidden_size=64 --meta_path_steps=2,2,2,2,2,2,2,2,2 --entity_aware_coff=0.1 --init_eval=true --gpu_idx=0 --runs=5 --epochs=30 --batch_size=1024 --save_every_epoch=5 --metapathtest=false
```

The first run was normal, but the second time I only got this:

```
Dataset loaded!
Overall HR@5: 0.3803, HR@10: 0.5362, HR@15: 0.6329, HR@20: 0.7010, NDCG@5: 0.2503, NDCG@10: 0.2999, NDCG@15: 0.3246, NDCG@20: 0.3399, AUC: 0.8242, train loss: 259.2449, eval loss: 43.4085
```

and training did not actually run.

jaiabhayk commented 3 years ago

> I ran the command on latest small movielens dataset twice […] the first time is normal, but the second time […] It actually didn't running.

How long does it take to run on the full or the small dataset?

ecom-research commented 3 years ago

I think the first time you run the code, it builds the dataset as a graph and saves it to disk. When you run it the second time, it loads the dataset from disk and starts training. Training on ML latest-small is quite fast.
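To illustrate the load-or-build behavior described above, here is a minimal, hypothetical sketch of the caching pattern (function names and the pickle-based storage are assumptions for illustration, not the repository's actual API):

```python
import os
import pickle

def load_or_build_dataset(processed_path, build_fn):
    """Load a previously processed dataset from disk, or build and cache it.

    NOTE: `processed_path` and `build_fn` are illustrative names, not the
    repository's actual API.
    """
    if os.path.exists(processed_path):
        # Second run: skip the expensive preprocessing entirely.
        with open(processed_path, "rb") as f:
            return pickle.load(f)
    # First run: build the graph dataset, then cache it for later runs.
    dataset = build_fn()
    os.makedirs(os.path.dirname(processed_path), exist_ok=True)
    with open(processed_path, "wb") as f:
        pickle.dump(dataset, f)
    return dataset
```

Under this pattern, only the first run pays the preprocessing cost; subsequent runs deserialize the cached graph and go straight to training.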

For bigger datasets it takes relatively more time.

ecom-research commented 3 years ago

You can turn off the saving/loading of the dataset/graph model. There is an option in the code.
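Such a toggle is typically exposed as a command-line flag. A minimal sketch of what that might look like (the flag name `--use_dataset_cache` is hypothetical; check the solver's actual argument list for the real option):

```python
import argparse

# Hypothetical flag; the real option name in the repo may differ.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--use_dataset_cache",
    type=lambda s: s.lower() == "true",  # accept true/false strings, as the solver does
    default=True,
    help="If false, rebuild the graph dataset instead of loading the cached copy.",
)

# Example invocation: disable the cache so the dataset is rebuilt from scratch.
args = parser.parse_args(["--use_dataset_cache=false"])
```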

Chuan1997 commented 3 years ago

Thanks for the guidance. I have another question: when I tried to run a baseline model such as KGAT (after running PEAGNN), the download and preprocessing repeated:

```
No loggers found at 'D:\Study\reproduction\PEAGNN\experiments\checkpoint\loggers\Movielenslatest-small\KGAT\BPR{'model_ty\globallogger.pkl'
Downloading http://files.grouplens.org/datasets/movielens/ml-latest-small.zip
Extracting D:\Study\reproduction\PEAGNN\experiments\checkpoint\data\Movielenslatest-small\raw\ml-latest-small.zip
Processing...
Data frame not found in D:\Study\reproduction\PEAGNN\experiments\checkpoint\data\Movielenslatest-small\processed!
Read from raw data and preprocessing from D:\Study\reproduction\PEAGNN\experiments\checkpoint\data\Movielenslatest-small\raw\ml-latest-small!
Raw files not found! Reading directories and actors from api!
Get item resources: 7%|█████▌ | 702/9742 [08:44<1:29:49, 1.68it/s]
```

The process is extremely slow. How can I avoid it and speed up the training process?
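One common workaround for the repeated download step is to reuse an already-downloaded archive: loaders that follow the "download only if missing" convention will skip the network fetch when the raw file is already in place. A hedged sketch of that check (function name and directory layout are illustrative, not the repo's API; the slow "Get item resources" metadata fetch from the external API would need to be cached similarly):

```python
import os
import urllib.request

def ensure_raw_file(raw_dir, url):
    """Download the raw zip only if it is not already present in `raw_dir`.

    NOTE: illustrative helper, not the repository's actual API. Copying a
    previously downloaded ml-latest-small.zip into each baseline's raw
    directory should let a loader like this skip the download entirely.
    """
    filename = os.path.basename(url)
    path = os.path.join(raw_dir, filename)
    if os.path.exists(path):
        return path  # already downloaded; reuse the local copy
    os.makedirs(raw_dir, exist_ok=True)
    urllib.request.urlretrieve(url, path)
    return path
```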