gnn4dr / DRKG

A knowledge graph and a set of tools for drug repurposing
Apache License 2.0

Failed to run for the dataset DRKG #21

Open chiajungchang opened 3 years ago

chiajungchang commented 3 years ago

Hi,

Thanks for all the work. It looks amazing and I am looking forward to integrating my data with other diseases. It's a pity that I cannot run the code for training DRKG on my machine, which only has CPUs.

The command is "dglke_train --dataset DRKG --data_path ./train --data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 64 --neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 100000 --log_interval 1000 --batch_size_eval 16 -adv --regularization_coef 1.00E-07 --test --num_thread 1 --num_proc 8 --neg_sample_size_eval 10000"

and the output is:

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293713 test triples.
|Train|: 5286834
random partition 5286834 edges into 8 parts
part 0 has 660855 edges
part 1 has 660855 edges
part 2 has 660855 edges
part 3 has 660855 edges
part 4 has 660855 edges
part 5 has 660855 edges
part 6 has 660855 edges
part 7 has 660849 edges
/opt/conda/lib/python3.7/site-packages/dgl/base.py:25: UserWarning: multigraph will be deprecated.DGL will treat all graphs as multigraph in the future.
  warnings.warn(msg, warn_type)
|valid|: 293713
|test|: 293713
Bus error (core dumped)

The command works fine for other data. "dglke_train --model_name TransE_l2 --dataset FB15k --batch_size 1000 --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 3000 --log_interval 100 --batch_size_eval 16 --test -adv --regularization_coef 1.00E-09 --num_thread 1 --num_proc 8" worked successfully.

Thanks again for sharing.

classicsong commented 3 years ago

Can you provide the following information: DGL-KE version, DGL version, PyTorch version, and whether you are running on CPU or GPU (it seems to be CPU)?

chiajungchang commented 3 years ago

Hi DRKG team,

Thanks for your quick response.

The versions are

dgl 0.4.3
dglke 0.1.1
torch 1.6.0
CPU

Best, Chia-Jung


classicsong commented 3 years ago

Can you try --num_proc 1? How much memory does your CPU machine have?

chiajungchang commented 3 years ago

Thanks a lot! It works. The machine has 64 GB of memory.


classicsong commented 3 years ago

Maybe it is due to an out-of-memory (OOM) problem. You can increase --num_thread accordingly, and also try --num_proc 2 or 4.

chiajungchang commented 3 years ago

Thanks. Only --num_proc 1 works and it only takes 2% of memory. There might be other causes but it is good enough for me now. Thanks again.
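Since main memory is clearly not exhausted, one other cause worth ruling out is a small shared-memory mount. This is an assumption and is not confirmed anywhere in this thread: multi-process training typically shares data between workers through shared memory, and "Bus error" is a common symptom when /dev/shm is too small, as it often is by default inside containers (the /opt/conda path in the log hints at one). A minimal sketch for checking the mount, where the helper name `shm_report` is hypothetical:

```python
import os
import shutil


def shm_report(path="/dev/shm"):
    """Return (total_gib, free_gib) for the given mount, or None if it does not exist."""
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.free / gib


report = shm_report()
if report is None:
    print("no /dev/shm on this system")
else:
    total_gib, free_gib = report
    print(f"/dev/shm total={total_gib:.2f} GiB free={free_gib:.2f} GiB")
```

If the reported total is only a few tens of megabytes, enlarging the shared-memory mount and retrying with --num_proc 2 would show whether this is the cause.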


yxu1168 commented 1 year ago

Hello @classicsong,

I tried to run the Train_embeddings notebook in Anaconda Jupyter on CPU. The command is:

!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 512 \
--neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 100000 --log_interval 1000 --batch_size_eval 16 -adv --regularization_coef 1.00E-07 --test --num_thread 1 --num_proc 1 --neg_sample_size_eval 10000

I got the error below:

'DGLBACKEND' is not recognized as an internal or external command, operable program or batch file.

If I remove !DGLBACKEND=pytorch, I get another error:

File "", line 1
dglke_train --dataset DRKG --data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 512 \
^
SyntaxError: invalid syntax

Any advice/idea to fix the issue? Thank you very much!
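Both errors look like shell-syntax problems, not DGL-KE problems. The first message is the one cmd.exe prints, which suggests the notebook runs on Windows, where the POSIX `VAR=value command` prefix is not understood. The second arises because removing the leading `!` as well makes Jupyter parse the line as Python. A possible workaround, sketched under the assumption that `dglke_train` is on the PATH of the notebook's environment, is to set the variable from Python and launch the command with `subprocess`. The function names below are hypothetical; the flags are copied from the command above, and whether DGL-KE then trains successfully on Windows is not confirmed in this thread.

```python
import os
import subprocess


def build_dglke_command():
    """Argument list mirroring the flags quoted in the comment above."""
    return [
        "dglke_train",
        "--dataset", "DRKG",
        "--data_files", "drkg_train.tsv", "drkg_valid.tsv", "drkg_test.tsv",
        "--format", "raw_udd_hrt",
        "--model_name", "TransE_l2",
        "--batch_size", "512",
        "--neg_sample_size", "256",
        "--hidden_dim", "400",
        "--gamma", "12.0",
        "--lr", "0.1",
        "--max_step", "100000",
        "--log_interval", "1000",
        "--batch_size_eval", "16",
        "-adv",
        "--regularization_coef", "1.00E-07",
        "--test",
        "--num_thread", "1",
        "--num_proc", "1",
        "--neg_sample_size_eval", "10000",
    ]


def run_training():
    """Run dglke_train with DGLBACKEND set, without relying on shell syntax."""
    env = dict(os.environ, DGLBACKEND="pytorch")  # copy of the environment plus the backend setting
    return subprocess.run(build_dglke_command(), env=env, check=True)
```

Calling `run_training()` in a notebook cell starts the job. Passing the arguments as a list avoids shell quoting and line-continuation differences between cmd.exe and POSIX shells.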