awslabs / dgl-ke

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
https://dglke.dgl.ai/doc/
Apache License 2.0

Stuck at the save model step #175

Closed Reid00 closed 3 years ago

Reid00 commented 3 years ago

My run gets stuck at the save model step. The program should exit at that point, but it hangs instead. [image]

I checked the model save path, and the relevant files have been saved. [image]

classicsong commented 3 years ago

It may be in testing. Testing is costly, especially if your graph is very large. You can set neg_sample_size_eval to a smaller value, 10000 for example.
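To see why testing can dominate the run time, here is a rough back-of-the-envelope sketch. It assumes that evaluation ranks every test triple against corrupted heads and corrupted tails, and that a negative neg_sample_size_eval means "use every entity as a candidate". Both are assumptions about DGL-KE's behaviour, not something confirmed in this thread. The function name and the graph sizes in the usage note are hypothetical.

```python
def eval_score_count(num_test_triples, num_entities, neg_sample_size_eval=-1):
    """Rough count of candidate scores computed during evaluation.

    Assumption: each test triple is ranked twice (corrupted head and
    corrupted tail) against `neg_sample_size_eval` negatives, and a
    negative value means all entities are used as negatives.
    """
    if neg_sample_size_eval < 0:
        negatives = num_entities
    else:
        negatives = neg_sample_size_eval
    return 2 * num_test_triples * negatives
```

For a hypothetical graph with 5,000,000 entities and 100,000 test triples, `eval_score_count(100_000, 5_000_000)` gives 1,000,000,000,000 scores, while `eval_score_count(100_000, 5_000_000, 10_000)` gives 2,000,000,000, which is 500 times fewer. Even the reduced figure can take a while with a small batch_size_eval.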

Reid00 commented 3 years ago

DGLBACKEND=pytorch dglke_train --model_name TransE_l2 --dataset patient --batch_size 1000 --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 24000 --log_interval 100 --batch_size_eval 16 -adv --regularization_coef 1.00E-09 --test --gpu 0 --mix_cpu_gpu --data_path ./data/ --format raw_udd_hrt --data_files train.txt valid.txt test.txt --neg_sample_size_eval 10000

This is my command. I have already set neg_sample_size_eval to 10000.

classicsong commented 3 years ago

You need to check whether CPU usage is very high while it appears to be stuck. If it is, the process is still working (most likely on testing) rather than hanging.
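One way to run that check without extra tools is sketched below. It is a minimal, Linux-only example that samples `/proc/<pid>/stat` twice and reports the CPU usage of one process over the interval. The helper names `cpu_seconds` and `cpu_percent` are made up for this illustration, and the figure covers only the given process (all of its threads), not any child processes it may have spawned.

```python
import os
import time


def cpu_seconds(pid):
    """Total user + system CPU time (in seconds) used so far by `pid`. Linux only."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name (field 2) may contain spaces or parentheses,
    # so split after the last ')' before tokenizing the rest.
    fields = stat.rsplit(")", 1)[1].split()
    # After the split, index 0 is field 3 (state), so utime (field 14)
    # and stime (field 15) sit at indices 11 and 12.
    utime, stime = int(fields[11]), int(fields[12])
    return (utime + stime) / os.sysconf("SC_CLK_TCK")


def cpu_percent(pid, interval=1.0):
    """CPU usage of `pid` over `interval` seconds; 100.0 means one full core."""
    start_cpu = cpu_seconds(pid)
    start_wall = time.monotonic()
    time.sleep(interval)
    elapsed = time.monotonic() - start_wall
    return 100.0 * (cpu_seconds(pid) - start_cpu) / elapsed
```

Calling `cpu_percent(pid)` with the PID of the dglke_train process (found with `pgrep -f dglke_train`, for example) a few times in a row gives a quick picture: values that stay near or above 100 suggest the job is still computing, while values near 0 point to a real hang. Tools such as `top` or `htop` show the same information interactively.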

Reid00 commented 3 years ago

Got it, thank you a million.