Closed zheng-da closed 4 years ago
I'd really appreciate this. For example, on https://aws-dglke.readthedocs.io/en/latest/train_user_data.html it's not super clear what should be in --data_path and --data_files.
For example, --data_path says "to specify the path to the knowledge graph dataset"; however, I presume this means "to specify the path to the folder containing the knowledge graph dataset".
Also, --data_files says "to specify the triplets of a knowledge graph as well as node/relation ID mapping"; however, the order of these files is not immediately clear. For example, I would presume it follows the order of the files listed under udd_[h|r|t]:
DGLBACKEND=pytorch dglke_train \
--data_path results_SXSW_2018 \
--data_files entities.tsv relations.tsv train.tsv valid.tsv test.tsv \
--format udd_hrt \
--model_name ComplEx \
--max_step 12000 --batch_size 1000 --neg_sample_size 200 --batch_size_eval 16 \
--hidden_dim 400 --gamma 19.9 --lr 0.25 --regularization_coef=1e-9 -adv \
--gpu 0 1 --async_update --force_sync_interval 1000 --log_interval 1000 \
--test
^^^ But the order isn't clear. It seems like entities.tsv and relations.tsv should go at the end, since if someone uses the raw_udd_[h|r|t] option this would keep the first three elements consistent: the training, validation, and testing files.
Perhaps there should be --data_tuple_files and --data_mapping_files options?
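To make the suggestion concrete, here is a minimal sketch of how the two proposed options could be declared with Python's argparse. Everything in it is hypothetical: --data_tuple_files and --data_mapping_files do not exist in dglke_train, and the parser name, defaults, and help strings are assumptions made purely for illustration.

```python
import argparse


def build_sketch_parser():
    """Stand-in parser for the proposal; NOT the real dglke_train parser."""
    parser = argparse.ArgumentParser(prog="dglke_train_sketch")
    parser.add_argument(
        "--data_path", type=str, default="data",
        help="Folder that contains the dataset files (assumed meaning).")
    # Hypothetical option: triple files, always in the same order.
    parser.add_argument(
        "--data_tuple_files", type=str, nargs="+", metavar="FILE",
        help="train file, then optional valid file, then optional test file.")
    # Hypothetical option: exactly two ID mapping files.
    parser.add_argument(
        "--data_mapping_files", type=str, nargs=2,
        metavar=("ENTITY_MAP", "RELATION_MAP"), default=None,
        help="Entity and relation ID mapping files, in that order.")
    return parser


def parse_sketch(argv):
    """Parse a list of command-line tokens with the stand-in parser."""
    return build_sketch_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_sketch([
        "--data_path", "results_SXSW_2018",
        "--data_tuple_files", "train.tsv", "valid.tsv", "test.tsv",
        "--data_mapping_files", "entities.tsv", "relations.tsv",
    ])
    print(args.data_tuple_files)
    print(args.data_mapping_files)
```

With this split, the triple files sit in the same positions whether or not mapping files are supplied, which is the consistency the ordering question above is about.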
UPDATE: When I ran the code above, it gave me this output with FB_15k in the checkpoints, which doesn't seem right...
(dglke) amruch@wit:~/graphika/kg$ DGLBACKEND=pytorch dglke_train --data_path results_SXSW_2018 --data_files entities.tsv relations.tsv train.tsv valid.tsv test.tsv--format udd_hrt --model_name ComplEx --max_step 12000 --batch_size 1000 --neg_sample_size 200 --batch_size_eval 16 --hidden_dim 400 --gamma 19.9 --lr 0.25 --regularization_coef=1e-9 -adv --gpu 0 1 --async_update --force_sync_interval 1000 --log_interval 1000 --test
Using backend: pytorch
Logs are being recorded at: ckpts/ComplEx_FB15k_0/train.log
Reading train triples....
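One possible explanation, offered as a guess rather than a confirmed diagnosis: in the command as pasted above there is no space between test.tsv and --format. It is not certain whether that was in the command actually run or crept in when pasting. If it was, an argparse-style parser would treat test.tsv--format as one more file name and never see the --format flag. The sketch below shows that behavior with a stand-in parser; the parser itself and the default values built_in and FB15k are assumptions for illustration, not taken from the dglke_train source.

```python
import argparse


def build_stand_in_parser():
    """Stand-in for a training CLI; NOT the real dglke_train parser."""
    parser = argparse.ArgumentParser(prog="stand_in_train")
    parser.add_argument("--data_files", type=str, nargs="+", default=None)
    # Assumed defaults, chosen only to mirror the log line above.
    parser.add_argument("--format", type=str, default="built_in")
    parser.add_argument("--dataset", type=str, default="FB15k")
    return parser


def parse_tail(command_tail):
    """Split a command string on whitespace and parse it."""
    return build_stand_in_parser().parse_args(command_tail.split())


FILES = "entities.tsv relations.tsv train.tsv valid.tsv test.tsv"

# Missing space: the flag is glued onto the last file name.
glued = parse_tail("--data_files " + FILES + "--format udd_hrt")
# Space present: the flag is recognized.
spaced = parse_tail("--data_files " + FILES + " --format udd_hrt")

if __name__ == "__main__":
    print(glued.format, glued.data_files[-2:])
    print(spaced.format, spaced.data_files[-1])
```

In the glued case the stand-in parser keeps its default format and dataset, and the list of data files ends with two unintended entries. Rerunning with a space before --format would be a quick way to check whether this is what happened.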
Thank you very much for your feedback. We'll prioritize it and provide documentation of the argument options.
If any of the explanations from --help aren't clear, please post them here. We'll improve them. Thanks a lot for your help.
This is great! I didn’t know that was an option. I tried man dglke_train and didn’t see anything, but the output I’m seeing from --help looks great!
We need to clarify our documentation to address all of the questions in this issue: https://github.com/awslabs/dgl-ke/issues/84
The docs for the command-line arguments were updated along with the 0.1.1 release.
Thanks for the heads up!
On Aug 26, 2020, at 10:15 PM, xiang song(charlie.song) notifications@github.com wrote:
The docs for the command-line arguments were updated along with the 0.1.1 release.
We need to explain the arguments of commands.