awslabs / dgl-ke

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
https://dglke.dgl.ai/doc/
Apache License 2.0

dglke_predict | #209

Closed ccvalley closed 3 years ago

ccvalley commented 3 years ago

I noticed someone else mentioned this in another issue (https://github.com/awslabs/dgl-ke/issues/178), but I haven't seen it resolved.

I'm running dglke_predict on trained embeddings and having issues making predictions on h_r_t format, specifically with the relations list file. The h and t files work in dglke_emb_sim as L and R input files.

I get the same error if I run dglke_predict with `--format 'h_*_t'` and drop the rel.list input file.

!DGLBACKEND=pytorch dglke_predict \
  --model_path /ckpts/TransE_l2_KGE_test_0/ \
  --format 'h_r_t' \
  --data_files /data/head.list \
                       /data/rel.list \
                       /data/tail.list \
  --exec_mode batch_head \
  --score_func logsigmoid \
  --topK 5 \
  --output predict.tsv

Is giving this error:

Traceback (most recent call last):
  File "/python3/bin/dglke_predict", line 8, in <module>
    sys.exit(main())
  File "/python3/lib/python3.7/site-packages/dglke/infer_score.py", line 216, in main
    result = model.topK(head, rel, tail, args.exec_mode, args.topK)
  File "/python3/lib/python3.7/site-packages/dglke/models/infer.py", line 173, in topK
    F.asnumpy(rel[rel_idx]),
IndexError: tensors used as indices must be long, byte or bool tensors

Thank you.

classicsong commented 3 years ago

Did you run it like this?

!DGLBACKEND=pytorch dglke_predict \
  --model_path /ckpts/TransE_l2_KGE_test_0/ \
  --format 'h_*_t' \
  --data_files /data/head.list \
                       /data/tail.list \
  --exec_mode batch_head \
  --score_func logsigmoid \
  --topK 5 \
  --output predict.tsv

ccvalley commented 3 years ago

I've tried running it both with and without the relation.list input and receive the same error.

Both of these runs give the same error listed above:

!DGLBACKEND=pytorch dglke_predict \
  --model_path /ckpts/TransE_l2_KGE_test_0/ \
  --format 'h_r_t' \
  --data_files /data/head.list  /data/rel.list  /data/tail.list \
  --exec_mode batch_head \
  --score_func logsigmoid \
  --topK 5 \
  --output predict.tsv

!DGLBACKEND=pytorch dglke_predict \
  --model_path /ckpts/TransE_l2_KGE_test_0/ \
  --format 'h_*_t' \
  --data_files /data/head.list  /data/tail.list \
  --exec_mode batch_head \
  --score_func logsigmoid \
  --topK 5 \
  --output predict.tsv

ccvalley commented 3 years ago

If it helps, here are the versions for DGL, DGLKE, and numpy (which seems to be throwing the error at line 173 in the topK function):

dgl: '0.4.3post2' (we've also tried with 0.4.3 and 0.5 and receive the same error)
dglke: '0.1.2'
numpy: '1.18.1'

Please let me know if you need any additional information. Thanks!

classicsong commented 3 years ago

Which PyTorch version are you using?

classicsong commented 3 years ago

Can you try installing dglke from source? This commit https://github.com/awslabs/dgl-ke/commit/e770b4ee96e51a3919bc4196791192dc625ab06b fixed a bug related to th.floor_divide. I am not sure whether it is included in kge-0.1.2.

ccvalley commented 3 years ago

Hi @classicsong - thanks for the help. Installing from source ('0.1.0.dev') seems to have fixed the issue we're seeing when installing the latest version (0.1.2).

The issue in v0.1.2 seems to be in this line within the topK function in infer.py:

idx = idx / num_tail

This was corrected to use th.floor_divide in v0.1.0.dev:

idx = floor_divide(idx, num_tail)
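
For illustration, the same failure mode can be reproduced in plain Python, without PyTorch. This is a minimal sketch and not the dgl-ke code: `decode_flat_index`, `rels`, `num_tail`, and `flat` are hypothetical names, and it assumes `idx` is a flat position over a relation-by-tail score grid, as the `idx / num_tail` line suggests. True division produces a float that cannot be used as an index, while floor division keeps an integer.

```python
def decode_flat_index(idx, num_rel, num_tail):
    """Split a flat index over a (num_rel x num_tail) grid into (rel_idx, tail_idx)."""
    tail_idx = idx % num_tail
    # Floor division keeps the result an integer, so it remains a valid index.
    rel_idx = (idx // num_tail) % num_rel
    return rel_idx, tail_idx


rels = ["r0", "r1", "r2"]  # hypothetical relation list
num_tail = 4               # hypothetical number of tail candidates
flat = 9                   # hypothetical flat position of one top-scoring candidate

# True division yields a float, which cannot be used as an index.
broken = flat / num_tail
try:
    rels[broken]
    failed = False
except TypeError:
    failed = True

# Floor division recovers integer positions that index cleanly.
rel_idx, tail_idx = decode_flat_index(flat, len(rels), num_tail)
print(failed, rels[rel_idx], tail_idx)
```

The PyTorch error in the traceback ("tensors used as indices must be long, byte or bool tensors") appears to be the tensor analogue of the `TypeError` caught here.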

Is it possible to push these changes to kge-0.1.2 or a newer version which we can install using PyPI/pip?

Thanks!

classicsong commented 3 years ago

Currently, you can only install from source. We will release 0.1.3 later to include this bug fix.