dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

Broken KE example #1302

Closed: ka1319 closed this issue 4 years ago

ka1319 commented 4 years ago

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. cd apps/kg
  2. DGLBACKEND=pytorch python train.py --model TransE_l1 --dataset FB15k --batch_size 256 --neg_sample_size 64 --batch_size_eval 16 --eval_interval 100 --max_step 100
Logs are being recorded at: ckpts/TransE_l1_FB15k_24/train.log
|Train|: 483142
Total data loading time 1.098 seconds
Traceback (most recent call last):
  File "train.py", line 405, in <module>
    run(args, logger)
  File "train.py", line 360, in run
    train(args, model, train_sampler, valid_samplers, rel_parts=rel_parts)
  File "/Users/kahrabian/projects/dgl/apps/kg/train_pytorch.py", line 125, in train
    loss, log = model.forward(pos_g, neg_g, gpu_id)
  File "/Users/kahrabian/projects/dgl/apps/kg/models/general_models.py", line 360, in forward
    neg_deg_sample=self.args.neg_deg_sample)
  File "/Users/kahrabian/projects/dgl/apps/kg/models/general_models.py", line 269, in predict_neg_score
    num_chunks, chunk_size, neg_sample_size)
  File "/Users/kahrabian/projects/dgl/apps/kg/models/pytorch/score_fun.py", line 81, in fn
    tails = tails.reshape(num_chunks, neg_sample_size, hidden_dim)
RuntimeError: shape '[4, 64, 256]' is invalid for input of size 65024
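A quick check on the numbers in that last line, as a sketch that uses only the values printed in the traceback (the helper name `rows_short` is made up for illustration and is not part of DGL). The reshape needs 4 × 64 × 256 = 65536 elements, but the tensor holds 65024, which is 254 rows of width 256. So the tensor is two rows short of the 4 × 64 = 256 rows the reshape expects.

```python
def rows_short(target_shape, actual_size):
    """Return (expected_rows, actual_rows) for a reshape to target_shape.

    The last dimension is treated as the row width (hidden_dim in the
    traceback); all leading dimensions are multiplied into a row count.
    """
    *leading, width = target_shape
    expected_rows = 1
    for dim in leading:
        expected_rows *= dim
    if actual_size % width != 0:
        raise ValueError("actual_size is not a whole number of rows")
    return expected_rows, actual_size // width


# Values copied from the RuntimeError message above.
expected, actual = rows_short((4, 64, 256), 65024)
print(expected, actual)  # 256 254
```

Why the tensor comes back with 254 rows is not something the traceback shows; the arithmetic only narrows down where to look.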

Expected behavior

Run the example!

Environment

zheng-da commented 4 years ago

Thanks for reporting the problem. The package is under development and the code is being cleaned up. Once we finish cleaning up, we'll let you know.

classicsong commented 4 years ago

DGLBACKEND=pytorch python train.py --model TransE_l1 --dataset FB15k --batch_size 256 --neg_sample_size 64 --batch_size_eval 16 --eval_interval 100 --max_step 100

This worked well on my machine with the following configuration: Ubuntu 18.04; DGL installed with sudo pip3 install --pre dgl-cu101; DGL-KGE from git clone.

How did you install DGL on your Mac? Can you try building from source? The current DGL-KGE package is not compatible with dgl-0.4.2.
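To confirm which DGL build Python actually picks up, something like the following may help. It is a minimal sketch using only the standard library; `describe_install` is a hypothetical helper, not part of DGL. Note that the pip distribution name can differ from the import name (for example dgl-cu101 versus dgl), so the sketch takes both.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name, dist_name=None):
    """Report where a module would be imported from and its installed version.

    module_name: the name used in `import ...` (e.g. "dgl").
    dist_name:   the pip distribution name, if different (e.g. "dgl-cu101").
    Returns None if the module cannot be found at all.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        # No pip metadata under that name (e.g. a stdlib module,
        # or a different distribution name).
        version = None
    return {"location": spec.origin, "version": version}


# Prints None if dgl is not installed in this environment.
print(describe_install("dgl"))
```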

ka1319 commented 4 years ago

It's installed with pip install --pre dgl. I'll try building it from source.

classicsong commented 4 years ago

Hi, did you successfully set up the KG training?

ka1319 commented 4 years ago

I got the following error:

-- Start configuring project dgl
CMake Error at CMakeLists.txt:128 (add_subdirectory):
  The source directory

    /Users/kahrabian/projects/dgl/third_party/dmlc-core

  does not contain a CMakeLists.txt file.

CMake Error at CMakeLists.txt:135 (include):
  include could not find load file:

    third_party/METIS/GKlib/GKlibSystem.cmake

CMake Error at CMakeLists.txt:137 (add_subdirectory):
  add_subdirectory given source "third_party/METIS/libmetis/" which is not an
  existing directory.

-- Configuring incomplete, errors occurred!
See also "/Users/kahrabian/projects/dgl/build/CMakeFiles/CMakeOutput.log".

classicsong commented 4 years ago

You should do:

git submodule init
git submodule update --recursive

before building from source.

Also, if you are building on macOS, please follow https://docs.dgl.ai/en/latest/install/index.html#macos
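Before rerunning CMake, a quick way to check that the submodules were actually fetched is to look for the paths named in the CMake errors. This is only a sketch: the path list is copied from the error output in this thread, and `missing_submodule_paths` is a hypothetical helper, not something shipped with DGL.

```python
from pathlib import Path

# Paths taken from the CMake errors earlier in this thread.
REQUIRED = (
    "third_party/dmlc-core/CMakeLists.txt",
    "third_party/METIS/GKlib/GKlibSystem.cmake",
    "third_party/METIS/libmetis",
)


def missing_submodule_paths(repo_root, required=REQUIRED):
    """Return the entries of `required` that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).exists()]


# Run from the top of the dgl checkout; an empty list means every
# path the CMake errors complained about is now present.
print(missing_submodule_paths("."))
```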

ka1319 commented 4 years ago

I'm getting the following error now:

Logs are being recorded at: ckpts/DistMult_FB15k_6/train.log
File not found. Downloading from https://data.dgl.ai/dataset/FB15k.zip
Download finished. Unzipping the file...
Unzip finished.
|Train|: 483142
|valid|: 50000
|test|: 59071
Total initialize time 6.449 seconds
Traceback (most recent call last):
  File "train.py", line 380, in <module>
    run(args, logger)
  File "train.py", line 315, in run
    train(args, model, train_sampler, valid_samplers, rel_parts=rel_parts)
  File "/Users/kahrabian/projects/dgl/apps/kg/train_pytorch.py", line 117, in train
    pos_g, neg_g = next(train_sampler)
  File "/Users/kahrabian/projects/dgl/apps/kg/dataloader/sampler.py", line 680, in __next__
    pos_g, neg_g = next(self.iterator_tail)
  File "/Users/kahrabian/projects/dgl/apps/kg/dataloader/sampler.py", line 689, in one_shot_iterator
    is_chunked, neg_head, num_nodes)
  File "/Users/kahrabian/projects/dgl/apps/kg/dataloader/sampler.py", line 453, in create_neg_subgraph
    neg_sample_size, neg_head)
  File "/Users/kahrabian/projects/dgl/apps/kg/dataloader/sampler.py", line 385, in __init__
    parent=subg._parent)
TypeError: __init__() got an unexpected keyword argument 'parent'

classicsong commented 4 years ago

Did you run the following after building?

cd python
sudo python3 setup.py install

Soontosh commented 3 months ago

If none of the above works for you, try this: https://github.com/onnx/onnx-tensorrt/issues/354#issuecomment-572279735