marius-team / marius

Large scale graph learning on a single machine.
https://marius-project.org
Apache License 2.0

README example not working #23

Closed cthoyt closed 3 years ago

cthoyt commented 3 years ago

Describe the bug

Traceback (most recent call last):
  File "/Users/cthoyt/dev/marius/test.py", line 20, in <module>
    fb15k_example()
  File "/Users/cthoyt/dev/marius/test.py", line 8, in fb15k_example
    train_set, eval_set = m.initializeDatasets(config)
RuntimeError: filesystem error: in copy_file: No such file or directory [training_data/marius/edges/train/edges.bin] [output_dir/train_edges.pt]

To Reproduce

I took the example from the README verbatim, apart from fixing the config path:

import marius as m

def fb15k_example():
    config_path = "/Users/cthoyt/dev/marius/examples/training/configs/kinships_cpu.ini"
    config = m.parseConfig(config_path)

    train_set, eval_set = m.initializeDatasets(config)

    model = m.initializeModel(config.model.encoder_model, config.model.decoder_model)

    trainer = m.SynchronousTrainer(train_set, model)
    evaluator = m.SynchronousEvaluator(eval_set, model)

    trainer.train(1)
    evaluator.evaluate(True)

if __name__ == "__main__":
    fb15k_example()


Environment: macOS 11.2.3 (Big Sur), Python 3.9.2, Marius installed with pip from the latest source.

cthoyt commented 3 years ago

I think the issue is that it's making tons of assumptions about file structure

JasonMoho commented 3 years ago

That example in the README requires preprocessing the dataset first. The README isn't very clear about that, so I'll update it.

Try running python3 marius/tools/preprocess.py kinships output_dir/ and then run the example again.

cthoyt commented 3 years ago

This seems like the kind of thing that motivates a higher-level interface on top of the C code, one that can also mix in Python-only tools, such as a check that the preprocessing script ran properly.
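
A minimal sketch of such a Python-side check follows. It is not part of Marius: the helper names `missing_preprocessed_files` and `require_preprocessed` are hypothetical, and the only path known from this thread is the one in the traceback (training_data/marius/edges/train/edges.bin). Which other files a given config needs is an assumption the caller would have to fill in.

```python
from pathlib import Path


def missing_preprocessed_files(base_dir, relative_paths):
    """Return the expected files that do not exist under base_dir."""
    base = Path(base_dir)
    return [str(base / rel) for rel in relative_paths if not (base / rel).is_file()]


def require_preprocessed(base_dir, relative_paths):
    """Raise a readable error if any expected preprocessed file is absent.

    Hypothetical helper: the list of required files is supplied by the
    caller, not read from a Marius config.
    """
    missing = missing_preprocessed_files(base_dir, relative_paths)
    if missing:
        raise FileNotFoundError(
            "Preprocessed data not found; run the preprocessing script first. "
            "Missing: " + ", ".join(missing)
        )
```

Calling `require_preprocessed(".", ["training_data/marius/edges/train/edges.bin"])` before `m.initializeDatasets(config)` would turn the low-level copy_file error into a message that points at the missing preprocessing step.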

cthoyt commented 3 years ago

New error:

/Users/cthoyt/.virtualenvs/marius/bin/python /Users/cthoyt/dev/marius/test.py
[2021-04-16 01:31:00.402] [info] [io_wrap.cpp:36] Training set initialized
[2021-04-16 01:31:00.402] [info] [io_wrap.cpp:40] Evaluation set initialized
[2021-04-16 01:31:00.403] [info] [trainer.cpp:66] ################ Starting training epoch 1 ################

Process finished with exit code 136 (interrupted by signal 8: SIGFPE)

JasonMoho commented 3 years ago

Try with fb15k. I think the issue is that the configuration file for kinships sets a batch size of 1000 edges, which is larger than the dataset itself (only 100 edges).
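
The thread does not confirm where the SIGFPE comes from. One plausible cause, assumed here, is an integer division or modulo by a batch count that rounds down to zero when the batch size exceeds the number of edges. The sketch below uses the numbers quoted above; `full_batches` and `effective_batch_size` are hypothetical helpers, not Marius functions.

```python
def full_batches(num_edges, batch_size):
    """Number of complete batches; this is 0 when batch_size > num_edges."""
    return num_edges // batch_size


def effective_batch_size(num_edges, batch_size):
    """Clamp the configured batch size so it never exceeds the dataset size."""
    if num_edges <= 0:
        raise ValueError("dataset has no edges")
    if batch_size <= 0:
        raise ValueError("batch size must be positive")
    return min(batch_size, num_edges)


# Numbers from the comment above: 100 edges, configured batch size 1000.
print(full_batches(100, 1000))          # 0 complete batches
print(effective_batch_size(100, 1000))  # 100
```

Any later division by that zero batch count would fail; in Python this raises ZeroDivisionError, while in compiled code an integer division by zero typically terminates the process with SIGFPE. Clamping the batch size, or lowering it in the config, avoids the zero.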

Thanks for sniffing all these issues out, this is great 😄