pykeen / benchmarking

📊 Results from the reproducibility and benchmarking studies presented in "Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework" (http://arxiv.org/abs/2006.13365)
MIT License

Using custom pykeen dataset for ablation study #29

Closed: serenalotreck closed this issue 1 year ago

serenalotreck commented 1 year ago

Hi All,

Great work on this paper and thanks for such a thorough repository!

I'd like to run the same hyperparameter tuning for a model trained on my own dataset. I added the dataset to my local PyKEEN installation by following the Extending the Datasets instructions in the PyKEEN docs, and I can import it without problems. The dataset is named pickle.

I was hoping that I could then just use the command

python ablation/search.py --dataset pickle

to run the hyperparameter search for my dataset, but it runs quickly (~30 seconds) and silently (no errors or print statements) without producing any output.

I looked through the code to see whether running it on a new dataset requires some other adaptation, but I didn't see anything that looked like it needed to change.

What am I missing here?

mberr commented 1 year ago

Hi @serenalotreck,

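This repository contains the experimental results in ./ablation. The ablation/search.py script only searches through the JSON files containing those results; it does not run any experiments itself, which is why it finishes quickly and produces nothing for a dataset that has no stored results.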
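To illustrate why a search over stored results can finish without any output, here is a minimal sketch of that kind of lookup. It is not the repository's ablation/search.py: the function names and the "dataset" key are hypothetical, and the real JSON layout may differ. The point is only that a filter over existing files yields nothing for a dataset that was never part of the study, unless the script reports the empty case explicitly.

```python
import json
from pathlib import Path
from typing import Iterator, Tuple


def find_results(root: Path, dataset: str) -> Iterator[Tuple[Path, dict]]:
    """Yield (path, record) for every JSON file under root whose dataset matches."""
    for path in sorted(root.rglob("*.json")):
        record = json.loads(path.read_text())
        # "dataset" is a hypothetical key, used here for illustration only.
        if not isinstance(record, dict):
            continue
        if str(record.get("dataset", "")).lower() == dataset.lower():
            yield path, record


def report(root: Path, dataset: str) -> int:
    """Print matching files, or an explicit message when nothing matches."""
    matches = list(find_results(root, dataset))
    if not matches:
        # Without this branch, an unknown dataset would produce no output at all.
        print(f"No stored results for dataset {dataset!r} under {root}")
    for path, _ in matches:
        print(path)
    return len(matches)
```

Under these assumptions, calling report(Path("ablation"), "pickle") would print the "No stored results" message, since no result files exist for a custom dataset.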

To create such results with PyKEEN, take a look at the Running HPO tutorial.

cthoyt commented 1 year ago

I don't recall where we wrote it before, but the ablation study code isn't meant to be an extensible feature. As Max said, anyone can roll their own with the HPO pipeline.
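For readers landing here, a minimal sketch of rolling your own search with PyKEEN's hpo_pipeline might look like the following. It assumes the custom dataset is registered so that the name "pickle" resolves inside PyKEEN, as described in the original question. The helper names, the TransE model, the trial count, and the output directory are illustrative choices, not the settings used in the paper.

```python
from typing import Any, Dict


def build_hpo_kwargs(dataset: str, model: str = "TransE", n_trials: int = 30) -> Dict[str, Any]:
    """Collect keyword arguments for pykeen.hpo.hpo_pipeline (hypothetical helper)."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    return {"dataset": dataset, "model": model, "n_trials": n_trials}


def run_hpo(output_dir: str, **kwargs: Any) -> None:
    """Run the search and write the study results to output_dir."""
    # Imported lazily so build_hpo_kwargs can be used without PyKEEN installed.
    from pykeen.hpo import hpo_pipeline

    result = hpo_pipeline(**kwargs)
    result.save_to_directory(output_dir)


# Example call (requires PyKEEN and a registered dataset named "pickle"):
# run_hpo("pickle_transe_hpo", **build_hpo_kwargs("pickle"))
```

If the dataset is not registered by name, hpo_pipeline also accepts training, testing, and validation arguments in place of dataset; see the Running HPO tutorial for the details.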

serenalotreck commented 1 year ago

Ah, that wasn't clear to me, thank you!