Filco306 opened 1 year ago
To further update: I realized I had constrained my search to sLCWA runs only, so the config above does not correspond to the run presented in the paper (Table 18). Switching to the one in the paper gives me 0.94 instead of 0.98 on ComplEx, but I think that should be good enough given that results can never be reproduced exactly. RotatE also seems to give decent results on Kinship now (0.98 hits@10). However, I still do not know why my results are markedly lower for the settings above.
The validation curves for the different models also look very strange, but I guess that is a consequence of large-scale hyperparameter tuning :)
Hi @Filco306
if you are looking at the validation curves generated by the EvaluationLoopTrainingCallback, i.e.,
if "callbacks" not in config["pipeline"]["training_kwargs"]:
config["pipeline"]["training_kwargs"]["callbacks"] = ["evaluation-loop"]
if "callback_kwargs" not in config["pipeline"]["training_kwargs"]:
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
"prefix": "validation"
}
you may be missing filtering with the training triples, too. To do so, you would need to pass the additional key additional_filter_triples to callback_kwargs, i.e.,
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
"prefix": "validation",
"additional_filter_triples": dataset.training,
}
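(Without the training triples in the filter, triples that are already known to be true from the training split still compete in the ranking during "filtered" evaluation, which typically makes the validation metrics look much lower than they should.)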
This is a bit hidden, since this parameter goes from the EvaluationLoopTrainingCallback.__init__ via kwargs through pykeen.evaluation.Evaluator.evaluate to pykeen.evaluation.evaluate 😅
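As a side note, the same filtered computation can also be run by hand once training has finished; the following is only a rough sketch to illustrate where additional_filter_triples ends up (using the standard RankBasedEvaluator and a small model trained purely for illustration), not part of the callback mechanics above:
from pykeen.datasets import get_dataset
from pykeen.evaluation import RankBasedEvaluator
from pykeen.pipeline import pipeline

dataset = get_dataset(dataset="nations")
# train a small model only so that the example is self-contained
result = pipeline(dataset=dataset, model="mure", training_kwargs=dict(num_epochs=5))

evaluator = RankBasedEvaluator(filtered=True)
metrics = evaluator.evaluate(
    model=result.model,
    mapped_triples=dataset.validation.mapped_triples,
    # filter out triples known to be true from the training split when ranking
    additional_filter_triples=[dataset.training.mapped_triples],
)
print(metrics.to_dict())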
Hi there,
Thank you for your reply! :D I will re-run the experiment in question with your comment in mind and see if that fixes the results. If not, I'll get back to you :)
Thanks! :D
One more thing I noticed: https://pykeen.readthedocs.io/en/stable/api/pykeen.training.callbacks.EvaluationLoopTrainingCallback.html also needs the factory on which to evaluate, i.e.,
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
"prefix": "validation",
"factory": dataset.validation,
"additional_filter_triples": dataset.training,
}
Hi again @mberr,
If I add what you suggested,
config["pipeline"]["training_kwargs"]["callback_kwargs"] = {
"prefix": "validation",
"factory": dataset.validation,
"additional_filter_triples": dataset.training,
}
I get the error:
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/training/training_loop.py", line 378, in train
result = self._train(
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/training/training_loop.py", line 734, in _train
callback.post_epoch(epoch=epoch, epoch_loss=epoch_loss)
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/training/callbacks.py", line 438, in post_epoch
callback.post_epoch(epoch=epoch, epoch_loss=epoch_loss, **kwargs)
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/training/callbacks.py", line 325, in post_epoch
result = self.evaluation_loop.evaluate(**self.kwargs)
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/evaluation/evaluation_loop.py", line 196, in evaluate
return _evaluate(
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/torch_max_mem/api.py", line 293, in inner
result, self.parameter_value[h] = wrapped(*args, **kwargs)
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/torch_max_mem/api.py", line 193, in wrapper_maximize_memory_utilization
func(*bound_arguments.args, **p_kwargs, **bound_arguments.kwargs),
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/evaluation/evaluation_loop.py", line 82, in _evaluate
loader = loop.get_loader(batch_size=batch_size, **kwargs)
File "/home/users/filip/.conda/envs/kgexperiments/lib/python3.10/site-packages/pykeen/evaluation/evaluation_loop.py", line 149, in get_loader
return DataLoader(
TypeError: DataLoader.__init__() got an unexpected keyword argument 'additional_filter_triples'
Would you know what the issue is here? Note that I still get the warning "WARNING:pykeen.evaluation.evaluation_loop:Enabled filtered evaluation, but not additional filter triples are passed.", so it does not seem to be passed properly.
Okay, this seems to be a bug in EvaluationLoop, which does not properly forward this argument to instantiate the LCWAEvaluationDataset here.
I used this smaller snippet to reproduce your error:
from pykeen.pipeline import pipeline
from pykeen.datasets import get_dataset

dataset = get_dataset(dataset="nations")
result = pipeline(
    dataset=dataset,
    model="mure",
    training_kwargs=dict(
        num_epochs=5,
        callbacks="evaluation-loop",
        callback_kwargs=dict(
            frequency=1,
            prefix="validation",
            factory=dataset.validation,
            additional_filter_triples=dataset.training,
        ),
    ),
)
EDIT: I opened a ticket here: https://github.com/pykeen/pykeen/issues/1213
Hello again @mberr,
Thank you for this! Yes, I believe it is a bug. Thank you for flagging it!
Hello!
Thank you for a nice study and a nice repository! :D I am currently trying to re-use some of your hyperparameters from the study, e.g., those for ComplEx on the YAGO3-10 dataset. However, upon trying to use the config files with the current version of PyKEEN, I get the error that owa is not an option, and that only ['lcwa', 'slcwa'] are valid options. I saw that you changed the name of OWA to SLCWA, so I switched from OWA to SLCWA as instructed.

However, training locally with PyKEEN 1.9.0 and slcwa gives me very different results: on the validation set I get extremely low metrics, although the test-set results seem pretty decent (but still far from the reported metrics). For this specific run, I got the following for the corresponding metrics:

I attach my training script below; I am most likely doing something wrong or not considering some specific setting that was updated in the more recent version of PyKEEN. Thanks again for a nice tool! :)
The results from the database can be seen below.
Config file (originally this one):
Running the following gives me great output metrics:
Version:
Is there some change in the packages since it was last run that causes this mismatch, or am I perhaps using the package incorrectly? Thank you for your time, and thank you for your package!
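For reference, a minimal sketch of loading one of the published JSON configs and applying the owa → slcwa rename before running it (the file name is illustrative, and it is an assumption that the training loop is stored under config["pipeline"]["training_loop"]):
import json

from pykeen.pipeline import pipeline_from_config

# load a config from the benchmarking study (path is illustrative)
with open("complex_yago310.json") as f:
    config = json.load(f)

# older configs use "owa"; current pykeen only accepts "lcwa" / "slcwa"
if config["pipeline"].get("training_loop") == "owa":
    config["pipeline"]["training_loop"] = "slcwa"

result = pipeline_from_config(config)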