monarch-initiative / embiggen

🍇 Embiggen is the Python Graph Representation learning, Prediction and Evaluation submodule of the GRAPE library.
BSD 3-Clause "New" or "Revised" License

Exposing parameter for `use_zipfian_sampling` #268

Closed LucaCappelletti94 closed 2 years ago

LucaCappelletti94 commented 2 years ago

As per @pnrobinson's request, the `use_zipfian_sampling` parameter should be exposed, to make it easier to measure how the choice between uniform and Zipfian sampling of negative edges biases the evaluation of a model.
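To make the two sampling modes concrete, here is a minimal sketch. It is not Embiggen's implementation: it assumes that "Zipfian" sampling means drawing each endpoint with probability proportional to its degree, and the helpers `node_degrees`, `sample_negative_edges` and `mean_endpoint_degree`, as well as the two-hub toy graph, are hypothetical names introduced only for illustration.

```python
import random
from collections import Counter


def node_degrees(edges):
    """Count how many edges touch each node (undirected view)."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return degree


def sample_negative_edges(edges, num_nodes, n, use_zipfian_sampling=True, seed=0):
    """Hypothetical helper, NOT the Embiggen API.

    Draws n node pairs that are not existing edges. When
    use_zipfian_sampling is True, endpoints are drawn proportionally to
    their degree (assumed meaning of "Zipfian"); otherwise uniformly.
    """
    rng = random.Random(seed)
    existing = set(edges) | {(dst, src) for src, dst in edges}
    degree = node_degrees(edges)
    nodes = list(range(num_nodes))
    # weights=None makes random.choices fall back to uniform sampling.
    weights = [degree[v] for v in nodes] if use_zipfian_sampling else None
    negatives = []
    while len(negatives) < n:
        src, dst = rng.choices(nodes, weights=weights, k=2)
        # Reject self-loops and pairs that are real edges.
        if src != dst and (src, dst) not in existing:
            negatives.append((src, dst))
    return negatives


def mean_endpoint_degree(pairs, degree):
    """Average degree over all endpoints of the given node pairs."""
    return sum(degree[s] + degree[d] for s, d in pairs) / (2 * len(pairs))


# Toy graph: two hubs (nodes 0 and 1), each linked to 49 leaves of its own.
edges = [(0, leaf) for leaf in range(2, 51)] + [(1, leaf) for leaf in range(51, 100)]
degree = node_degrees(edges)

zipfian = sample_negative_edges(edges, 100, 2000, use_zipfian_sampling=True)
uniform = sample_negative_edges(edges, 100, 2000, use_zipfian_sampling=False)

print("positive edges:", mean_endpoint_degree(edges, degree))
print("zipfian negatives:", mean_endpoint_degree(zipfian, degree))
print("uniform negatives:", mean_endpoint_degree(uniform, degree))
```

On this toy graph the degree-weighted negatives have endpoint degrees in the same range as the real edges, while the uniform negatives are almost all leaf-to-leaf pairs. That mismatch in degree between positives and negatives is the bias in question.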

LucaCappelletti94 commented 2 years ago

Exposed the `use_zipfian_sampling` parameter for the TRAINING of sklearn models and for the EVALUATION of models in the pipeline, in commit 8cfed383af2822acf04f40c886415b560f68e1ac.

Furthermore, I extended the documentation on the evaluation schemas available in the pipeline.

An extensive warning about the bias has been added to caution users against disabling the Zipfian sampling, so as to avoid future papers titled "uniform edge sampling boosts performance".
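The following toy experiment sketches why disabling the Zipfian sampling can inflate reported performance. It is an illustration under stated assumptions, not code from Embiggen: "Zipfian" is taken to mean degree-proportional sampling, the "model" is a hypothetical heuristic that scores an edge by the product of its endpoint degrees and has learned nothing else, and `draw_negatives`, `score` and `auc` are names made up for this example.

```python
import random
from collections import Counter

# Toy graph: two hubs (nodes 0 and 1), each linked to 49 leaves of its own.
edges = [(0, v) for v in range(2, 51)] + [(1, v) for v in range(51, 100)]
degree = Counter(node for edge in edges for node in edge)
known = set(edges) | {(d, s) for s, d in edges}
nodes = list(range(100))


def draw_negatives(n, weights, rng):
    """Draw n non-edges; weights=None means uniform endpoint sampling."""
    out = []
    while len(out) < n:
        s, d = rng.choices(nodes, weights=weights, k=2)
        if s != d and (s, d) not in known:
            out.append((s, d))
    return out


def score(edge):
    """A 'model' that only knows node degrees, nothing about structure."""
    return degree[edge[0]] * degree[edge[1]]


def auc(positives, negatives):
    """Pairwise AUC: P(score(pos) > score(neg)), ties counted as one half."""
    wins = 0.0
    for p in positives:
        for q in negatives:
            if score(p) > score(q):
                wins += 1.0
            elif score(p) == score(q):
                wins += 0.5
    return wins / (len(positives) * len(negatives))


rng = random.Random(42)
uniform_auc = auc(edges, draw_negatives(500, None, rng))
zipfian_auc = auc(edges, draw_negatives(500, [degree[v] for v in nodes], rng))

print(f"AUC with uniform negatives: {uniform_auc:.3f}")
print(f"AUC with zipfian negatives: {zipfian_auc:.3f}")
```

With uniform negatives the degree-only heuristic scores an AUC close to 1, because nearly every negative is a low-degree pair that is trivially separable from the real edges. With degree-weighted negatives the same heuristic drops much closer to chance, which is a fairer reflection of how little it has learned.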

@pnrobinson do let me know whether the use of this parameter seems clear and whether it is properly documented. I will make the updated GraPE / Embiggen available on PyPI as soon as I fix some of the other reported issues.