Closed LucaCappelletti94 closed 2 years ago
Exposed the parameter `use_zipfian_sampling` for the TRAINING of sklearn models and for the EVALUATION of models in the pipeline in commit 8cfed383af2822acf04f40c886415b560f68e1ac.
Furthermore, I extended the documentation on the evaluation schemas available in the pipeline.
An extensive warning about the bias has been added to warn users against disabling the zipfian sampling, so as to avoid future papers titled "uniform edge sampling boosts performance".
@pnrobinson do let me know whether the use of this parameter seems clear and whether it is properly documented. I will make the updated GraPE / Embiggen available on PyPI as soon as I fix some of the other reported issues.
As per @pnrobinson's request, the `use_zipfian_sampling` parameter should be exposed so as to more easily evaluate the impact of the uniform/zipfian negative edge sampling bias in the evaluation of a model.
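
To make the bias concrete, here is a minimal standalone sketch. It does not use the GraPE / Embiggen API. It assumes that "zipfian" sampling means drawing the endpoints of negative edges with probability proportional to node degree, and the helper names (`toy_scale_free_edges`, `sample_negative_edges`, `mean_degree_sum`) are hypothetical, introduced only for illustration. On a toy preferential-attachment tree, it compares the average degree of the endpoints of uniformly sampled negatives with that of degree-proportional negatives and of the positive edges.

```python
import random
from collections import Counter


def toy_scale_free_edges(num_nodes, seed=0):
    """Build a small preferential-attachment tree (hypothetical toy graph)."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    endpoints = [0, 1]  # each node appears once per incident edge
    for new in range(2, num_nodes):
        # Picking from `endpoints` selects a target proportionally to its degree.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints += [new, target]
    return edges


def sample_negative_edges(edges, num_nodes, n, degree_proportional, seed=0):
    """Sample n node pairs that are not existing edges (undirected, no self-loops)."""
    rng = random.Random(seed)
    existing = {frozenset(edge) for edge in edges}
    degree = Counter(node for edge in edges for node in edge)
    nodes = list(range(num_nodes))
    # Assumption: "zipfian" means endpoints are drawn proportionally to degree;
    # with weights=None every node is equally likely (uniform sampling).
    weights = [degree[v] for v in nodes] if degree_proportional else None
    negatives = []
    while len(negatives) < n:
        src, dst = rng.choices(nodes, weights=weights, k=2)
        if src != dst and frozenset((src, dst)) not in existing:
            negatives.append((src, dst))
    return negatives


def mean_degree_sum(pairs, edges):
    """Average of deg(src) + deg(dst) over the given node pairs."""
    degree = Counter(node for edge in edges for node in edge)
    return sum(degree[a] + degree[b] for a, b in pairs) / len(pairs)


num_nodes = 300
edges = toy_scale_free_edges(num_nodes)
uniform_negatives = sample_negative_edges(edges, num_nodes, 2000, degree_proportional=False)
zipfian_negatives = sample_negative_edges(edges, num_nodes, 2000, degree_proportional=True)

print("positives:", mean_degree_sum(edges, edges))
print("uniform negatives:", mean_degree_sum(uniform_negatives, edges))
print("degree-proportional negatives:", mean_degree_sum(zipfian_negatives, edges))
```

Under these assumptions, uniformly sampled negatives mostly connect low-degree nodes, while positive edges tend to touch hubs. A model can then separate the two classes using node degree alone, which inflates the evaluation scores without reflecting any real predictive ability. Degree-proportional negatives have endpoint degrees much closer to those of the positives, which removes that shortcut.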