MKLab-ITI / pygrank

Recommendation algorithms for large graphs
Apache License 2.0

Tune on non-seeds? #17

Closed by deklanw 1 year ago

deklanw commented 1 year ago

Is it possible to run the tuners with non-seed nodes? For example, if I have a seed_set and a target_set, can I run the tuner diffusions with the signal from the former but optimize for metrics defined with respect to the latter? In this case, I have a desired ranking of the nodes in the target_set.

maniospas commented 1 year ago

This is a useful use case that should probably have been explicitly supported by the interface. Right now, you need to pass a lambda expression as the measure constructor so that it always returns the same instance:

import pygrank as pg
# or spearman or pearson correlation if your target set is non-binary
measure = pg.AUC(target_set)
# fraction_of_training=1 uses 100% of seeds when running algorithms internally
algorithm = pg.ParameterTuner(measure=lambda *args: measure, fraction_of_training=1)
print(algorithm(graph, seed_set))
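To illustrate why a constant lambda does the job, here is a minimal, library-free sketch of the pattern. Everything in it (MeanScore, toy_tuner, the candidate score dictionaries) is a hypothetical stand-in and not pygrank's actual internals; it only assumes that a tuner calls the measure constructor with some seed-derived data and then scores each candidate with the resulting measure.

```python
# Hypothetical stand-ins for illustration only; not pygrank's implementation.

class MeanScore:
    """Toy measure: mean score that a ranking assigns to a fixed node list."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def __call__(self, scores):
        return sum(scores.get(n, 0.0) for n in self.nodes) / len(self.nodes)


def toy_tuner(candidates, seeds, measure_constructor):
    """Pick the candidate whose scores maximize the constructed measure."""
    # A tuner would normally build its measure from (held-out) seeds.
    measure = measure_constructor(seeds)
    return max(candidates, key=lambda name: measure(candidates[name]))


# Precomputed node scores of two imaginary diffusions started from the seeds.
candidates = {
    "stay_near_seeds": {"s1": 0.9, "s2": 0.8, "t1": 0.1, "t2": 0.1},
    "diffuse_far": {"s1": 0.4, "s2": 0.4, "t1": 0.6, "t2": 0.7},
}
seed_set = ["s1", "s2"]
target_set = ["t1", "t2"]

# Default behaviour: the measure is built from the seeds themselves.
default_choice = toy_tuner(candidates, seed_set, MeanScore)

# Constant constructor: ignores its arguments and always returns the
# prebuilt target measure, so candidates are scored against target_set.
target_measure = MeanScore(target_set)
target_choice = toy_tuner(candidates, seed_set, lambda *args: target_measure)

print(default_choice, target_choice)
```

The seeds still drive the diffusion, but because the constructor discards whatever it is given, the optimization objective is defined entirely by target_set.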

P.S. For large graphs, you might be interested in algorithm selection instead of granular parameter tuning:

competing_algorithms = pg.create_many_filters().values()
algorithm = pg.AlgorithmSelection(competing_algorithms, measure=lambda *args: measure, fraction_of_training=1)

This will just run through a predetermined set of filters. If you have specific algorithms in mind that you suspect will work well, you can provide a custom list instead. Note that you can also create algorithms with binary outcomes like this: pg.HeatKernel(3) >> pg.Threshold(0.1), though you shouldn't use AUC to evaluate binary outcomes.
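A rough sketch of why AUC is a poor fit for binary outcomes: AUC measures how well scores rank positives above negatives, and thresholding turns most of that ranking into ties. The pure-Python auc and threshold helpers below are hypothetical illustrations (a pairwise AUC that counts ties as half, and a plain cutoff that may differ from what pg.Threshold does exactly); the labels and scores are made up.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold(scores, cutoff):
    """Plain cutoff used for illustration: 1.0 above the cutoff, else 0.0."""
    return [1.0 if s > cutoff else 0.0 for s in scores]


labels = [1, 1, 0, 0]            # made-up ground truth
scores = [0.9, 0.3, 0.2, 0.15]   # made-up graded scores, perfectly ranked

auc_graded = auc(labels, scores)                     # full ranking available
auc_low_cut = auc(labels, threshold(scores, 0.1))    # every node passes the cutoff
auc_high_cut = auc(labels, threshold(scores, 0.5))   # only the top node passes

print(auc_graded, auc_low_cut, auc_high_cut)
```

The graded scores rank both positives above both negatives, but once they are thresholded the result depends heavily on the cutoff and can fall to chance level, even though the underlying ranking never changed.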

deklanw commented 1 year ago

Thanks, this is perfect