ULTR-Community / ULTRA_pytorch

Unbiased Learning To Rank Algorithms (ULTRA)
https://ultr-community.github.io/ULTRA_pytorch/
Apache License 2.0
93 stars 8 forks

usage from Python #1

Open cmacdonald opened 3 years ago

cmacdonald commented 3 years ago

Hi,

Thanks for releasing this interesting platform.

I'm interested in how ULTRA could be used from another Python library. For example, say I wanted to use an ULTRA model to re-rank results in PyTerrier.

With sklearn, xgboost, fastrank, etc., I can pass in an array of feature values for a given document and get back a score. See https://pyterrier.readthedocs.io/en/latest/ltr.html#learning for our integration.

Does ULTRA have a similar API?

QingyaoAi commented 3 years ago

Thanks for your kind comment!

Yes, I do think it's possible to use a ranking model built with ULTRA in PyTerrier. While the APIs are not exactly the same, it should be fairly easy to build an adapter to connect them. For example, the classes in ULTRA (i.e., ultra.ranking_models) have a function named "build", which takes a list of documents (feature vectors) as input and outputs a list of ranking scores for all of them together. We designed the API this way to allow building multivariate ranking functions such as DLCM and GSF. @anhtran1010 may know more of the details.
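To make the adapter idea concrete, here is a minimal sketch of the list-in, scores-out pattern described above. It is not ULTRA code: `toy_list_scorer` and `rerank` are hypothetical names, and the toy scorer only stands in for a trained model whose scoring call takes all of a query's feature vectors at once. The real "build" function may expect different argument types, so treat this as an illustration under those assumptions.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed shape of a listwise scorer: all feature vectors for one query in,
# one score per document out. This is NOT the actual ULTRA signature.
ListScorer = Callable[[Sequence[Sequence[float]]], List[float]]


def toy_list_scorer(doc_features: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in for a trained multivariate model.

    Each score depends on the whole candidate list (here, the list mean),
    which is why the documents have to be scored together.
    """
    sums = [sum(vec) for vec in doc_features]
    mean = sum(sums) / len(sums)
    return [s - mean for s in sums]


def rerank(
    doc_ids: Sequence[str],
    doc_features: Sequence[Sequence[float]],
    scorer: ListScorer,
) -> List[Tuple[str, float]]:
    """Score every candidate for one query in a single call, then sort."""
    if len(doc_ids) != len(doc_features):
        raise ValueError("doc_ids and doc_features must have the same length")
    if not doc_ids:
        return []
    scores = scorer(doc_features)
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(zip(doc_ids, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    ranked = rerank(
        ["d1", "d2", "d3"],
        [[0.1, 0.2], [0.9, 0.5], [0.4, 0.4]],
        toy_list_scorer,
    )
    for doc_id, score in ranked:
        print(doc_id, round(score, 3))
```

The point of the wrapper is that a per-query call, rather than a per-document call, is what an integration would need to hand to a listwise model.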

cmacdonald commented 3 years ago

> which takes a list of documents (feature vectors) as inputs and output a list of ranking scores together

Yes, this is sufficient.

What about training - does your API look like xgboost/lightgbm?

PyTerrier is essentially a set of wrappers around Pandas dataframes. We munge features and qrels into the inputs of, e.g., sklearn or LightGBM .fit() methods. For an example, see https://github.com/terrier-org/pyterrier/blob/master/pyterrier/ltr.py#L129
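As a rough illustration of that munging step (a sketch only, not the PyTerrier implementation linked above), the snippet below turns flat result rows into a feature matrix, a label list, and per-query group sizes, which is the general shape that group-aware rankers such as LightGBM's ranker take in fit(). The helper name `to_fit_arrays` and the dict keys `qid`, `features`, and `label` are assumptions chosen for the example.

```python
from itertools import groupby
from operator import itemgetter
from typing import Dict, List, Sequence, Tuple


def to_fit_arrays(rows: Sequence[Dict]) -> Tuple[List[List[float]], List[int], List[int]]:
    """Convert flat (query, document) rows into X, y and per-query group sizes.

    Hypothetical helper: each row is a dict with 'qid', 'features' and 'label'.
    """
    # Sort by query id so each query's documents are contiguous. The sort is
    # stable, so documents keep their original order within a query.
    ordered = sorted(rows, key=itemgetter("qid"))
    X = [list(row["features"]) for row in ordered]
    y = [row["label"] for row in ordered]
    # One entry per query: how many consecutive rows belong to it.
    group = [sum(1 for _ in docs) for _, docs in groupby(ordered, key=itemgetter("qid"))]
    return X, y, group


if __name__ == "__main__":
    rows = [
        {"qid": "q2", "docno": "d3", "features": [0.4, 0.4], "label": 0},
        {"qid": "q1", "docno": "d1", "features": [0.1, 0.2], "label": 1},
        {"qid": "q1", "docno": "d2", "features": [0.9, 0.5], "label": 0},
    ]
    X, y, group = to_fit_arrays(rows)
    print(X, y, group)
```

A training entry point that accepted arrays like these would be straightforward to call from a dataframe-based pipeline.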

QingyaoAi commented 3 years ago

No, the current API is different from xgboost/lightgbm. However, I don't think it would be difficult to revise it to match xgboost/lightgbm. We are considering adding support for LightGBM, but haven't done anything in this direction yet. We will definitely put it on our development agenda!

cmacdonald commented 3 years ago

Perhaps we could discuss putting together a demonstration Colab notebook. An initial version would train using your command line scripts, then re-rank using the learned model's build() function.

This notebook demonstrates LTR on the TREC Covid test collection. It also shows John Foley's Fastrank in use. Perhaps it could be used as a starting point.

QingyaoAi commented 3 years ago

Sounds like a great plan! I will discuss it with @anhtran1010 and @Taosheng-ty to see how we can make it happen. Thanks a lot for the suggestion!