Project for UCL Information Retrieval 2016: Learning to Rank (LETOR)
Implementation of Active Learning for Ranking through Expected Loss Optimization
We compare two LETOR models, AdaRank and LambdaMART, and then examine another approach to LETOR: active learning via Expected Loss Optimization (ELO).
The following runs AdaRank on our dataset; change `-ranker` to 6 to run LambdaMART instead:
$ java -jar bin/RankLib.jar -train ../data/MQ2016/base1024/Fold1/train.txt -test ../data/MQ2016/active_learning/test.txt -validate ../data/MQ2016/base1024/Fold1/vali.txt -ranker 3 -metric2t DCG@10
Fold | Training Set | Validation Set | Test Set
---|---|---|---
Fold1 | {S1,S2,S3} | S4 | S5 |
Fold2 | {S2,S3,S4} | S5 | S1 |
Fold3 | {S3,S4,S5} | S1 | S2 |
Fold4 | {S4,S5,S1} | S2 | S3 |
Fold5 | {S5,S1,S2} | S3 | S4 |
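The rotation above can be reproduced programmatically. A minimal sketch (illustrative only, not part of the repo; `make_folds` is a hypothetical helper):

```python
SEGMENTS = ["S1", "S2", "S3", "S4", "S5"]

def make_folds(segments):
    """For each fold, train on three consecutive segments, validate on
    the next, and test on the one after that, wrapping around."""
    n = len(segments)
    folds = []
    for i in range(n):
        folds.append({
            "train": [segments[(i + j) % n] for j in range(3)],
            "vali": segments[(i + 3) % n],
            "test": segments[(i + 4) % n],
        })
    return folds

for k, fold in enumerate(make_folds(SEGMENTS), start=1):
    print(f"Fold{k}: train={fold['train']} vali={fold['vali']} test={fold['test']}")
# Fold1: train=['S1', 'S2', 'S3'] vali=S4 test=S5
```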
The following trains LambdaMART on MQ2008 Fold1, optimising NDCG@10 during training, reporting ERR@10 on the test set, and saving the learned model:

$ java -jar bin/RankLib.jar -train ../data/MQ2008/Fold1/train.txt -test ../data/MQ2008/Fold1/test.txt -validate ../data/MQ2008/Fold1/vali.txt -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save mymodel.txt
.
├── data
│   └── MQ2016                  # segmented MQ2007 data
│       ├── S1.txt
│       ├── S2.txt
│       ├── S3.txt
│       ├── S4.txt
│       ├── S5.txt
│       ├── base512             # segmented data
│       ├── base1024
│       ├── base2048
│       ├── base4096
│       ├── base8192
│       ├── base16384
│       ├── base32768
│       ├── base65536
│       └── active_learning/*   # all pre-processed data
├── literature
├── poster
├── ranklib
├── report
├── results
└── active_learning             # source code for active learning
    ├── __init__.py
    ├── constants.py
    ├── elo_active_learning.py
    ├── pre_processing.py
    └── util.py
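The core idea of Expected Loss Optimization is to select for labelling the queries whose predicted ranking is most uncertain: given posterior samples of document relevance, the expected loss is the gap between the DCG of the per-sample oracle ranking and the best DCG any single fixed ranking achieves in expectation. A simplified sketch of that criterion (illustrative only, not the repo's `elo_active_learning.py`; `dcg` and `expected_loss` are hypothetical names, and the exhaustive search over permutations is feasible only for a handful of documents per query):

```python
import itertools

import numpy as np


def dcg(rels):
    """Discounted cumulative gain of a relevance list in rank order."""
    return sum(r / np.log2(i + 2) for i, r in enumerate(rels))


def expected_loss(score_samples):
    """score_samples: (n_samples, n_docs) array of posterior relevance draws.

    Returns E[DCG of each sample's oracle ranking] minus the best expected
    DCG achievable by one fixed ranking; larger values mean the model is
    less certain how to rank this query's documents.
    """
    oracle = np.mean([dcg(sorted(s, reverse=True)) for s in score_samples])
    n_docs = score_samples.shape[1]
    best_fixed = max(
        np.mean([dcg(s[list(perm)]) for s in score_samples])
        for perm in itertools.permutations(range(n_docs))
    )
    return oracle - best_fixed
```

Under this criterion, active learning would repeatedly score all unlabelled queries, send the highest-expected-loss ones for labelling, and retrain. If every posterior sample agrees on the ordering, the expected loss is zero and the query gains little from a label.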