Active Learning for Learning to Rank (LETOR)

Project for UCL Information Retrieval 2016: Learning to Rank (LETOR)

Implementation of Active Learning for Ranking through Expected Loss Optimization

We compare two LETOR models, AdaRank and LambdaMART, and then evaluate another approach to LETOR: active learning through Expected Loss Optimization (ELO).
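As a rough sketch of the ELO idea (not the exact code in active_learning/elo_active_learning.py), each candidate query's expected DCG loss is estimated by sampling relevance labels from the model's posterior and measuring how much DCG the current ranking is expected to give up; the queries with the highest expected loss are selected for labeling. The Bernoulli relevance model and the function names below are assumptions made for illustration:

```python
import numpy as np

def expected_dcg_loss(p_rel, k=10, n_samples=200, rng=None):
    """Monte-Carlo estimate of expected DCG@k loss for one query.

    p_rel: per-document probability of relevance (a stand-in for the
    model's posterior; an assumption made for this sketch).
    """
    rng = rng or np.random.default_rng(0)
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    ranking = np.argsort(-p_rel)[:k]                 # current ranking by predicted score
    losses = []
    for _ in range(n_samples):
        rel = (rng.random(len(p_rel)) < p_rel).astype(float)  # sampled relevance labels
        ideal = np.sort(rel)[::-1][:k]                         # best achievable ordering
        dcg_best = np.sum(ideal * discounts[:len(ideal)])
        dcg_cur = np.sum(rel[ranking] * discounts[:len(ranking)])
        losses.append(dcg_best - dcg_cur)            # loss incurred by the current ranking
    return float(np.mean(losses))

def select_queries(posteriors, budget):
    """Return the `budget` query ids with the largest expected loss."""
    ranked = sorted(posteriors, key=lambda q: expected_dcg_loss(posteriors[q]), reverse=True)
    return ranked[:budget]
```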

How to Run the Code

Active Learning (ELO)
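The ELO experiments live in the active_learning/ package (see the folder structure below). A plausible invocation, assuming pre_processing.py is run first to generate data/MQ2016/active_learning/* and that both scripts read their paths from constants.py (the exact entry points and arguments are not documented here, so treat these as assumptions):

$ python active_learning/pre_processing.py
$ python active_learning/elo_active_learning.py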

RankLib Models (AdaRank, LambdaMART)

The following command runs AdaRank on our dataset; change -ranker to 6 to run LambdaMART instead.

$ java -jar bin/RankLib.jar -train ../data/MQ2016/base1024/Fold1/train.txt -test ../data/MQ2016/active_learning/test.txt -validate ../data/MQ2016/base1024/Fold1/vali.txt -ranker 3 -metric2t DCG@10
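For reference, the equivalent LambdaMART run (same data paths, only the ranker id changed) is:

$ java -jar bin/RankLib.jar -train ../data/MQ2016/base1024/Fold1/train.txt -test ../data/MQ2016/active_learning/test.txt -validate ../data/MQ2016/base1024/Fold1/vali.txt -ranker 6 -metric2t DCG@10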

Data

Folds   Training Set   Validation Set   Test Set
Fold1   {S1,S2,S3}     S4               S5
Fold2   {S2,S3,S4}     S5               S1
Fold3   {S3,S4,S5}     S1               S2
Fold4   {S4,S5,S1}     S2               S3
Fold5   {S5,S1,S2}     S3               S4
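For example, Fold1's files can be assembled by concatenating the corresponding segments. This is only a minimal sketch; pre_processing.py presumably handles this with the paths configured in constants.py, and the base1024 output directory is used purely as an example:

```python
from pathlib import Path

# Which segments make up each split of Fold1 (from the table above).
FOLD1 = {"train": ["S1.txt", "S2.txt", "S3.txt"],
         "vali":  ["S4.txt"],
         "test":  ["S5.txt"]}

def build_fold(splits, src_dir="data/MQ2016", out_dir="data/MQ2016/base1024/Fold1"):
    """Concatenate the S*.txt segments into the fold's train/vali/test files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for split, segments in splits.items():
        with open(out / f"{split}.txt", "w") as dst:
            for seg in segments:
                dst.write(Path(src_dir, seg).read_text())

build_fold(FOLD1)
```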

Frameworks

Folder Structure

.  
├── data 
│   ├── MQ2016                      # segmented MQ2007 data
│   │   ├── S1.txt
│   │   ├── S2.txt
│   │   ├── S3.txt
│   │   ├── S4.txt
│   │   ├── S5.txt
│   │   ├── base512                 # segmented data
│   │   ├── base1024
│   │   ├── base2048
│   │   ├── base4096
│   │   ├── base8192
│   │   ├── base16384
│   │   ├── base32768
│   │   ├── base65536
│   │   └── active_learning/*       # all pre-processed data
├── literature  
├── poster  
├── ranklib  
├── report  
├── results
└── active_learning                 # source code for active learning
    ├── __init__.py
    ├── constants.py  
    ├── elo_active_learning.py
    ├── pre_processing.py
    └── util.py