cmacdonald / pyterrier_bert


CEDR with SDM and BM25 #6

Open Tooba-ts1700550 opened 3 years ago

Tooba-ts1700550 commented 3 years ago

I am trying to use SDM and BM25 with CEDR, by following the docs, but I think I'm missing something.

SDM = pt.rewrite.SDM()
BM25 = pt.BatchRetrieve(indexref, controls={"wmodel" : "BM25"}, verbose=True, metadata=["docno", "text"])

SDM_BM25_cedrpipe = SDM >> BM25 >> CEDRPipeline(max_valid_rank=20)
# training, this uses validation set to apply early stopping
SDM_BM25_cedrpipe.fit(topicsTrain, qrelsTrain, topicsValid, qrelsTrain)

pt.pipelines.Experiment(topicsTest, 
                        [SDM, SDM_BM25_cedrpipe, DLM_qe_cedrpipe],
                        ["map", "recip_rank", "P.10", "ndcg_cut.10", "mrt"], 
                        qrelsTest, 
                        names=["SDM", "SDM_BM25_cedrpipe"])

I get this error:

File "<ipython-input-14-684c058c1417>", line 6, in <module>
    SDM_BM25_cedrpipe.fit(topicsTrain, qrelsTrain, topicsValid, qrelsTrain)

  File "/home/anaconda3/lib/python3.7/site-packages/pyterrier/transformer.py", line 779, in fit
    m.fit(topics_or_res_tr, qrels_tr, topics_or_res_va, qrels_va)

  File "/home/anaconda3/lib/python3.7/site-packages/pyterrier_bert/pyt_cedr.py", line 55, in fit
    valid_run = self._make_cedr_run(va, qrelsValid)

  File "/home/anaconda3/lib/python3.7/site-packages/pyterrier_bert/pyt_cedr.py", line 27, in _make_cedr_run
    final_DF = add_label_column(run_df, qrels_df)

  File "/home/anaconda3/lib/python3.7/site-packages/pyterrier_bert/__init__.py", line 21, in add_label_column
    raise ValueError("No queries with relevant documents")

ValueError: No queries with relevant documents

cmacdonald commented 3 years ago

Error seems clear to me.
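One plausible reading of the message, given the posted call: fit() receives qrelsTrain as its fourth argument, next to topicsValid. If the validation query ids do not appear in the training qrels, no validation query has a labelled relevant document. The traceback does not show what add_label_column does internally, so the sketch below only mimics the condition its message describes. The helper name and the toy data are hypothetical.

```python
def queries_with_relevant_docs(topic_qids, qrels):
    """Return the qids in topic_qids that have at least one qrels row with label > 0.

    Hypothetical helper: it imitates the condition named in the error message,
    not the actual pyterrier_bert implementation.
    """
    relevant_qids = {row["qid"] for row in qrels if row["label"] > 0}
    return sorted(set(topic_qids) & relevant_qids)


# Toy data (made up): training and validation sets use disjoint query ids.
qrels_train = [
    {"qid": "1", "docno": "d1", "label": 1},
    {"qid": "2", "docno": "d7", "label": 0},
]
qrels_valid = [
    {"qid": "3", "docno": "d4", "label": 2},
]
topics_valid = ["3", "4"]

# Validation topics checked against the training qrels: nothing overlaps.
overlap_wrong = queries_with_relevant_docs(topics_valid, qrels_train)
# Validation topics checked against the validation qrels: query "3" qualifies.
overlap_right = queries_with_relevant_docs(topics_valid, qrels_valid)

print(overlap_wrong)  # []
print(overlap_right)  # ['3']
```

With pandas DataFrames the same check can be run on the rows returned by to_dict("records"). An empty result for the validation pair would be consistent with the ValueError above.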

Separately, see https://pyterrier.readthedocs.io/en/latest/rewrite.html#resetting-the-query-formulation
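As background on why that link matters here: SDM rewrites the query before retrieval, and a neural re-ranker such as CEDR would normally be given the original query text, not the rewritten one. According to the linked documentation, the reset step moves the formulation stored in the "query_0" column back into "query" (and shifts "query_1" to "query_0", and so on). In the posted pipeline that would presumably mean placing the reset between BM25 and CEDRPipeline. The sketch below only imitates that column shuffle on a plain dict; the helper name and the example strings are hypothetical, and it is not PyTerrier's implementation.

```python
def reset_query(row):
    """Undo one rewriting step on a single result row.

    Hypothetical stand-in for the documented behaviour: "query_0" is moved
    back to "query", "query_1" becomes "query_0", and so on.
    """
    if "query_0" not in row:
        raise KeyError("row has no earlier query formulation to restore")
    restored = dict(row)  # copy so the caller's row is left untouched
    restored["query"] = restored.pop("query_0")
    level = 1
    while f"query_{level}" in restored:
        restored[f"query_{level - 1}"] = restored.pop(f"query_{level}")
        level += 1
    return restored


# Illustrative row: "query" holds a rewritten form, "query_0" the original text.
rewritten_row = {
    "qid": "1",
    "query": "<rewritten SDM query>",
    "query_0": "chemical reactions",
}
restored_row = reset_query(rewritten_row)
print(restored_row["query"])  # chemical reactions
```

After the reset, a downstream re-ranker reading the "query" column sees the original text again.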

Tooba-ts1700550 commented 3 years ago

Running just the three lines of code from the linked docs, I am not sure why it cannot find reset():

AttributeError: module 'pyterrier.rewrite' has no attribute 'reset'

Just a side note, I think in the docs it should be:

pyterrier_bert.pyt_cedr.CEDRPipeline()

cmacdonald commented 3 years ago

Sorry, you are right, I didn't roll it into a release yet. Try the alternative install from GitHub: https://pyterrier.readthedocs.io/en/latest/installation.html#installation