nyu-dl / dl4marco-bert

Error during retrieval with CAR #36

Open canjiali opened 4 years ago

canjiali commented 4 years ago

Hey, I've indexed the dataset. When I use the topics you provided for retrieval, an error occurs for the training/dev set (test set works well):

2019-12-11 21:57:43,459 INFO  [main] search.SearchCollection (SearchCollection.java:212) - Reading index at /data/index/lucene-index.car17v2.0.pos+docvectors+rawdocs
2019-12-11 21:57:43,659 INFO  [main] search.SearchCollection (SearchCollection.java:239) - Use Bag of Terms query
java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 0 in: "%2"
        at java.base/java.net.URLDecoder.decode(URLDecoder.java:232)
        at java.base/java.net.URLDecoder.decode(URLDecoder.java:142)
        at io.anserini.search.topicreader.CarTopicReader.read(CarTopicReader.java:47)
        at io.anserini.search.topicreader.TopicReader.read(TopicReader.java:58)
        at io.anserini.search.SearchCollection.runTopics(SearchCollection.java:373)
        at io.anserini.search.SearchCollection.main(SearchCollection.java:559)
Exception in thread "main" java.lang.IllegalArgumentException: Unable to load topic reader: Car
        at io.anserini.search.SearchCollection.runTopics(SearchCollection.java:376)
        at io.anserini.search.SearchCollection.main(SearchCollection.java:559)
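
The `URLDecoder` message above indicates a `%` that is not followed by two hexadecimal digits. One way to narrow this down is to scan the topics file for such escapes. The following is a minimal sketch, assuming one topic ID per line; the function name `find_malformed_escapes` and the last two sample IDs are hypothetical and only illustrate what gets flagged.

```python
import re

# A '%' that is not followed by two hex digits is a malformed escape.
MALFORMED_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def find_malformed_escapes(lines):
    """Return (line_number, text) pairs for lines with a malformed percent escape."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if MALFORMED_ESCAPE.search(text):
            bad.append((number, text))
    return bad


# Hypothetical sample: the first ID comes from the dev set listing,
# the other two are made up to show a doubled '%' and a truncated escape.
sample = [
    "enwiki:007:%20Agent%20Under%20Fire",
    "enwiki:Example%%20Title",
    "enwiki:Truncated%2",
]
for number, text in find_malformed_escapes(sample):
    print(number, text)
```

Since the function accepts any iterable of lines, an open file handle for the train or dev topics file can be passed in directly. Any line it reports is a candidate for the one that makes `CarTopicReader` fail.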

I guess the problem is a format difference between the train/dev and test sets, since Anserini fails to parse the train/dev queries. The top 10 lines of the dev set are

enwiki:0
enwiki:00
enwiki:0-0-0
enwiki:007:%20Agent%20Under%20Fire
enwiki:007:%20Agent%20Under%20Fire/Development
enwiki:007:%20Agent%20Under%20Fire/Gameplay
enwiki:007:%20Agent%20Under%20Fire/Plot
enwiki:007:%20Agent%20Under%20Fire/Reception
enwiki:02.005%20Fighter%20Squadron%20%22%C3%8Ele-de-France%22
enwiki:02.005%20Fighter%20Squadron%20%22%C3%8Ele-de-France%22/Bases

while the top 10 lines of the test set are

enwiki:Aftertaste
enwiki:Aftertaste/Aftertaste%20processing%20in%20the%20cerebral%20cortex
enwiki:Aftertaste/Distinguishing%20aftertaste%20and%20flavor
enwiki:Aftertaste/Foods%20with%20distinct%20aftertastes/Artificial%20sweeteners
enwiki:Aftertaste/Foods%20with%20distinct%20aftertastes/Wine
enwiki:Aftertaste/Taste%20receptor%20dynamics
enwiki:Aftertaste/Temporal%20taste%20perception
enwiki:Aftertaste/Temporal%20taste%20perception/Variability%20of%20human%20taste%20perception
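
For comparison, the IDs listed from both sets appear to use the same percent-encoding and decode cleanly. The sketch below uses Python's `urllib.parse.unquote` as a stand-in; it is more lenient than Java's `URLDecoder` (it leaves malformed escapes untouched instead of raising), so it only illustrates the decoded form and is not what Anserini runs. The helper name `decode_topic_id` is hypothetical.

```python
from urllib.parse import unquote


def decode_topic_id(topic_id):
    """Percent-decode a CAR-style topic ID, interpreting the bytes as UTF-8."""
    return unquote(topic_id, encoding="utf-8")


# One ID from each listing above.
dev_id = "enwiki:02.005%20Fighter%20Squadron%20%22%C3%8Ele-de-France%22/Bases"
test_id = "enwiki:Aftertaste/Foods%20with%20distinct%20aftertastes/Wine"

print(decode_topic_id(dev_id))
print(decode_topic_id(test_id))
```

If the sampled lines decode without trouble, the offending entry is probably further down in the train/dev files rather than in the first few lines.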

Could you help with this? Thank you!