AdeDZY / SIGIR19-BERT-IR

Repo of code and data for SIGIR-19 short paper "Deeper Text Understanding for IR with Contextual Neural Language Modeling"
BSD 3-Clause "New" or "Revised" License

Number of predict examples is not correct #2

Closed: bayou3 closed this issue 5 years ago

bayou3 commented 5 years ago

Hi, in the inference phase there is a line: predict_examples = processor.get_test_examples(TASK_DATA_DIR). My TASK_DATA_DIR path and my xxx.trec.with_json file are both correct. Inside get_test_examples(self, data_dir), every line is read without error, and the JSON parsing also works on the line "json_dict = json.loads('#'.join(items[1:]))".

But in "examples.append(InputExample(guid=guid, text_a_list=q_text_list, text_b=d, label=label))", examples collects only 1000 entries, although my xxx.trec.with_json has 9000 lines. May I ask why it obtains only 1000 lines?
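(For context, the line format implied by that call can be parsed roughly as sketched below. This is a hedged sketch: the layout of the fields before the first '#' is an assumption, not something documented in the repo.)

```python
import json

# Hedged sketch of the parsing implied by the call quoted above; the exact
# layout of the fields before the first '#' is an assumption.
def parse_trec_with_json_line(line):
    items = line.strip().split('#')
    meta = items[0].split()  # e.g. query id, doc id, rank, ... (assumed)
    # re-join with '#' so a JSON payload that itself contains '#' survives
    json_dict = json.loads('#'.join(items[1:]))
    return meta, json_dict

meta, json_dict = parse_trec_with_json_line(
    '1 Q0 doc42 1 12.3 run # {"doc": {"title": "a", "body": "b"}}')
print(meta, json_dict)
```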

AdeDZY commented 5 years ago

Hi, I guess the problem is related to the processor's max_test_depth. In the data processor, I set:

self.max_test_depth = 100  # for testing, we re-rank the top 100 results

This means that for every query, my model only reads the first 100 documents and re-ranks those.
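(A minimal sketch of this cap, not the repo's exact code; the (query_id, doc_id) input below is hypothetical.)

```python
from collections import defaultdict

# Keep at most max_test_depth candidate documents per query.
max_test_depth = 100  # for testing, re-rank only the top 100 results

# hypothetical input: (query_id, doc_id) pairs in retrieval order
candidates = [(f"q{q}", f"d{i}") for q in range(10) for i in range(900)]

depth = defaultdict(int)
examples = []
for qid, docid in candidates:
    if depth[qid] >= max_test_depth:
        continue  # everything past the cut-off is skipped silently
    depth[qid] += 1
    examples.append((qid, docid))

print(len(examples))  # 10 queries * 100 docs = 1000
```

If the 9000-line file covers 10 queries with about 900 candidates each, this cap would reduce it to exactly 10 × 100 = 1000 examples, which would explain the count observed above.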

bayou3 commented 5 years ago

Thank you for pointing that out. Every line of the prediction output file has two values, such as "0.9953881 0.0046119345". I saw you provide a script, bert_doc_result_to_trec.py, for aligning the output with the document/passage ids. In this script, "float(line.split('\t')[1])" is used to extract the second value of an output line. May I ask what the two values are and why the second one is used?

AdeDZY commented 5 years ago

Hi, the two values are the probability of being irrelevant (class=0) and the probability of being relevant (class=1). I use the second one as the final estimated relevance score.
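(The conversion step can be sketched roughly as below. This is a hedged sketch, not the actual bert_doc_result_to_trec.py: the id-file format and its one-to-one alignment with the prediction file are assumptions.)

```python
from collections import defaultdict

# Take the second column, P(relevant), as the ranking score and
# write a TREC-format run file.
def to_trec_run(pred_path, id_path, out_path, run_name="bert"):
    by_query = defaultdict(list)
    with open(pred_path) as preds, open(id_path) as ids:
        for pred_line, id_line in zip(preds, ids):
            score = float(pred_line.split('\t')[1])  # P(class=1): relevant
            qid, docid = id_line.split()[:2]
            by_query[qid].append((docid, score))
    with open(out_path, "w") as out:
        for qid, docs in by_query.items():
            docs.sort(key=lambda d: -d[1])  # descending relevance
            for rank, (docid, score) in enumerate(docs, 1):
                out.write(f"{qid} Q0 {docid} {rank} {score} {run_name}\n")
```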

bayou3 commented 5 years ago

Very clear! Thank you so much!