AdeDZY / DeepCT

DeepCT and HDCT use BERT to generate novel, context-aware bag-of-words term weights for documents and queries.
BSD 3-Clause "New" or "Revised" License
312 stars 46 forks

Why is the text of a passage from MS MARCO called `title`, and what does the `position` field of a doc mean? #13

Open KkEeVvIiNnn opened 3 years ago

KkEeVvIiNnn commented 3 years ago

Also, why does the DeepCT training data need a `query` field when it is not used in run_deepct.py? Aren't the doc text and the `term_recall` dict already enough?

AdeDZY commented 3 years ago

Sorry about the confusing field names!

"query" is not used, and "term_recall" is sufficient.

I put the document text into "title" because of a legacy issue from experiments with other datasets.
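
For anyone reading the data with their own script, here is a minimal sketch of how one training record could be parsed given the answer above. The sample record, the nesting of "title" and "position" under a "doc" object, and the helper name `parse_record` are assumptions made for illustration, not taken from run_deepct.py. Only the field names come from this thread.

```python
import json

# Hypothetical training record. The nesting under "doc" is an assumption;
# only the field names ("query", "term_recall", "title", "position") come
# from this thread.
SAMPLE_LINE = json.dumps({
    "query": "example query",  # present in the data, but not needed for training
    "term_recall": {"grassland": 1.0, "animal": 0.5},
    "doc": {"position": "1", "id": "0", "title": "Grassland animals include zebras."},
})


def parse_record(line):
    """Return (doc_text, term_recall) from one JSON line, ignoring "query"."""
    record = json.loads(line)
    doc_text = record["doc"]["title"]    # the passage text is stored under "title"
    term_recall = record["term_recall"]  # per-term target weights
    return doc_text, term_recall


doc_text, term_recall = parse_record(SAMPLE_LINE)
print(doc_text)
print(term_recall)
```

Because "query" is never read, a record that leaves it out parses the same way in this sketch.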