khui / copacrr

The code for COPACRR Neural IR model.
Apache License 2.0

Context vectors #9

Closed by alhassan-fadel 6 years ago

alhassan-fadel commented 6 years ago

Hi all, could you please share the context vector data so that I can run the model with the context option?

Additionally, your papers don't mention how you constructed the IDF vectors for each query. I think it is very important for anyone who wants to build on this model to know where or how you got them.

Thanks,

AbhinavMadahar commented 6 years ago

It's been a while since I worked on this project, so I might be mistaken, but I'm pretty sure that the context vectors are generated by the model based on the input data, so you don't need to download them separately.

alhassan-fadel commented 6 years ago

Thanks for your reply. When I run the training script with the context option set to true, I get this error:

    File "utils/utils.py", line 470, in load_train_data_generator
      dim_sim=SIM_DIM, max_query_term=MAX_QUERY_LENGTH, n_grams=mat_ngrams, context=CONTEXT)
    File "utils/utils.py", line 151, in convert_cwid_udim_simmat
      pickle.load(open('%s/%s.p' % (contextdir, qid), 'rb')).items()}
    IOError: [Errno 2] No such file or directory: '...../PACRR/cosine/context/1.p'
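The traceback shows the loader opening one pickle per query at '%s/%s.p' % (contextdir, qid). A minimal sketch of a pre-flight check that lists the query ids with no such file before training starts might look like the following. The function name `missing_context_files` is hypothetical and not part of the repository; only the path pattern is taken from the traceback.

```python
import os


def missing_context_files(contextdir, qids):
    """Return the qids that have no '<contextdir>/<qid>.p' file.

    Hypothetical helper: it mirrors only the path pattern visible in the
    traceback and does not inspect the pickle contents.
    """
    return [qid for qid in qids
            if not os.path.isfile('%s/%s.p' % (contextdir, qid))]
```

Calling it with the context directory and the training query ids returns an empty list when every expected file is present, which would rule out this particular IOError.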

andrewyates commented 6 years ago

The IDF vectors are vectors of size |q| containing the IDFs of q's terms as calculated by Terrier over the document collection, i.e., <IDF(q_1), IDF(q_2), ..., IDF(q_|q|)>
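As an illustration of that vector layout, here is a rough sketch that builds one IDF value per query term, in query order. It assumes a plain log(N / df) formula and a precomputed document-frequency dictionary; Terrier's own IDF weighting may differ, and the names `idf_vector`, `doc_freq`, and `num_docs` are made up for this example.

```python
import math


def idf_vector(query_terms, doc_freq, num_docs):
    """Return [IDF(q_1), ..., IDF(q_|q|)] for the query terms, in order.

    Assumption: IDF(t) = log(N / df(t)). This is a textbook formula, not
    necessarily the one Terrier uses. Terms that never occur in the
    collection (df == 0) get 0.0, which is an arbitrary choice.
    """
    vec = []
    for term in query_terms:
        df = doc_freq.get(term, 0)
        vec.append(math.log(float(num_docs) / df) if df > 0 else 0.0)
    return vec
```

For a three-term query the result has three entries, so its length always matches |q|; any padding to a fixed maximum query length would be a separate step.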

You can find the context vectors here. I don't think this pipeline can generate them right now.

alhassan-fadel commented 6 years ago

Thanks so much, @andrewyates. That was very helpful.