nianlonggu / Local-Citation-Recommendation

Code for ECIR 2022 paper Local Citation Recommendation with Hierarchical-Attention Text Encoder and SciBERT-based Reranking

Create new model file #2

Closed: dinhngocthi closed this issue 1 year ago

dinhngocthi commented 2 years ago

① If I want to use a papers_others.json file that contains different data from the existing papers.json file, do I have to run train.py to create a new model_batch_91170.pt file?

② The training.config file contains the line "train_corpus_path":"../../data/acl/train_with_prefetched_ids.json". Is train_with_prefetched_ids.json the same as the train.json file?

nianlonggu commented 1 year ago

Hi,

For 1), yes.
For 2), ../../data/acl/train_with_prefetched_ids.json is the file we get by running get_prefetched_ids.py with ../../data/acl/train.json as input to obtain the prefetched ids. It is derived from train.json, but it is not the same file.

I have updated and cleaned up the code, especially the prefetching part. You can run the whole pipeline on Google Colab: https://colab.research.google.com/github/nianlonggu/Local-Citation-Recommendation/blob/main/Turorial_Local_Citation_Recommendation.ipynb
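
Before starting a training run on new data, it can help to confirm that the "train_corpus_path" entry in training.config points to a file that actually exists, since that file is only created after the prefetching step. The following is a minimal sketch and not part of the repository: the helper names resolve_train_corpus and check_train_corpus are hypothetical, and it assumes that training.config is plain JSON and that relative paths are interpreted relative to the directory containing the config file.

```python
import json
from pathlib import Path


def resolve_train_corpus(config_path):
    """Return the absolute path that "train_corpus_path" points to.

    Assumptions (for illustration only, not confirmed behaviour of train.py):
    the config file is plain JSON, and a relative path is resolved against
    the directory that contains the config file.
    """
    config_path = Path(config_path)
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)

    corpus_path = Path(config["train_corpus_path"])
    if not corpus_path.is_absolute():
        corpus_path = (config_path.parent / corpus_path).resolve()
    return corpus_path


def check_train_corpus(config_path):
    """Return (resolved_path, exists) for the configured training corpus."""
    corpus_path = resolve_train_corpus(config_path)
    return corpus_path, corpus_path.is_file()
```

Calling check_train_corpus("training.config") returns the resolved path and a boolean. If the boolean is False, the prefetched-ids file has most likely not been generated yet, so the prefetching step needs to be run on train.json first.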