malteos / scincl

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)
http://arxiv.org/abs/2202.06671
MIT License

training data w/o leakage #2

Closed by ronaldseoh 2 years ago

ronaldseoh commented 2 years ago

Hi Malte,

Thank you so much for making all your experimental code and data files publicly available!

Just to confirm, are the data files on the releases page the replicated SPECTER training data with leakage? As long as it's not too much hassle for you, could you please upload the one without leakage as well? Thank you in advance!!

malteos commented 2 years ago

Thanks for asking. I've added another release with the training data without leakage: https://github.com/malteos/scincl/releases/tag/0.1-wol