deepset-ai / COVID-QA

API & Webapp to answer questions about COVID-19. Using NLP (Question Answering) and trusted data sources.
Apache License 2.0

Annotation methodology of QA resources #103

Open: lintool opened this issue 4 years ago

lintool commented 4 years ago

Hi there, thanks for sharing your QA resource! https://github.com/deepset-ai/COVID-QA/tree/master/data/question-answering

I was wondering if you have a write-up of the annotation methodology? For example, how were the documents selected, how were the questions generated, guidelines for marking the extent of the spans, etc.

Thanks in advance!

Timoeller commented 4 years ago

Hey @lintool, thanks for looking into the annotations we open-sourced. We really liked your work on BERTserini and how Dirk used OSS frameworks for a CORD-19 semantic search. We are also currently working on better retrievers in our semantic search framework, Haystack.

About your question:

Can we somehow assist you in using these labels?

lintool commented 4 years ago

Hi @Timoeller - Thanks for your response. We've also been working on building test collections, but via a slightly different approach: https://arxiv.org/abs/2004.11339

I was wondering if you'd be interested in more closely coordinating efforts? If so, let's connect directly over email?

tonyreina commented 4 years ago

Yes. We'd love to coordinate our efforts. Please reach out directly to either me (Tony) or Timo. Thanks so much.

lintool commented 4 years ago

What's your email? Or you can find mine on my website: https://cs.uwaterloo.ca/~jimmylin/index.html