Open DanqingZ opened 3 years ago
Thanks for asking!
You should add an `if` statement, `if not tmp_impossible:`, before this line:
https://github.com/ShannonAI/mrc-for-flat-nested-ner/blob/master/data_preprocess/generate_mrc_dataset.py#L91
After that, run `script/data/gen_mrc_ner_datasets.sh`, and only (context, question, answer) pairs with entities in the context will be saved.
When `tmp_impossible` is `True`, the context contains no entities of the type the query asks about.
Otherwise, at least one entity of the queried entity type exists in the context.
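As a rough illustration of what that check does, here is a minimal standalone sketch. It is not the code from `generate_mrc_dataset.py`: the function name `keep_answerable` and the sample fields (`context`, `entity_label`, `impossible`) are assumptions made up for this example, so adapt them to whatever your generated samples actually contain.

```python
from typing import Any, Dict, List


def keep_answerable(samples: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop samples flagged as impossible (no entity of the queried type).

    Hypothetical helper: the `impossible` key is an assumed field name
    standing in for the script's `tmp_impossible` flag.
    """
    return [sample for sample in samples if not sample["impossible"]]


# Two queries over the same context: one entity type is present, one is not.
samples = [
    {"context": "EU rejects German call", "entity_label": "ORG", "impossible": False},
    {"context": "EU rejects German call", "entity_label": "PER", "impossible": True},
]

filtered = keep_answerable(samples)
print([sample["entity_label"] for sample in filtered])
```

Only the sample whose query has at least one matching entity survives the filter, which is the same effect the `if not tmp_impossible:` guard has when the dataset is written out.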
I hope this clarifies your question.
Thank you for answering my question. I am actually confused about the following:
Hi, I am looking into the data (for example, conll2003). Although we can generate a different (context, query) pair for each entity type, for each context we have to generate a (context, question, answer) triple for all entity types. If I have partially annotated training data, can I generate (context, question, answer) triples only when the entity is present in the context?
Thank you!