Open jw-ut opened 1 year ago
Yes! So this should be added to the guidelines:
- Ignoring a document is normally reserved for documents that are problematic to annotate (no selectable text, systematically wrong sentences, not English, corrupted PDF). In those cases it is better to reject the document entirely.
- If a document has no software mentions, it should be validated as such. Validating here means the document can be annotated, is not corrupted, etc. It is also valuable to have examples of articles without mentions ("negative examples" for the machine learning model).
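
To make the rule above concrete, here is a minimal sketch of it as a decision function. It is only an illustration: the `Document` fields, the `Action` names, and `decide` are hypothetical and do not correspond to anything in the annotation tool itself.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    IGNORE = "ignore"      # reject the whole document
    VALIDATE = "validate"  # keep it, with or without mentions


@dataclass
class Document:
    # All fields are hypothetical, for illustration only.
    has_selectable_text: bool = True
    sentences_ok: bool = True
    is_english: bool = True
    is_corrupted: bool = False
    software_mentions: int = 0


def decide(doc: Document) -> Action:
    """Return the action suggested by the guideline discussed above."""
    annotatable = (
        doc.has_selectable_text
        and doc.sentences_ok
        and doc.is_english
        and not doc.is_corrupted
    )
    if not annotatable:
        # Problematic to annotate: ignore (reject) the document.
        return Action.IGNORE
    # The mention count plays no role: a clean document with zero
    # mentions is validated and serves as a negative example.
    return Action.VALIDATE
```

Note that `software_mentions` is never read by `decide`. That is deliberate: under this guideline, only whether the document can be annotated determines the outcome.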
So, apparently: do not just "ignore doc" when there are no software mentions; validate the doc even in that case. Please let us know if there are any further instructions, thanks!