OHNLP / MedTator

A Serverless Text Annotation Tool for Corpus Development
https://ohnlp.github.io/MedTator/
Apache License 2.0

On the maximum length of "_ VALUE_TYPE _ " #4

Open glacierck opened 2 years ago

glacierck commented 2 years ago

I have a "country" attribute with about 200 enumeration values, which makes the DTD difficult to read. Is there a good solution for this kind of attribute type?

hehuan2112 commented 2 years ago

Thank you for using MedTator!
Annotation schema design is always challenging, and attribute design is difficult as well.

The first idea is to use a good editor and enable word wrap to get rid of the horizontal scroll bar. :p Beyond that, the annotation guideline could be updated to reduce the burden on annotators. I guess it would be very difficult for annotators to pick exactly one value from a huge list, which can easily lead to annotation errors. So the huge attribute could be split into several sub-attributes.

For example, the country attribute can be split by region into Asia, Americas, Africa, etc.
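
As a rough illustration of that split, here is a minimal Python sketch that generates one enumerated attribute per region instead of a single 200-value list. The element name `Location`, the `country_<region>` attribute names, the `NA` placeholder value, the sample countries, and the exact `<!ATTLIST ...>` layout are all assumptions made for the example; check the generated lines against the MedTator schema documentation before using them.

```python
# Hypothetical region -> countries mapping. A real schema would hold all
# ~200 values; single-word names are used here to keep the values simple.
COUNTRIES_BY_REGION = {
    "Asia": ["China", "India", "Japan"],
    "Americas": ["Brazil", "Canada", "Mexico"],
    "Africa": ["Egypt", "Kenya", "Nigeria"],
}


def attlist_line(element, attribute, values):
    """Render one DTD-style enumerated attribute declaration (assumed layout)."""
    choices = " | ".join(values)
    return f"<!ATTLIST {element} {attribute} ( {choices} ) #IMPLIED >"


def split_country_attribute(element, countries_by_region):
    """Return one attribute declaration per region."""
    lines = []
    for region, countries in countries_by_region.items():
        attribute = f"country_{region.lower()}"
        # "NA" is a made-up value meaning "the country is not in this region".
        lines.append(attlist_line(element, attribute, ["NA"] + sorted(countries)))
    return lines


if __name__ == "__main__":
    for line in split_country_attribute("Location", COUNTRIES_BY_REGION):
        print(line)
```

Keeping the country list in a script like this also means the DTD never has to be edited by hand: the long enumerations live in a data structure that is easy to read, and the schema lines are regenerated whenever the list changes.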

In addition, we are also considering a new format for the schema file, such as YAML or JSON, since they are easier to edit and use.