Closed by obradovicma 6 years ago
XMI CAS is the most common serialization of the UIMA CAS data structure. Together with a type system specification, it is capable of (de)serializing all data within the UIMA CAS data structure. There are usually only two reasons not to use it:
1) if a binary format is better suited, for efficiency or other technical reasons
2) if the data to be annotated contains characters that cannot be encoded in XML files
In all other cases, XMI CAS is probably the best choice.
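To check whether the second reason applies to your data, something like the following might help. It is a minimal sketch that scans a text for characters outside the XML 1.0 `Char` production, using only the Python standard library. The function name `find_xml_illegal_chars` is made up for this illustration and is not part of PyCAS or UIMA.

```python
import re

# Characters permitted by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated class below matches everything else (most control
# characters, lone surrogates, U+FFFE and U+FFFF).
_XML10_ILLEGAL = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_xml_illegal_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 cannot represent.

    Hypothetical helper for illustration only.
    """
    return [(m.start(), ord(m.group())) for m in _XML10_ILLEGAL.finditer(text)]


if __name__ == "__main__":
    # A form feed (U+000C) and a NUL (U+0000) are both rejected by XML 1.0.
    for offset, code_point in find_xml_illegal_chars("ok\x0cpage\x00"):
        print(f"offset {offset}: U+{code_point:04X}")
```

If the list comes back empty for all of your documents, the character restriction should not be a problem for you.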
Does that answer your question?
Thank you very much! I agree with you, and I'm seriously thinking of switching to UIMA XMI. To what extent is your code already usable for a development project such as ours?
@obradovicma We currently use PyCAS to connect a Python-based neural-network classifier with the INCEpTION annotation tool. INCEpTION is a web-based annotation tool written in Java that can show annotation suggestions to the annotator; on that side we use the Java XMI CAS implementation provided by the Apache UIMA project. The Python-based classifier produces the annotation suggestions: the Python service receives CAS XMI data from INCEpTION, adds new annotations, and then sends the CAS XMI data back. For us, that's working OK. You'd have to try it yourself to see whether it works for you too.
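To make the receive / annotate / send-back round trip concrete, here is a rough sketch of the idea. It deliberately does not use the PyCAS API. It manipulates the XMI as plain XML with the standard library, so it only illustrates what happens to the data. The `custom:Suggestion` type, its `label` feature, the `http:///custom.ecore` namespace, and the `add_suggestion` function are all assumptions made up for this example. A real type would have to be declared in your type system, and a real service would go through a CAS library rather than raw XML.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/XMI"
CAS_NS = "http:///uima/cas.ecore"
# Hypothetical namespace for an illustrative custom annotation type.
CUSTOM_NS = "http:///custom.ecore"

# Keep readable prefixes when the tree is serialized again.
for prefix, uri in (("xmi", XMI_NS), ("cas", CAS_NS), ("custom", CUSTOM_NS)):
    ET.register_namespace(prefix, uri)

XMI_ID = f"{{{XMI_NS}}}id"

# Tiny hand-written XMI CAS standing in for what the annotation tool sends.
SAMPLE_XMI = (
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:cas="http:///uima/cas.ecore" xmi:version="2.0">'
    '<cas:NULL xmi:id="0"/>'
    '<cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" '
    'mimeType="text" sofaString="Joe lives in Berlin."/>'
    '<cas:View sofa="1" members=""/>'
    "</xmi:XMI>"
)


def add_suggestion(xmi_text, begin, end, label):
    """Parse XMI, append one (hypothetical) annotation, return the new XMI."""
    root = ET.fromstring(xmi_text)
    sofa = root.find(f"{{{CAS_NS}}}Sofa")
    view = root.find(f"{{{CAS_NS}}}View")

    # Pick an xmi:id that is not used yet (elements without an id count as 0).
    next_id = max(int(el.get(XMI_ID, 0)) for el in root.iter()) + 1

    ET.SubElement(root, f"{{{CUSTOM_NS}}}Suggestion", {
        XMI_ID: str(next_id),
        "sofa": sofa.get(XMI_ID),
        "begin": str(begin),
        "end": str(end),
        "label": label,
    })

    # Register the new annotation as a member of the view.
    members = view.get("members", "").split()
    view.set("members", " ".join(members + [str(next_id)]))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Offsets 13..19 cover "Berlin" in the sample sofa string.
    print(add_suggestion(SAMPLE_XMI, 13, 19, "LOC"))
```

The string returned by `add_suggestion` is what the service would send back to the annotation tool in this simplified picture.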
@Rentier @mromanello anything to add here?
Sounds very cool. Actually, we are doing something very similar in our NLP project. We will definitely try it out using PyCAS. INCEpTION could be very interesting for our project too...we will check it out! Thanks for your help!
@obradovicma Great :) If you find any bugs in PyCAS, please tell us. And if you make any fixes or improvements to the code, we would be very happy if you would contribute them to the project.
Hi, to what extent is your code already usable for an NLP project? We are currently stuck with UIMA JSON because we thought there was a UIMA JSON reader for automatically re-importing pre-annotated documents into WebAnno. Now we are thinking of switching to XMI. Do you see any disadvantage in switching from JSON to XMI? Thanks in advance!