Closed: pascal-vaillant closed this issue 2 years ago
INCEpTION does not support all of UIMA's types; FSList, for example, is not supported. Probably the whole type is ignored if a particular feature is not supported. Could you please remove the FSList feature and try again?
Thanks Richard. I have tried replacing FSList with FSArray (since I noticed that FSArray is used by INCEpTION's embedded types to build a list of SemArgs), but the result is the same. Pascal
When I set the log level for de.tudarmstadt.ukp.clarin.webanno.api.annotation.util.TypeSystemAnalysis to TRACE, I can see this in the log:
TypeSystemAnalysis - Analyzing [org.apache.ctakes.typesystem.type.syntax.Lemma]
TypeSystemAnalysis - [org.apache.ctakes.typesystem.type.syntax.Lemma] is not an annotation type. Skipping.
TypeSystemAnalysis - Analyzing [org.apache.ctakes.typesystem.type.syntax.BaseToken]
TypeSystemAnalysis - Unable to determine layer type for [org.apache.ctakes.typesystem.type.syntax.BaseToken]
TypeSystemAnalysis - Recognized 0 of 2 types as layers (0%)
When I remove the FSList feature from org.apache.ctakes.typesystem.type.syntax.BaseToken, it is recognized.
When I change the super-type of the Lemma type to uima.tcas.Annotation, it is also recognized.
Note that INCEpTION only makes use of tokens of the type de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token. If no tokens of that type exist in the CAS, INCEpTION creates them. It also creates de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence annotations if none exist.
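The first workaround described above (dropping the list-valued feature so that the type is recognized) can be scripted. The following is only a minimal sketch using Python's standard library: the `SAMPLE` descriptor is a hypothetical stand-in loosely modeled on the type named in this thread, not the file attached to the issue, and the function name `strip_unsupported_features` is an illustrative choice.

```python
import xml.etree.ElementTree as ET

# Namespace used by UIMA type system descriptors.
NS = "http://uima.apache.org/resourceSpecifier"
# Feature ranges reported in this thread as not generically supported.
UNSUPPORTED_RANGES = {"uima.cas.FSList", "uima.cas.FSArray"}

# Hypothetical descriptor for illustration only (not the attached file).
SAMPLE = """<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>org.apache.ctakes.typesystem.type.syntax.BaseToken</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription>
          <name>partOfSpeech</name>
          <rangeTypeName>uima.cas.String</rangeTypeName>
        </featureDescription>
        <featureDescription>
          <name>lemmaEntries</name>
          <rangeTypeName>uima.cas.FSList</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>
"""


def q(tag):
    """Qualify a tag name with the UIMA descriptor namespace."""
    return f"{{{NS}}}{tag}"


def strip_unsupported_features(xml_text):
    """Return (new_xml, removed), where removed lists the dropped 'Type.feature' names."""
    ET.register_namespace("", NS)  # keep the default namespace on output
    root = ET.fromstring(xml_text)
    removed = []
    for type_desc in root.iter(q("typeDescription")):
        type_name = type_desc.findtext(q("name"))
        features = type_desc.find(q("features"))
        if features is None:
            continue
        # Iterate over a copy because we remove children while looping.
        for feat in list(features):
            if feat.findtext(q("rangeTypeName")) in UNSUPPORTED_RANGES:
                removed.append(f"{type_name}.{feat.findtext(q('name'))}")
                features.remove(feat)
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    new_xml, removed = strip_unsupported_features(SAMPLE)
    print("Removed features:", removed)
```

Note that this only rewrites the descriptor. XMI files that still carry values for the removed feature would need a matching transformation before they can be loaded against the reduced type system.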
Hi,
Yes. The point is to be able to see "tokens" and "sentences" prefixed with org.apache.ctakes.typesystem.type, not to replace those prefixed with de.tudarmstadt.ukp.dkpro.core.api.*.type. Apache cTAKES generates XMI CAS files that include annotations, but also objects that do not correspond to spans of the source text ("lemmas" are of that kind, as are medical ontology concepts, for example). Annotations may then refer to those objects.
I will try to fiddle with the cTAKES type system to see how far I can twist it to import it into INCEpTION, and if I manage it I will post it here in case it is of use to other users. But perhaps this is not the right way to go; perhaps it is better to write a script that transforms "cTAKES" tokens and sentences into "INCEpTION" tokens and sentences?
In the meantime, I still do not understand why I cannot import the minimal type system XML file attached here (with "FSList" replaced by "FSArray").
Thanks a lot for your help anyway!
Pascal
FSArray and FSList are not generically supported by INCEpTION.
If an annotation type meets the INCEpTION conventions for a "Relation" layer, it is recognized as such. Otherwise, types are considered to be "Span" layers and must inherit from the UIMA Annotation type.
The primitive UIMA types boolean, integer, float and string are supported as features.
If a type follows the conventions for a "Link feature", it is recognized as such.
When #2862 is done, StringArray will be supported as well.
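The rules above can be turned into a rough pre-import check for a type system descriptor. This is a heuristic sketch and not INCEpTION's actual analysis logic: it only flags types that do not reach uima.tcas.Annotation through supertypes defined in the same descriptor, and features whose range is FSList or FSArray. It does not check the "Relation" or "Link feature" conventions, and the `SAMPLE` descriptor and the name `check_descriptor` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"u": "http://uima.apache.org/resourceSpecifier"}
ANNOTATION = "uima.tcas.Annotation"
LIST_RANGES = {"uima.cas.FSList", "uima.cas.FSArray"}

# Hypothetical descriptor for illustration only.
SAMPLE = """<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>org.apache.ctakes.typesystem.type.syntax.Lemma</name>
      <supertypeName>uima.cas.TOP</supertypeName>
    </typeDescription>
    <typeDescription>
      <name>org.apache.ctakes.typesystem.type.syntax.BaseToken</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription>
          <name>lemmaEntries</name>
          <rangeTypeName>uima.cas.FSArray</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>
"""


def check_descriptor(xml_text):
    """Map each problematic type name to a list of human-readable issues."""
    root = ET.fromstring(xml_text)

    # Collect supertype and feature ranges for every declared type.
    types = {}
    for td in root.iterfind(".//u:typeDescription", NS):
        name = td.findtext("u:name", namespaces=NS)
        supertype = td.findtext("u:supertypeName", namespaces=NS)
        ranges = {
            f.findtext("u:name", namespaces=NS): f.findtext("u:rangeTypeName", namespaces=NS)
            for f in td.iterfind("u:features/u:featureDescription", NS)
        }
        types[name] = (supertype, ranges)

    def is_annotation(name, seen=()):
        # Walk the supertype chain; types defined elsewhere are not resolved.
        if name == ANNOTATION:
            return True
        if name not in types or name in seen:
            return False
        return is_annotation(types[name][0], seen + (name,))

    problems = {}
    for name, (_, ranges) in types.items():
        issues = []
        if not is_annotation(name):
            issues.append("does not inherit from uima.tcas.Annotation")
        for feat, rng in ranges.items():
            if rng in LIST_RANGES:
                issues.append(f"feature '{feat}' has range {rng}")
        if issues:
            problems[name] = issues
    return problems


if __name__ == "__main__":
    for type_name, issues in check_descriptor(SAMPLE).items():
        print(type_name)
        for issue in issues:
            print("  -", issue)
```

Resolving the supertype chain only within the file is a deliberate simplification: a type whose parent is declared in an imported descriptor would be reported as a false positive.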
You might find cassis interesting if you want to transform cTAKES data into something more suitable for INCEpTION. It is a convenient Python library for working with XMI files.
Thanks for the tip! Pascal
Describe the bug Hello INCEpTION team! I am trying to import annotations generated by the Apache cTAKES "clinical pipeline" system (an annotation platform for biomedical texts in English) into INCEpTION, so that I can view them as a set of annotation layers. Apache cTAKES is built on UIMA and uses a UIMA CAS XML type system definition. However, I can only partly import the XML type system description of cTAKES. I include (below) a very minimal example of an annotation type that does not show up in INCEpTION.
To Reproduce Steps to reproduce the behavior:
Expected behavior 'BaseToken' should appear as a new layer in the list of layers. However, it does not.
Please complete the following information:
XML source of the minimal example