inception-project / inception

INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.
https://inception-project.github.io
Apache License 2.0

Upload UIMA-CAS split to lines #3243

Closed by yanirmr 2 years ago

yanirmr commented 2 years ago

Describe the bug

When a UIMA CAS file is uploaded, its text is not split into lines at newline characters as expected.

To Reproduce

  1. Upload a UIMA-CAS file with a newline character ("\n") via import documents
  2. Open the file in annotation mode
  3. Look at the line counter

Expected behavior

A new line should start after each newline character.

Additional context

  1. In plain text, newline characters behave as expected.
  2. To work around this issue, I tried splitting the lines into different sofas (via different views). I'm not sure whether this is a good idea, but the views do not appear in INCEpTION, only the original CAS does.

Your support is greatly appreciated, thank you!

reckart commented 2 years ago

INCEpTION does not support multiple views.

If you want to pre-split sentences, add annotations of type de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence to your CAS.
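
As a rough illustration of the suggestion above, the following Python sketch computes one (begin, end) character span per non-blank line of a document text. These spans are the offsets one would give to the Sentence annotations before import. The helper name line_spans and the sample text are hypothetical and not part of INCEpTION or UIMA. Creating the actual annotations is left to whichever UIMA library is in use (for example dkpro-cassis in Python). The offsets here are Python string indices, which is an assumption that holds for UTF-16 based tools only when the text has no characters outside the Basic Multilingual Plane.

```python
import re

# Type name taken from the thread; used here only as a label.
SENTENCE_TYPE = "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence"


def line_spans(text):
    """Return (begin, end) offsets for each non-blank line of `text`.

    Hypothetical helper: offsets exclude the line terminator and any
    leading or trailing whitespace, so each span covers only the visible
    content of the line.
    """
    spans = []
    # Each match is a maximal run of characters without a line break.
    for match in re.finditer(r"[^\r\n]+", text):
        line = match.group()
        stripped = line.strip()
        if not stripped:
            continue  # skip whitespace-only lines
        begin = match.start() + (len(line) - len(line.lstrip()))
        end = begin + len(stripped)
        spans.append((begin, end))
    return spans


if __name__ == "__main__":
    sample = "First line.\nSecond line.\n\n  Third line.  \n"
    for begin, end in line_spans(sample):
        # One Sentence annotation would be created per span.
        print(SENTENCE_TYPE.rsplit(".", 1)[-1], begin, end, repr(sample[begin:end]))
```

Skipping blank lines and trimming whitespace is a design choice in this sketch, made so that no empty or whitespace-only sentence annotations are produced.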

After import, I recommend using the CAS Doctor from the project settings to check the imported files for validity.

reckart commented 2 years ago

Also, if you want to see lines separated by \n, you have to use an editor like brat (line-oriented) or brat (wrap at 120 chars) but not brat (sentence-oriented).

yanirmr commented 2 years ago

'de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence' works for me. Thank you!