KorAP / KorAP-XML-TEI

Conversion of TEI P5 based formats to KorAP-XML
BSD 2-Clause "Simplified" License

tei2korapxml fails for German ICC contribution #4

Closed: kupietz closed this issue 1 year ago

kupietz commented 1 year ago

Error message:

Maybe empty textSigle => skipping this text ...
data=

kupietz commented 1 year ago

Maybe also allow IDs/sigles to be given as id attributes, and provide a parameter that takes a regular expression for splitting such an ID into its corpus/doc/text parts?
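
The splitting idea could look roughly like the following. This is only a minimal sketch in Python, not the tool's actual implementation: the pattern, the function name split_sigle, and the sample id "ICC_GER.00001" are all hypothetical, and the real id format of the ICC files may differ.

```python
import re

# Hypothetical default pattern: "<corpus>_<doc>.<text>".
# The real ICC id format is an assumption here, not taken from the sample file.
SIGLE_PATTERN = re.compile(r"^(?P<corpus>[^_]+)_(?P<doc>[^.]+)\.(?P<text>.+)$")


def split_sigle(id_value, pattern=SIGLE_PATTERN):
    """Split an id attribute value into a 'corpus/doc/text' sigle.

    Returns None if the id does not match the pattern, so the caller
    can skip the text instead of writing an empty textSigle.
    """
    match = pattern.match(id_value)
    if match is None:
        return None
    # Join the three named groups into the slash-separated sigle form.
    return "/".join(match.group("corpus", "doc", "text"))


if __name__ == "__main__":
    print(split_sigle("ICC_GER.00001"))
    print(split_sigle("no-sigle-here"))
```

Passing the pattern as a parameter keeps the id convention out of the converter itself, so other corpora with different id schemes could supply their own expression.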

Akron commented 1 year ago

Can you provide an example file as part of the test suite?

kupietz commented 1 year ago

https://korap.ids-mannheim.de/gerrit/c/KorAP/KorAP-XML-TEI/+/6473/1/t/data/icc_german_sample.p5.xml

I have already prepared a fix.