forTEXT / catma

Computer Assisted Text Markup and Analysis
https://www.catma.de
GNU General Public License v3.0

Tika by default cuts off after 100k chars #208

Closed: mpetris closed this issue 4 years ago

mpetris commented 4 years ago

Tika needs to be configured so that parseToString returns the whole text instead of stopping at its default limit of 100k characters.
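A minimal sketch of one way to lift the limit, assuming the text is extracted through the `org.apache.tika.Tika` facade. The facade's `setMaxStringLength` accepts `-1` to disable the cap. The class name `FullTextExtraction` and the helper `extractAll` are hypothetical and not taken from the CATMA code base, and how CATMA actually wires up Tika is not shown in this issue.

```java
import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class FullTextExtraction {

    // Hypothetical helper (an assumption for illustration, not CATMA code).
    static String extractAll(InputStream in) throws IOException, TikaException {
        Tika tika = new Tika();
        // By default the facade stops after 100,000 characters; -1 removes the limit.
        tika.setMaxStringLength(-1);
        return tika.parseToString(in);
    }

    public static void main(String[] args) throws Exception {
        // Build a plain-text input that is longer than the default limit.
        String longText = "lorem ipsum ".repeat(20_000);
        try (InputStream in = new ByteArrayInputStream(
                longText.getBytes(StandardCharsets.UTF_8))) {
            String extracted = extractAll(in);
            System.out.println("extracted length: " + extracted.length());
        }
    }
}
```

Removing the limit means the whole document text is held in memory as one string, so a large but finite value passed to `setMaxStringLength` may be the safer choice when very large uploads are possible.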