KevM / tikaondotnet

Use the Java Tika text extraction library on the .NET platform
http://kevm.github.io/tikaondotnet/
Apache License 2.0
195 stars, 73 forks

Tika not extracting table with Content control fields from word document #125

Open mistakenjockey opened 5 years ago

mistakenjockey commented 5 years ago

Hi, I have a Word document that contains both normal tables and tables with content controls. Tika extracts the document text and the contents of the normal tables perfectly, but skips the tables that have content controls on them. How can I extract the data from a table with content controls?

"Content controls are individual controls that you can add and customize for use in templates, forms, and documents."

KevM commented 5 years ago

Sorry you are having problems. That part of Tika (Office document extraction) is handled by Apache POI. I'd take a look over there to see whether they support the desired capability.
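
One way to confirm that the skipped tables really are wrapped in content controls, independently of Tika and POI, is to read the raw document XML. The sketch below is only an illustration under stated assumptions and is not part of Tika, TikaOnDotNet, or POI. It uses the Python standard library, the function names `table_rows` and `extract_tables` are hypothetical, and it assumes a .docx file whose body is stored in `word/document.xml` with content controls represented as `w:sdt` elements. It ignores headers and footers and does not treat nested tables specially.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml in a .docx package.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
TBL, TR, TC, T, SDT = (f"{{{W}}}{name}" for name in ("tbl", "tr", "tc", "t", "sdt"))


def table_rows(tbl):
    """Return a table as a list of rows, each row a list of cell strings.

    Descendant search is used so rows or cells wrapped in a content
    control (w:sdt) are still found. Nested tables are not separated out.
    """
    rows = []
    for tr in tbl.iter(TR):
        rows.append(["".join(t.text or "" for t in tc.iter(T)) for tc in tr.iter(TC)])
    return rows


def extract_tables(source):
    """Return [(uses_content_control, rows), ...] for each table in the body.

    `source` is a path or a binary file-like object holding a .docx file.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    # Tables that sit somewhere inside a content control.
    wrapped = {id(tbl) for sdt in root.iter(SDT) for tbl in sdt.iter(TBL)}

    results = []
    for tbl in root.iter(TBL):
        # Flag tables that are wrapped in, or contain, a content control.
        contains_control = next(tbl.iter(SDT), None) is not None
        results.append((id(tbl) in wrapped or contains_control, table_rows(tbl)))
    return results
```

Calling `extract_tables` with the path to the document returns one pair per table, so the tables flagged `True` can be compared against the text Tika returned.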

mistakenjockey commented 5 years ago

Thanks for the reply. Please keep me posted if you find anything that can resolve the issue.