KevM / tikaondotnet

Use the Java Tika text extraction library on the .NET platform
http://kevm.github.io/tikaondotnet/
Apache License 2.0

How can I extract only the body text of a Word document (.doc), not the title, author, etc.? #153

Closed arous2005 closed 1 year ago

KevM commented 1 year ago

That sounds like a concern of Tika itself rather than of this wrapper. I'm not sure how to tell that library to extract only the body of a Word doc. I think Tika uses Apache POI under the hood.
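For anyone landing here later: as far as I know, Tika parsers report their output as XHTML SAX events, with the document text under `<body>`, and Tika ships a `BodyContentHandler` that keeps only that part. Below is a minimal sketch of that filtering idea using only the JDK's SAX parser, not Tika itself. The class name `BodyTextSketch`, the helper `extractBody`, and the sample XHTML are all hypothetical, and this assumes the unwanted title/author text comes from the metadata section rather than from the Word document's own body.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: keep only the text found inside <body>.
class BodyTextSketch {

    /** Collects character data only while the parser is inside <body>. */
    static final class BodyOnlyHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private int bodyDepth = 0; // 0 means we are outside <body>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Start counting at <body>, then track nesting of its children.
            if (bodyDepth > 0 || "body".equals(qName)) {
                bodyDepth++;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (bodyDepth > 0) {
                bodyDepth--;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text under <head> (title, author, ...) is skipped.
            if (bodyDepth > 0) {
                text.append(ch, start, length);
            }
        }

        String text() {
            return text.toString().trim();
        }
    }

    /** Parses an XHTML string and returns only the text inside <body>. */
    static String extractBody(String xhtml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            BodyOnlyHandler handler = new BodyOnlyHandler();
            parser.parse(new InputSource(new StringReader(xhtml)), handler);
            return handler.text();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input, shaped like metadata-in-head, text-in-body XHTML.
        String sample = "<html><head><title>Report</title>"
                + "<meta name=\"Author\" content=\"Jane\"/></head>"
                + "<body><p>First paragraph.</p></body></html>";
        System.out.println(extractBody(sample)); // prints "First paragraph."
    }
}
```

If the title and author are actually typed into the document's first page, they are part of the body as far as any parser is concerned, and a filter like this will not remove them.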