KevM / tikaondotnet

Use the Java Tika text extraction library on the .NET platform
http://kevm.github.io/tikaondotnet/
Apache License 2.0

Error when trying to extract. #100

Closed: ismailmayat closed this issue 2 years ago

ismailmayat commented 7 years ago

When I try to extract text from PDF or Word files (the only formats I have tested so far), I get the following error:

   at javax.xml.transform.FactoryFinder.newInstance(Class , String , ClassLoader , Boolean , Boolean )
   at javax.xml.transform.FactoryFinder.find(Class , String )
   at javax.xml.transform.TransformerFactory.newInstance()
   at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.GetTransformerHandler(Stream outputStream) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\Stream\StreamTextExtractor.cs:line 49
   at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.Extract(Func`2 streamFactory, Stream outputStream) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\Stream\StreamTextExtractor.cs:line 21
   --- End of inner exception stack trace ---
   at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.Extract(Func`2 streamFactory, Stream outputStream) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\Stream\StreamTextExtractor.cs:line 43
   at TikaOnDotNet.TextExtraction.TextExtractor.Extract(Func`2 streamFactory) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 53
   at TikaOnDotNet.TextExtraction.TextExtractor.Extract(String filePath) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 19
   --- End of inner exception stack trace ---
   at TikaOnDotNet.TextExtraction.TextExtractor.Extract(String filePath) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 28

Am I missing something? I installed the package via NuGet.

KevM commented 2 years ago

Let me know if you have a test or a file we can use to reproduce this.