KevM / tikaondotnet

Use the Java Tika text extraction library on the .NET platform
http://kevm.github.io/tikaondotnet/
Apache License 2.0

Failed to extract text from pdf. #136

Closed by 5160jivan 2 years ago

5160jivan commented 5 years ago

I was trying to run a simple project with the package installed from NuGet. My code was:

    TextExtractor textExtractor = new TextExtractor();
    var pdfContents = textExtractor.Extract(@"files\sample.pdf");
    Console.WriteLine(pdfContents.Text);

I get the following exceptions:

    TextExtractionException: Extraction failed.
    MissingMethodException: Method not found: 'Void System.IO.FileStream..ctor(System.String, System.IO.FileMode, System.Security.AccessControl.FileSystemRights, System.IO.FileShare, Int32, System.IO.FileOptions)'.

I looked at the test file, and the test code for PDF files does a similar thing, so I am not sure why it does not work. Thank you so much for your time.

KevM commented 5 years ago

Check out issue #107.

jfarbman commented 4 years ago

Not sure if this has been resolved, but I received this error when I created a .NET Core project instead of a .NET Framework project. The TikaOnDotNet libraries showed an incompatibility warning in the solution. After I created a .NET Framework project instead, the errors went away.
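
For anyone landing here with the same error, the workaround described above amounts to targeting .NET Framework rather than .NET Core. A minimal sketch of an SDK-style project file, assuming .NET Framework 4.8 and the `TikaOnDotNet.TextExtractor` NuGet package ID (both are assumptions, not taken from this thread; adjust them to your setup):

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Assumption: net48 (.NET Framework 4.8). A netcoreapp or net6.0 target
         here is the kind of setup that produced the warning described above. -->
    <TargetFramework>net48</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <!-- Assumption: package ID as published on NuGet. The floating version
         is only a placeholder; pin it to the version you actually use. -->
    <PackageReference Include="TikaOnDotNet.TextExtractor" Version="*" />
  </ItemGroup>

</Project>
```

With a project like this, the three-line `TextExtractor` snippet from the original report should load without the incompatibility warning, though this thread does not confirm which framework versions were tested.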

KevM commented 4 years ago

IKVM, the basis of the library (since this is a Java port), is not .NET Core compatible.

xseine commented 2 years ago

@KevM Would this new version of IKVM work? https://github.com/ikvm-revived/ikvm/releases/tag/8.2.0-prerelease.392

KevM commented 2 years ago

#152 is tracking this.