tabulapdf / tabula-java

Extract tables from PDF files
MIT License
1.77k stars 412 forks source link

update pdfbox to 3.0.1 #541

Open nanfangfanqie opened 4 months ago

jeremybmerrill commented 2 months ago

Hi @nanfangfanqie : Thank you for this PR and I'm sorry that we didn't respond to it earlier. Upgrading to pdfbox 3 seems worthwhile.

However, I'm running into a bizarre problem: when I try to run the tests on your branch (with mvn test)... 0 tests get executed. (You can see this yourself on the "Checks" tab here on Github.) I am sufficiently removed from the Java ecosystem to not immediately know how to fix this -- it obviously works on the current master branch. Do you have any ideas what could be causing this?

nanfangfanqie commented 2 months ago

@jeremybmerrill: Sorry i didn't check it, i fix it in the latest commit, just need exclude junit-jupiter from pdfbox, we can re-run the CI.

jeremybmerrill commented 2 months ago

Hi @nanfangfanqie Thank you for the update. Now the tests run. However, some of them fail due to out-of-memory or heap space, and some of them fail due to an issue with font comparison. Do you see these failures too? If so, are you able to address them?

nanfangfanqie commented 2 months ago

Hi @jeremybmerrill Regarding the two issues you mentioned, I solved them in the latest two commits and running the unit tests locally passed, sorry for not checking before.

jeremybmerrill commented 2 months ago

Yay! Thank you! The tests pass for me. I will take a deeper look soon, but this is looking very good. Thank you again for bearing with my requests and my rusty Java :)

THausherr commented 1 month ago

Commits are easier to evaluate if they haven't any formatting changes in unrelated code.

nanfangfanqie commented 1 month ago

This is my mistake. In fact most of the changes are related to how PDFs are loaded and how fonts are declared.

blacktek commented 1 week ago

Hello, any plan to merge this pull request? I notice pdfbox 3.0.2 was release too.

Tnx!

jeremybmerrill commented 1 week ago

@jazzido What do you think about merging? Tests pass. (And I guess re-uploading to Maven, I think you have the keys to that? I'm sure I don't.)

@blacktek thx for the reminder. Despite the name of the thread, this PR would upgrade us to 3.0.2.