Open tomsutch opened 1 month ago
@tomsutch thanks for reporting this. I can fix it next week.
@tomsutch it took me longer than expected, but I think I was able to solve it.
Hi @tomsutch, just following up: did the last commit solve the issue?
Hi, thanks for looking into this! I can't see a new commit here - please could you point me to it?
Sorry, I realize I never pushed the commit.
I have pushed it now in dev/,
but I see that it fails on Ubuntu, although it worked on Windows when I set UTF-8.
Hi @jazzido,
@tomsutch found this very interesting case that I can't solve "universally".
Do you have any clues?
I added my test to reproduce the error here https://github.com/ropensci/tabulapdf/blob/main/dev/test-special_characters.R
and the file here https://github.com/ropensci/tabulapdf/blob/main/inst/examples/xbar.pdf
Description
Fatal R error when attempting to use extract_text on a PDF that includes $\bar{x}$. There's no error message, R just terminates.
Reproducible example
I have constructed a simple example PDF, attached xbar.pdf, that gives the error. (I made this using Microsoft Word, inserting the $x$ and $\bar{x}$ using the equation editor, then saving to PDF.)
As this crashes R I can't use the reprex package for this, as far as I know...
Note that if I call the tabula.jar bundled with the R package directly from the command line like this:
java -jar C:\Users\<username>\AppData\Local\R\win-library\4.4\tabulapdf\java\tabula.jar xbar.pdf
I get the following output (which is fine for my purposes - I am not particularly concerned about the $\bar{x}$ rendering properly, I just don't want the R session to crash):
Expected result
No fatal error: I would expect any issues with reading/rendering the $\bar{x}$ to result in a fallback like putting in '??' or similar.
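In the meantime, a possible workaround (only a sketch, not a fix for the underlying bug) is to run the extraction in a separate R process, so that a fatal error takes down the child process rather than the interactive session. The example below assumes the callr package is installed. The helper name safe_extract_text and its fallback argument are hypothetical and not part of tabulapdf.

```r
# Sketch of a workaround: isolate extract_text() in a child R process.
# Assumption: the callr package is installed. safe_extract_text() is a
# hypothetical helper, not a tabulapdf function.
safe_extract_text <- function(path, fallback = NA_character_) {
  tryCatch(
    # callr::r() runs the function in a fresh R session and signals an
    # ordinary R error here if that session fails or terminates abnormally.
    callr::r(
      function(p) tabulapdf::extract_text(p),
      args = list(path)
    ),
    error = function(e) {
      warning(
        "extract_text failed in subprocess: ", conditionMessage(e),
        call. = FALSE
      )
      fallback
    }
  )
}

# Usage (xbar.pdf is the example file attached to this issue):
# txt <- safe_extract_text("xbar.pdf")
```

If the child process dies, the caller receives the fallback value and a warning instead of losing the session. The cost is the startup time of a new R process for every call.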
Session info