Hegghammer / daiR

R package for Google Document AI
https://dair.info/

Problem in draw_blocks #6

pauvallprat closed 1 year ago

pauvallprat commented 1 year ago

Hi Thomas, first of all, thanks for the package and for the very detailed vignettes on how to use it. I have encountered a problem when using draw_blocks(). I get the following error:

Error in base64enc::base64decode(page_imgs[i], outconn) : I can only decode base64 strings

Here is the code I use:

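library(googleCloudStorageR)  # gcs_upload(), gcs_get_object()
library(daiR)                 # dai_async_tab(), draw_blocks()

# Note: `bucket` is not defined in this snippet; it must hold the bucket name.
# Upload the PDF to the bucket and send it for asynchronous table processing
resp <- gcs_upload("image1.pdf")
resp1 <- dai_async_tab("image1.pdf", bucket = bucket,
                       dest_folder = "process_json/")

# Download the resulting JSON output to a local file
our_file <- "process_json/image1.pdf-output-page-1-to-1.json"
json_file <- "process_json/image1.json"
gcs_get_object(our_file,
               saveToDisk = json_file,
               overwrite = TRUE)

# This call raises the error quoted above
draw_blocks(json_file, dir = tempdir())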

Do you know what causes this problem?

Thanks in advance!

Hegghammer commented 1 year ago

Apologies for the late response. I suspect this is because json_file does not contain a full path. It works for downloading the file from the bucket, but not when you pass it as a path to draw_blocks(). Try draw_blocks("</FULL/PATH/TO>/image1.json", dir = tempdir()).
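One way to test the path explanation is to resolve the relative path to an absolute one before calling draw_blocks(). The sketch below uses only base R; resolve_json_path() is a hypothetical helper written for illustration and is not part of daiR.

```r
# Hypothetical helper (not part of daiR): turn a possibly relative path into
# an absolute one, and fail early with a clear message if the file is missing.
resolve_json_path <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  normalizePath(path, winslash = "/", mustWork = TRUE)
}

# Usage, assuming the JSON was downloaded as in the original post:
# json_file <- resolve_json_path("process_json/image1.json")
# draw_blocks(json_file, dir = tempdir())
```

If the helper stops with "File not found", the problem is the path or the working directory. If it returns a path and draw_blocks() still fails, the cause lies elsewhere.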

Hegghammer commented 1 year ago

I had a closer look, and the problem was that until recently the function worked only on output from dai_async(), not from dai_async_tab(). However, I have now rewritten the draw_* functions so that they work with output from all processing functions. If you download the development version (devtools::install_github("hegghammer/daiR")) it should work. Just make sure to study the new syntax for the draw_* functions (https://github.com/Hegghammer/daiR/blob/master/R/inspect_output.R). I hope to submit a revised package version with updated documentation to CRAN later this autumn.
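After installing the development version, a quick way to see the new syntax without reading the source file is to list a function's argument names. This is a minimal sketch in base R; fn_args() is a hypothetical helper, not a daiR function, and the install command is left commented out because it needs devtools and network access.

```r
# One-off installation of the development version, as suggested above:
# devtools::install_github("hegghammer/daiR")

# Hypothetical helper (not part of daiR): return the argument names of an
# exported function, or NULL if the package is not installed.
fn_args <- function(pkg, fn) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NULL)
  }
  names(formals(getExportedValue(pkg, fn)))
}

# Lists the current arguments of draw_blocks() if daiR is installed
fn_args("daiR", "draw_blocks")
```

Comparing that list with the arguments used in an older script shows which calls need updating.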

pauvallprat commented 1 year ago

Great, thanks a lot for the answer and for updating the package!!