pridiltal / staplr

PDF Toolkit.
https://pridiltal.github.io/staplr/

split_pdf on Windows PC #14

Open pohndorff opened 6 years ago

pohndorff commented 6 years ago

I have an issue migrating my code from macOS to Windows devices. The following snippet works fine on macOS, but on Windows it hangs after the first iteration:

library(staplr)   # split_pdf()
library(stringr)  # str_extract()
path <- list.files(path = paste0(getwd(),"/raw"),
                   pattern = "*.pdf",
                   recursive = TRUE,
                   full.names =  TRUE)
file <- list.files(path = paste0(getwd(),"/raw"),
                   pattern = "*.pdf",
                   recursive = TRUE)

for(i in 1:length(file)){
  file.path <- str_extract(file[i],".*(?=(KV[0-9]+\\.pdf))")
  dir.create(file.path(paste0(getwd(),"/output/",file.path)), showWarnings = FALSE)
  split_pdf(input_filepath = path[i],
            output_directory = file.path,
            prefix = paste0("p","_"))
}

I installed pdftk on both devices and it works fine on each, so I guess the issue lies with staplr.

Any suggestions to quickly solve the issue?

oganm commented 6 years ago

Probably related to the slash direction in the file path.

pohndorff commented 6 years ago

Yeah, stupid me 🤦‍♂️ Can be closed now.

oganm commented 6 years ago

No, no. It's our fault if R/Unix-style slashes break things on Windows. I only wrote that down as the potential issue so I can come back to it for a fix later.

oganm commented 6 years ago

About the hang: does it also happen on the Mac? See #7 if it does.

About the slash issue: were you actually able to fix it by changing the slash direction? I can't replicate the issue on my Windows machine.

pohndorff commented 6 years ago

The infinite loop only occurred on Windows so far.

Regarding the solution: I used the file.path() function on the path, so it is built correctly on any OS. My misconception was due to lacking documentation. I'd suggest describing the attribute types more precisely. For reference, see Hadley Wickham's book on R packages: http://r-pkgs.had.co.nz/man.html#man-functions
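For readers following along, here is a minimal sketch of the file.path() approach mentioned above. The directory names (`project`, `dir`) are hypothetical placeholders and not taken from the original code; the sketch only illustrates how base R assembles a path from components rather than from hand-pasted separators.

```r
# Hypothetical components; substitute your own directory names.
base_dir <- "project"
sub_dir  <- "dir"

# file.path() joins components with "/" on every platform, Windows included,
# so no separator has to be typed by hand.
out_dir <- file.path(base_dir, "output", sub_dir)
print(out_dir)

# normalizePath() asks for an absolute, canonical form of the path.
# winslash = "/" keeps forward slashes on Windows; mustWork = FALSE avoids an
# error when the path does not exist yet (the input may then come back
# unchanged).
abs_out_dir <- normalizePath(out_dir, winslash = "/", mustWork = FALSE)
print(abs_out_dir)
```

Building the path once and reusing the same variable for both dir.create() and any later call avoids two slightly different spellings of the same directory.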

oganm commented 6 years ago

Could you give me the path that failed? Based on my experimentation on Windows, it works with any combination of \s and /s.

I tried to replicate your code by making some assumptions about your file structure. I placed a pdf at the path raw/dir/KV111.pdf

Based on this structure, your code doesn't and shouldn't work regardless of the OS: it tries to place the output in a directory called dir, which doesn't exist, because the directory it actually created is output/dir (assuming the output directory already exists, since dir.create uses recursive = FALSE). I imagine some pieces were lost during the copy-paste process here, but that makes it hard for me to understand what exactly didn't work in the first place.

If I understand what went wrong to begin with, I can see whether some explanation could be added to the docs.
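The mismatch described above (creating output/dir but writing to dir) could be avoided by computing the output directory once and using that same value everywhere. The following is only a sketch under assumptions: `make_output_dir` is a hypothetical helper that is not part of staplr, the layout raw/dir/KV111.pdf is the one assumed in the comment above, and the split_pdf call is left commented out because it needs staplr, pdftk, and a real PDF.

```r
# Hypothetical helper (not part of staplr): derive the output directory for
# one input file, given its path relative to raw/, and create it if needed.
make_output_dir <- function(rel_file, output_root) {
  out_dir <- file.path(output_root, dirname(rel_file))
  # recursive = TRUE also creates output_root itself when it is missing.
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  out_dir
}

# Demonstration in a temporary directory, using the assumed relative path
# dir/KV111.pdf.
output_root <- file.path(tempdir(), "output")
out_dir <- make_output_dir("dir/KV111.pdf", output_root)
print(dir.exists(out_dir))

# The same out_dir would then be handed to staplr, for example:
# staplr::split_pdf(input_filepath = file.path("raw", "dir", "KV111.pdf"),
#                   output_directory = out_dir,
#                   prefix = "p_")
```

Because the directory that is created and the directory that is written to come from one variable, the two cannot drift apart, whichever OS the script runs on.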