pridiltal / staplr

PDF Toolkit. :paperclip: :hammer: :wrench: :scissors: :bookmark_tabs: :file_folder::paperclip: :bookmark: :construction: :construction_worker:
https://pridiltal.github.io/staplr/
264 stars 27 forks source link

Filling forms #2

Closed oganm closed 6 years ago

oganm commented 6 years ago

Were you planning to wrap the form filling functions of pdftk (generate_fdf and fill_form commands)? I could work on those on a spare time but was wondering if you had any ideas on how they should work.

generate_fdf command creates a plain text file with field names of a pdf form. Given an fdf file and a pdf, fill_form command fills the pdf.

A simple interface would be writing a getFields function that'll execute generate_fdf, parse that file and return a list of fields and their current vaules. Then a setFields function will take that list as an input and write it back to the file.

pridiltal commented 6 years ago

Yes. This would be a useful addition.

oganm commented 6 years ago

I was working on this while trying to preserve your style not to disrupt anything but in one case I wasn't sure how conservative I should be.

Here for instance you have a command that creates a single file yet you ask for the file path and file name separately instead of just accepting the file path as a single input. Do you want this pattern to remain for all file outputs?

pridiltal commented 6 years ago

If one needs to provide a specific name for the generated file (trimmed pdf file) 'output_filename' argument allows it to do so. If you have a better suggestion I am happy to change the format.

oganm commented 6 years ago

Personally I don't see much benefit in having a default filename with a file path that has to be provided (as a window will open asking for it if it is null). It seems more intuitive to me to directly input a filename. A reasonable default could be something like

output_filepath = paste0(tools::file_path_sans_ext(input_filepath), "_trimmed.pdf")

So by default it'll just save it next to the input without asking for anything. If you want to change name or the path, you just write the whole thing, or set it to NULL to open the save file dialog window

pridiltal commented 6 years ago

Thanks for the explanation and I also prefer that way.

But according to CRAN policies our functions should not write by default in the user's home filespace. That is not allow by CRAN policies. We can only write/save files if the user has specified a directory.

I like your suggestion to open a save file dialog window.

oganm commented 6 years ago

Got stuck while dealing with regular expressions due to some weird encoding issues of the FDF output. Was trying to deal with this without expanding the dependency tree but couldn't find a workaround using base R yet. Would you mind if I added stringi or stringr as a dependency?

pridiltal commented 6 years ago

Yes that's totally fine.

pridiltal commented 6 years ago

Thanks Oganm. This is a great addition to the package. I'll update the readme file soon

oganm commented 6 years ago

Do you also want to change how the output_directory/output_filename combinations work in your other functions?

Since the package is already published it might be problematic to change the way those work depending on how popular it already is but you can introduce a output_filepath and throw a deprecation warning when people use output_directory/filename combination. Or keep output_filename and just make the output_directory an optional variable that does nothing if NULL.

If you want to go one way or another I can look into it on a spare time or leave it to you.

pridiltal commented 6 years ago

Hi Oganm thank you very much for this suggestion. One of the features that I would like to maintain is to allow the users to select the files and directories interactively. I would like to go with the second option. What to you recommend? I am planing to add few more functions pleas let me know your preference too. We can follow the same structure.

oganm commented 6 years ago

"Files and directories" or "files or directories"? Option 2 which would have been my choice as well does "files or directories" as it allows setting the directory with GUI with a default filename or setting the file without any defaults though I am unsure if it makes sense to preserve that structure for the new functions (as in a new function with both output_filepath and output_directory/output_filename combo as the input, the existing functions will have this combination but new functions will be simpler if they only had output_filepath)

In use cases I can think of, it'll be redundant to enable people to use the GUI for either file path or the file itself. I think picking the directory only makes sense if the input (or output) itself is a directory, as in staple_pdf

as a side note, it'll probably be helpful if staple_pdf allowed individual filenames as an input since a directory does not allow control over how the files are ordered and it forces you to jump through some hoops if you want to merge files from arbitrary locations in a given order.

sorry for the unsolicited backlog. I'll try to make future recommendations in the form of branches ready to be merged. just didn't want to work on structural changes without consent

pridiltal commented 6 years ago

Thank you very much for the suggestion. Yes staple_pdf should allow individual filenames as input. I'll work on this. I edit functions giving output_filepath as the input. Need to add warning messages for output_directory/filename combination in the previous functions. I am happy to go with your suggestion of using output_filepath as the only input.

oganm commented 6 years ago

Fill pdf currently doesn't support "\"s added to text fields as they also act as regex escape characters. Need to escape them elegantly