py-pdf / pdfly

CLI tool to extract (meta)data from PDF and manipulate PDF files
BSD 3-Clause "New" or "Revised" License

ENH: Add cat from file functionality #56

Closed by ebotiab 3 weeks ago

ebotiab commented 4 months ago

In some cases it could be more convenient to list the files to concatenate in a file, e.g. a CSV, rather than passing them as arguments to the cat command, for example when there are many PDFs to merge. A possible new command for this could be:

cat-from-csv "pdfs_data_to_merge.csv" --filescol "col_with_file_paths" --pagescol "col_with_pages_to_cat"

cyy-2024 commented 1 month ago

This is a good idea, I think so too. I will try to work on the issue. :)

ebotiab commented 1 month ago

I can try to create a PR for this, if you want.

Lucas-C commented 3 weeks ago

I'm not sure that this really needs to be embedded in pdfly...

For such a task, it may be better to combine pdfly cat with a CSV-parsing command like xsv or csvkit, or simply to write a dedicated Python script.

And if pdfly starts supporting CSV parsing, it would make sense that it also starts supporting JSON parsing, YAML parsing, etc. This can quickly create feature creep.

Quoting The UNIX Philosophy by Mike Gancarz:

1- Small is beautiful.
2- Make each program do one thing well.
...
9- Make every program a filter.
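
For illustration, the "dedicated Python script" route suggested above could look roughly like the sketch below. It is not part of pdfly. It assumes that pdfly cat accepts file names interleaved with optional page ranges plus an -o output option, and it reuses the column names from the proposal. The helper name build_cat_args and the sample rows are hypothetical.

```python
import csv
import io


def build_cat_args(csv_file, filescol, pagescol, output="merged.pdf"):
    """Turn CSV rows into an argument list for `pdfly cat`.

    Assumption: `pdfly cat` takes file names interleaved with optional
    page ranges, plus an `-o` option for the output file.
    """
    args = ["pdfly", "cat", "-o", output]
    for row in csv.DictReader(csv_file):
        path = (row.get(filescol) or "").strip()
        if not path:
            continue  # skip rows that have no file path
        args.append(path)
        pages = (row.get(pagescol) or "").strip()
        if pages:
            args.append(pages)  # the page range applies to the preceding file
    return args


# Hypothetical sample data standing in for pdfs_data_to_merge.csv
sample = io.StringIO(
    "col_with_file_paths,col_with_pages_to_cat\n"
    "head.pdf,\n"
    "content.pdf,0:6\n"
)
cat_args = build_cat_args(sample, "col_with_file_paths", "col_with_pages_to_cat")
print(cat_args)
```

The resulting list could then be passed to subprocess.run(cat_args, check=True), which keeps the CSV handling outside pdfly itself.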