4lex4 / scantailor-advanced

ScanTailor Advanced is the version that merges the features of the ScanTailor Featured and ScanTailor Enhanced versions, brings new ones and fixes.
GNU General Public License v3.0

Integrate export and multi-page document assembly from ST Universal #155

Closed: mara004 closed this 3 years ago

mara004 commented 3 years ago

Another feature the Universal branch added that I consider quite useful is the ability to export a multi-page document directly from ScanTailor, rather than needing third-party software afterwards. In my opinion, it should be integrated into ScanTailor because it is quicker and easier for users. How difficult would it be to port this to STA? I think Universal can only export to (multi-page) TIFF, but I would also find it interesting to be able to export to JPEG / PNG or (multi-page) PDF.

Piolie commented 3 years ago

This can easily be achieved with any free image batch processing tool, such as ImageMagick. For PDF output: magick convert *.tif out.pdf. You could also use tesseract to add an OCR layer.
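One caveat with the shell glob is that `*.tif` expands in plain lexicographic order, so `page10.tif` lands before `page2.tif` unless the names are zero-padded. The following is a minimal Python sketch, not part of ScanTailor or ImageMagick, that sorts page files in natural order and builds the equivalent command. The function names `natural_key` and `build_pdf_command` and the sample file names are hypothetical, and the sketch assumes an ImageMagick 7 install where `magick` accepts input files followed by the output file (the `convert` subcommand is left out because ImageMagick 7 treats it as a legacy form).

```python
import re
import shlex
from pathlib import Path


def natural_key(name):
    """Split a file name into text and integer chunks so page2 sorts before page10."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so the same positions always hold the same type and compare safely.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def build_pdf_command(pages, output="out.pdf"):
    """Return an argument list that merges the given pages into one PDF.

    Assumption: ImageMagick 7 is installed and `magick` is on the PATH.
    """
    ordered = sorted((str(p) for p in pages),
                     key=lambda p: natural_key(Path(p).name))
    return ["magick", *ordered, output]


# Hypothetical file names, used only to show the ordering.
command = build_pdf_command(["page10.tif", "page2.tif", "page1.tif"])
print(shlex.join(command))
# prints: magick page1.tif page2.tif page10.tif out.pdf
```

The returned list can be passed to `subprocess.run(command, check=True)` to perform the conversion. Building an argument list instead of a shell string avoids quoting problems with file names that contain spaces.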

I think it's not worth the hassle adding it to ST.

mara004 commented 3 years ago

I agree there are more important problems to work on in STA. On Linux, I am already using the tiff2pdf command-line utility from libtiff-tools to create finished documents. However, it feels awkward to me that additional software is needed, especially on Windows. I don't really want to install a full ImageMagick only to assemble the STA output into multi-page PDFs. I think most users would expect to find this functionality in STA directly, and it would be a lot easier to use. The workflow I would prefer has no output directory: just load the input files, do the editing, and when finished, open an export dialog from Tools->Export that lets you choose between formats (JPEG/PNG/TIFF/PDF) and between multi-page and single-page output. I would also like an option to create a PDF with two pages laid out on one sheet.

Piolie commented 3 years ago

ST could include other related operations (scanning, OCR and document assembly come to mind). However, since ST is still somewhat unstable and essentially a one-man development effort, I think it would be wiser to concentrate on bug fixing and on core functionality, especially the parts not covered by other free software.

I agree that the UI could use some improvement, particularly on the documentation side. Maybe this is something we non-programmer users could help with.

For the record: there are official portable distributions of ImageMagick for Windows, so technically it is not necessary to install it.

hfiguiere commented 3 years ago

Good arguments for integrating PDF output would be as follows:

But there are things that would need to be done too:

All of this saves a lot of time by removing the step of generating the images and then switching to a different tool to produce the output.