R0Wi-DEV / workflow_ocr

This is a Nextcloud Workflow App which enables you to process files via OCR on the server side.
GNU Affero General Public License v3.0

[Feature] Custom arguments for ocrmypdf #272

Open XueSheng-GIT opened 1 week ago

XueSheng-GIT commented 1 week ago

Describe the feature

workflow_ocr provides a sane default set of options/arguments for OCR processing (easy to use, which is good). But there are cases where those limited options are not enough. E.g. I sometimes want to rotate pages, which ocrmypdf can do, but workflow_ocr does not provide an option for this. I also don't think it would make sense to add a separate option for each possible use case.

My idea would be to add an advanced option that lets a user pass additional arguments to ocrmypdf as a per-workflow setting (free text). Thus, for additional rotation you could add:

-r --rotate-pages-threshold 8

Of course, passing wrong arguments could break the workflow, so this should be an advanced option which is disabled by default.
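To illustrate how such a free-text setting could be turned into arguments, here is a minimal Python sketch (Python only for illustration). It assumes the text is split with shell-like quoting rules and placed before the input and output files, following the usual `ocrmypdf [options] input output` call. The function name `build_ocrmypdf_command` and the file names are hypothetical and not part of workflow_ocr.

```python
import shlex


def build_ocrmypdf_command(input_pdf: str, output_pdf: str, extra_args: str = "") -> list:
    """Build an argv list for ocrmypdf from a free-text per-workflow setting.

    Hypothetical helper: workflow_ocr does not necessarily work this way.
    """
    # shlex.split honours quotes, so a quoted value with spaces stays one token.
    extra_tokens = shlex.split(extra_args)
    # Options first, then the positional input and output files.
    return ["ocrmypdf", *extra_tokens, input_pdf, output_pdf]


if __name__ == "__main__":
    # The rotation example from above becomes three separate argv entries.
    print(build_ocrmypdf_command("in.pdf", "out.pdf", "-r --rotate-pages-threshold 8"))
```

Keeping the result as a list rather than one joined string means it can be handed to a process launcher without going through a shell.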

At least this would be a "clean" possibility to cover individual needs without overloading the available options.

R0Wi commented 1 week ago

I like this idea :+1: This would also give more flexibility to an admin who knows the exact version of ocrmypdf installed on the system, so that they could, for example, use command-line parameters which are only available in more recent versions.

I guess from a security point of view we should ensure that no malicious code can be injected. For example, a parameter like rm -rf / should not be accepted (but would probably raise an error anyway, since we pass it to ocrmypdf as params) :smile:
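One possible way to check the free text before it is used, sketched in Python for illustration only: the policy here (the first token must be an option, and no token may contain shell metacharacters) is an assumption, not an agreed design, and `validate_extra_args` is a hypothetical name. If the arguments are passed as an argv list without a shell, metacharacters have no special meaning anyway, so the second check would only be defense in depth.

```python
from __future__ import annotations

import shlex

# Characters that only matter if a shell ever sees the string (assumed denylist).
SHELL_METACHARACTERS = set(";|&$`<>\n")


def validate_extra_args(free_text: str) -> list[str]:
    """Tokenize the free-text setting and reject obviously suspicious input."""
    # Raises ValueError by itself on unbalanced quotes.
    tokens = shlex.split(free_text)
    # A leading non-option would be read by ocrmypdf as a positional file name.
    if tokens and not tokens[0].startswith("-"):
        raise ValueError(f"expected an option first, got {tokens[0]!r}")
    for token in tokens:
        if any(ch in SHELL_METACHARACTERS for ch in token):
            raise ValueError(f"shell metacharacter in argument {token!r}")
    return tokens


if __name__ == "__main__":
    print(validate_extra_args("-r --rotate-pages-threshold 8"))
    try:
        validate_extra_args("rm -rf /")
    except ValueError as err:
        print("rejected:", err)
```

With this sketch, `rm -rf /` is rejected because its first token is not an option, and `-r; rm -rf /` is rejected because of the `;`.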

@bahnwaerter any thoughts?