MatthiasValvekens / pyHanko

pyHanko: sign and stamp PDF files
MIT License

Allow PKCS#11 token settings to be read from the configuration file #18

Closed FernandoJCabral closed 3 years ago

FernandoJCabral commented 3 years ago

Hi, Matthias. I am trying to sign a PDF, but ran into the following.

Command: pyhanko sign addsig --field Sig1 beid --lib /usr/lib/libaetpkss.so ii.pdf o.pdf

Error: (several lines eliminated...)

raise PKCS11Error(
pkcs11.exceptions.PKCS11Error: Could not find (unique) cert with label 'Root'.
Error: Generic processing error.

Perhaps the issue has to do with beid, but I don't know what to put in its place.

Also, it would be very good if you could point me towards sample code that allows me to sign a PDF using the API instead of the CLI.

Thank you.

Fernando Cabral

MatthiasValvekens commented 3 years ago

Hi Fernando, thanks for your interest in pyHanko!

The beid subcommand is a convenience wrapper specifically for Belgian eID cards, and presumably your PKCS#11 token has a different layout. You'll want to use the pkcs11 subcommand instead.

$ pyhanko sign addsig pkcs11 --help
Usage: pyhanko sign addsig pkcs11 [OPTIONS] INFILE OUTFILE

  use generic PKCS#11 device to sign

Options:
  --lib FILE          path to PKCS#11 module  [required]
  --token-label TEXT  PKCS#11 token label  [required]
  --cert-label TEXT   certificate label  [required]
  --key-label TEXT    key label
  --slot-no INTEGER   specify PKCS#11 slot to use
  --skip-user-pin     do not prompt for PIN (e.g. if the token has a PIN pad)
                      [default: False]

  --help              Show this message and exit.

It's not quite as "batteries included" as beid, but with the help of a PKCS#11 inspection tool (like this one) you should be able to piece together which parameters to pass to --cert-label, --token-label and possibly --key-label. Let me know how that goes.

I know that pyHanko's documentation on generic PKCS#11 devices is a bit lackluster, I'll have to do something about that at some point.
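
For anyone calling the CLI from a script, the options in the help text above can be assembled into an argument list. This is only a sketch: the function name and the label values are hypothetical placeholders, and only flags that appear in the help output (plus --field from the original command) are used.

```python
import shlex
from typing import List, Optional


def build_pkcs11_sign_command(
    lib: str,
    token_label: str,
    cert_label: str,
    infile: str,
    outfile: str,
    field: Optional[str] = None,
    key_label: Optional[str] = None,
    slot_no: Optional[int] = None,
    skip_user_pin: bool = False,
) -> List[str]:
    """Assemble the argument list for `pyhanko sign addsig pkcs11`."""
    cmd = ["pyhanko", "sign", "addsig"]
    if field is not None:
        # --field is an option of `addsig`, so it precedes the subcommand.
        cmd += ["--field", field]
    cmd += [
        "pkcs11",
        "--lib", lib,
        "--token-label", token_label,
        "--cert-label", cert_label,
    ]
    if key_label is not None:
        cmd += ["--key-label", key_label]
    if slot_no is not None:
        cmd += ["--slot-no", str(slot_no)]
    if skip_user_pin:
        cmd.append("--skip-user-pin")
    cmd += [infile, outfile]
    return cmd


if __name__ == "__main__":
    # "My Token" and "My Cert" are placeholders; use the labels that your
    # PKCS#11 inspection tool reports for your own device.
    example = build_pkcs11_sign_command(
        "/usr/lib/libaetpkss.so", "My Token", "My Cert",
        "ii.pdf", "o.pdf", field="Sig1",
    )
    print(shlex.join(example))
```

Passing the resulting list to subprocess.run() avoids shell quoting problems with labels that contain spaces.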

MatthiasValvekens commented 3 years ago

Sorry, I didn't notice your other question about API usage.

For more information on signing files using the pyHanko API in general, take a look at the relevant section in the docs. For PKCS#11-specific issues, you might want to get some inspiration from the tests for the PKCS#11 signing code, and the API reference docs for PKCS11Signer.

FernandoJCabral commented 3 years ago

Matthias, thank you for your help.

I wonder if you have a pointer to PKCS11Signer. It is referred to in the paragraph reproduced below, but there is no link, as there is to SimpleSigner (which is not about PKCS#11).

Providing detailed guidance on how to implement your own Signer subclass is beyond the scope of this guide—the implementations of SimpleSigner and PKCS11Signer should help. This subsection merely highlights some of the issues you should keep in mind.

Thank you.

Fernando Cabral


MatthiasValvekens commented 3 years ago

Oops, that's probably a typo in the rst file from which that page in the docs was generated, I'll fix that.

Here's a link to the API reference for PKCS11Signer: https://pyhanko.readthedocs.io/en/latest/api-docs/pyhanko.sign.pkcs11.html. Looking at the tests might also be helpful.

FernandoJCabral commented 3 years ago

Hi Matthias, looking for simpler things to start using pyHanko, I tried to check the signature with the command pyhanko sign validate a.pdf. Nevertheless, I got the messages I reproduce below. The same signature has been correctly found and identified by okular, mypdfsigner and acrobat reader.

~/pyhanko$ pyhanko sign validate a.pdf
2021-06-09 09:17:30,409 - pyhanko.cli - ERROR - An error occurred while validating this signature: sha1
Traceback (most recent call last):
  File "/home/fernando/.local/lib/python3.8/site-packages/pyhanko/cli.py", line 327, in _signature_status
    status = validation.validate_pdf_signature(
  File "/home/fernando/.local/lib/python3.8/site-packages/pyhanko/sign/validation.py", line 1263, in validate_pdf_signature
    status_kwargs = _validate_cms_signature(
  File "/home/fernando/.local/lib/python3.8/site-packages/pyhanko/sign/validation.py", line 160, in _validate_cms_signature
    intact, valid = validate_sig_integrity(
  File "/home/fernando/.local/lib/python3.8/site-packages/pyhanko/sign/general.py", line 750, in validate_sig_integrity
    raise WeakHashAlgorithmError(md_algorithm)
pyhanko.sign.general.WeakHashAlgorithmError: sha1
Signature1:37e897b3a8e2095f2dc286ddf487e707a5cfc9260726a65252e9e47a3fbf8f17:INVALID

Any hints?

FernandoJCabral commented 3 years ago

Well, I made the command line work. Nevertheless, I couldn't find a way to control the font size for the stamp styles. Is there a way to specify the font size?

MatthiasValvekens commented 3 years ago

> I tried to check the signature with the command pyhanko sign validate a.pdf. Nevertheless, I got the messages I reproduce below. The same signature has been correctly found and identified by okular, mypdfsigner and acrobat reader.

As the error message indicates, pyHanko is complaining that the signature uses a weak hashing algorithm (SHA-1), not that it doesn't find the signature. SHA-1 is broken, and it's a shame that none of the big name PDF readers actually make a stink about that. The industry is very averse to breaking compatibility with older workflows...

You can't turn off this behaviour from the CLI right now, but if you're an API user you can pass in a different set of weak hash algorithms to your ValidationContext. The default list includes SHA-1, MD5 and MD2.
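
A minimal sketch of what that could look like from the API, assuming the weak_hash_algos keyword of ValidationContext and the validate_pdf_signature entry point seen in the traceback (check both against the API reference for your installed version). The helper names are made up for illustration, and the example tolerates SHA-1 purely to inspect a legacy file, not as a recommendation.

```python
# The default weak-hash list named in the reply above.
DEFAULT_WEAK_HASHES = frozenset({"sha1", "md5", "md2"})


def relaxed_weak_hashes(allow=("sha1",)):
    """Return the default weak-hash set minus the algorithms to tolerate."""
    return set(DEFAULT_WEAK_HASHES) - set(allow)


def validate_tolerating_sha1(pdf_path):
    """Validate the first signature in a PDF while accepting SHA-1 digests."""
    # Imported here so the helper above stays usable without pyHanko installed.
    from pyhanko.pdf_utils.reader import PdfFileReader
    from pyhanko.sign.validation import validate_pdf_signature
    from pyhanko_certvalidator import ValidationContext

    # Assumption: weak_hash_algos replaces the default list wholesale.
    vc = ValidationContext(weak_hash_algos=relaxed_weak_hashes())
    with open(pdf_path, "rb") as f:
        reader = PdfFileReader(f)
        embedded_sig = reader.embedded_signatures[0]
        status = validate_pdf_signature(embedded_sig, vc)
        return status.bottom_line
```

Here MD5 and MD2 stay on the weak list, so only the SHA-1 check is relaxed.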

> Is there a way to specify the font size?

The TextBoxStyle you pass to a text stamp style has a font_size parameter that you can use. That said, the stamping routine's auto-scaling might get in the way of that. All things considered, pyHanko's appearance generation code isn't very good, and I'm working on some improvements in that department, but it's a slow process.

(Current docs for TextBoxStyle: https://pyhanko.readthedocs.io/en/latest/api-docs/pyhanko.pdf_utils.text.html#pyhanko.pdf_utils.text.TextBoxStyle)

FernandoJCabral commented 3 years ago

Thank you, Matthias. With your help I am almost there.

I don't understand why pyHanko can't build a "validation path". As far as I can see, all the information is available on the token. See the error message below:

  raise PathBuildingError(pretty_message(
pyhanko_certvalidator.errors.PathBuildingError: Unable to build a validation path for the certificate "Common Name: FERNANDO JOSE CASTRO CABRAL:12436666687; Organizational Unit: 26768764000115, AR CONFIANCA EMPREENDIMENTOS DIGITAL, VALID, RFB e-CPF A3, Secretaria da Receita Federal do Brasil - RFB; Organization: ICP-Brasil; Country: BR" - no issuer matching "Common Name: AC VALID RFB v5, Organizational Unit: Secretaria da Receita Federal do Brasil - RFB, Organization: ICP-Brasil, Country: BR" was found

To me, at least, the trust chain seems to be correct and complete.

MatthiasValvekens commented 3 years ago

That's probably because pyHanko's PKCS#11 signer doesn't fetch anything from the token unless you tell it to. :) That's for performance reasons, since things like smart cards can be painfully slow to interact with. In addition to that, PKCS#11 doesn't allow you to ask "Hey, is there a certificate for entity such-and-such on this token? If so, send it over please.".

Basically, your options are to either (a) grab all certs from the token (if the total number is small, this is perfectly OK), or (b) tell pyHanko which PKCS#11 certificate labels it should query. The latter is feasible if you know the labels ahead of time (as is the case in the BEIDSigner class, for example).

Bottom line: you'll want to set the other_certs_to_pull and/or bulk_fetch parameters, see here. That should solve it. If not, there may be other issues. :)

EDIT: Oh, and obviously: you need to configure the relevant root certificate as a trust root in the validation context if it's not in your system trust list already. That wasn't the cause of the error you're seeing, but you'll probably want to double-check that too.
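
The two options (a) and (b) could be mapped onto the signer's parameters roughly as follows. This is a sketch based on my reading of the PKCS11Signer API reference, in which other_certs_to_pull=None is understood to mean "pull everything"; the helper names are hypothetical, and the labels you pass must be the PKCS#11 labels on your token, which are not necessarily the certificate common names shown in the error message.

```python
def cert_fetch_options(known_labels=None):
    """Translate the two strategies into PKCS11Signer keyword arguments."""
    if known_labels:
        # (b) query only the named certificates, one at a time.
        return {"other_certs_to_pull": tuple(known_labels), "bulk_fetch": False}
    # (a) pull every certificate on the token in one sweep.
    # Assumption: None means "all certificates" for other_certs_to_pull.
    return {"other_certs_to_pull": None, "bulk_fetch": True}


def make_signer(lib_path, token_label, cert_label, user_pin, known_labels=None):
    """Open a session on the token and build a signer that also pulls CA certs."""
    # Imported here so cert_fetch_options() works without pyHanko installed.
    from pyhanko.sign import pkcs11 as p11

    session = p11.open_pkcs11_session(
        lib_path, token_label=token_label, user_pin=user_pin
    )
    return p11.PKCS11Signer(
        session, cert_label, **cert_fetch_options(known_labels)
    )
```

Option (a) is the simpler starting point; switching to explicit labels only pays off when the token holds many certificates or is slow to enumerate.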

FernandoJCabral commented 3 years ago

Matthias, sorry for bothering you with small stuff like this, but I couldn't find any references in the docs. Is there a way to pass the user PIN on the CLI? Or perhaps in the configuration file? (I understand it can be done with the API, but at this moment I am trying to fully understand the CLI before venturing into the API.)

Thank you

MatthiasValvekens commented 3 years ago

Hmmm, I thought that Python's getpass would also work if stdin is not a TTY (thus allowing you to pass the PIN with a pipe), but apparently that's not the case. So for now, the answer seems to be 'no'.

I'll put in some plumbing around getpass to make that possible. I could also put in a flag to read the PIN code from a file, if that's easier. Putting it in the configuration file is something I'd prefer to avoid, though.

I'm also not planning to provide an easy option to pass the PIN as a command line argument directly. On a typical Linux system, the argument list of any given process is globally visible, which is problematic. That kind of exposure can be acceptable for a file password (hence why the decrypt command does have a --password flag for those who really want it), but for a PKCS#11 token PIN, the stakes are quite a bit higher.

I've added it to the backlog :)

EDIT: You should be able to input your PIN from the terminal prompt, though, unless your PKCS#11 token manages its own PIN input (that sometimes happens). In that case, you should use the --skip-user-pin flag.
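
For illustration, "plumbing around getpass" might look something like the sketch below. It is not pyHanko's actual implementation; read_pin is a hypothetical helper that prompts when attached to a terminal and otherwise reads one line from the stream, which is what makes piping a PIN possible.

```python
import getpass
import sys


def read_pin(stream=None, prompt="PKCS#11 PIN: "):
    """Prompt on a terminal; otherwise read a single line from the stream."""
    stream = sys.stdin if stream is None else stream
    if stream.isatty():
        # Interactive use: getpass hides the typed characters.
        return getpass.getpass(prompt)
    # Piped or redirected input: take the first line, drop the line ending.
    return stream.readline().rstrip("\r\n")
```

Unlike a command-line argument, data sent through a pipe does not show up in the process's argument list.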

FernandoJCabral commented 3 years ago

I understand your point. The problem I face is that I usually call the PDF signer from my text editor (LibreOffice). Using a Python macro and subprocess.run(), I can save the ODT file, create a PDF from it, calculate where I want the visible signature to appear, sign the file and have it saved, all with a single command. In this scenario, I don't have access to standard I/O.

The two solutions I know of are jsignpdf, which takes the PIN from the command line, and mypdfsigner, which takes the PIN from a file.

Mypdfsigner's config file accepts an encrypted password. I would guess this is not very safe, but there is some protection, since only the file's owner and the administrator can read that file.

As for jsignpdf, I have never checked whether the PIN provided on the command line lingers around, but I would guess it is possible to delete an argument by manipulating sys.argv[]. (I'll check whether this is true. It used to be, many decades ago, when Linux did not exist, Unix was king and C was the preferred language in that environment.)

Best regards.

MatthiasValvekens commented 3 years ago

I see. About putting it in the configuration: I agree that that would probably be fine in terms of information leaks, since you can (as you say) provide some extra protection by setting proper permissions on the config file. I was mostly thinking about setups where you would need to be able to switch between multiple different PKCS#11 setups.

That said, I could maybe add a pkcs11-env section to the configuration file, containing (possibly several) PKCS#11 token configurations, including a PIN if you so desire. You can already do that for validation contexts, so I guess it makes sense to extend the same courtesy to PKCS#11 settings. Would that work?
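
The file-permission protection discussed above can be checked before a script trusts a config file that contains a PIN. A small sketch for POSIX systems follows; is_private_file is a hypothetical helper, not something pyHanko provides.

```python
import os
import stat


def is_private_file(path):
    """Return True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A file created with chmod 600 passes this check, while the common default of 644 does not.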

FernandoJCabral commented 3 years ago

Yes, I think that would work nicely.

MatthiasValvekens commented 3 years ago

Great! Then I'll change the title of this issue accordingly, and slap on an 'enhancement' tag. Thanks for the productive conversation :)

MatthiasValvekens commented 3 years ago

Hi, I've addressed this issue in commit f05b781. I haven't added it to the CLI documentation yet, but in pyHanko 0.7.0 you'll be able to put this in your config file:

pkcs11-setups:
  test-setup:
    module-path: /usr/lib/libsofthsm2.so
    token-label: testrsa
    cert-label: signer
    user-pin: 1234

and invoke it with

 pyhanko sign addsig pkcs11 --p11-setup test-setup input.pdf output.pdf
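
For the macro use case discussed earlier in the thread, the same invocation can be driven from Python. This is a sketch only: both function names are hypothetical, --field is taken from the original command at the top of the thread, and the top-level --config option is an assumption on my part that should be checked against pyhanko --help.

```python
import subprocess


def setup_command(setup_name, infile, outfile, field=None, config=None):
    """Build the argument list for signing with a named PKCS#11 setup."""
    cmd = ["pyhanko"]
    if config is not None:
        # Assumption: the CLI accepts --config before the `sign` command.
        cmd += ["--config", config]
    cmd += ["sign", "addsig"]
    if field is not None:
        cmd += ["--field", field]
    cmd += ["pkcs11", "--p11-setup", setup_name, infile, outfile]
    return cmd


def sign_with_setup(setup_name, infile, outfile, **kwargs):
    """Run the signing command; raises CalledProcessError on failure."""
    cmd = setup_command(setup_name, infile, outfile, **kwargs)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Because the PIN lives in the config file, nothing sensitive appears in the argument list and no terminal is needed.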

I'll close this issue now. Feel free to reopen if you have other questions related to this particular item, but please open a new issue for any other questions you might have. Thanks!


PS: Last weekend, I also pushed the first "batch" of the appearance generation code overhaul I've been working on for the last couple of weeks or so. It's not quite where I want it yet, but provided that I manage to expose enough of the new settings from the configuration file by the time 0.7.0 rolls around, you should get some more control over your signature appearances as well.