open-eid / digidoc4j

DigiDoc for Java. Javadoc:
http://open-eid.github.io/digidoc4j
GNU Lesser General Public License v2.1

Allow overriding AsicContainerParser::parseSignatures #112

Open mbakhoff opened 2 years ago

mbakhoff commented 2 years ago

In the current version, opening an existing container that has signatures loads each file fully into memory. If any of the files does not fit, the loader crashes with an OutOfMemoryError.

Allow overriding parseSignatures so that custom implementations can implement workarounds such as using DigestDocument for signature validation.

Signed-off-by: Märt Bakhoff <mbakhoff@sigil.red>

mbakhoff commented 2 years ago

Somewhat related issue https://github.com/open-eid/digidoc4j/issues/54

rsarendus commented 2 years ago

Hello! What exactly is causing out of memory errors?

Signature files themselves are already loaded fully into memory in AsicContainerParser before the parseSignatures method is called. The only thing that comes to mind that could balloon them significantly during parsing is if they contain CRLs.

Or does the problem only occur later, during validation, and you just need a callback for injecting your own custom XadesSignature implementations into the container when it is loaded?


> Allow overriding parseSignatures so that custom implementations can implement workarounds such as using DigestDocument for signature validation.

For signature levels lower than LTA, the digests of data files should be calculated by streaming the file contents, so the memory footprint of this operation should be minimal. If you have found cases where this is not true, we would appreciate it if you could open a bug report with as much information as possible so that we can reproduce the bug.
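To illustrate what digest calculation by streaming means in general, here is a minimal sketch using only the Java standard library. It is not digidoc4j's actual code; the class name `StreamingDigest` and the method `sha256Hex` are hypothetical names chosen for this example, and the 8 KiB buffer size is an arbitrary assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of digidoc4j): digests a stream chunk by chunk.
final class StreamingDigest {

    private StreamingDigest() {
    }

    // Returns the lowercase hex SHA-256 digest of everything readable from the stream.
    // Only one fixed-size buffer is held in memory, regardless of the input size.
    static String sha256Hex(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[8192]; // assumed chunk size
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

With this approach the peak memory use depends on the buffer size rather than on the size of the data file, which is why a streaming digest should not cause out-of-memory errors on its own.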

For LTA signatures, DigestDocument is not supported for representing data files, because calculating the digest for the archival timestamp requires the digest to be calculated over the concatenated contents of all the data files.


If the problem is caused by large data files residing in memory after a container has been parsed, you could try configuring setMaxFileSizeCachedInMemoryInMB via the Configuration class, if you haven't tried that yet. This should allow data files to be loaded as StreamDocuments, which dump the data files into the file system as temporary files and stream their contents on demand.
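The general idea behind such a size threshold can be sketched with the Java standard library alone. This is a conceptual illustration only, not how digidoc4j implements StreamDocument; the class `SpillingDocument`, its methods, and the byte-based threshold are all hypothetical. It assumes Java 11 or newer (for `readNBytes` and `transferTo`).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: keeps small content in memory, spills large content to a temp file.
final class SpillingDocument {
    private final byte[] inMemory; // non-null only when the content stayed in memory
    private final Path tempFile;   // non-null only when the content was spilled to disk

    private SpillingDocument(byte[] inMemory, Path tempFile) {
        this.inMemory = inMemory;
        this.tempFile = tempFile;
    }

    static SpillingDocument of(InputStream in, int maxBytesInMemory) {
        if (maxBytesInMemory < 0 || maxBytesInMemory == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("threshold out of range");
        }
        try {
            // Read one byte more than the limit to detect oversized content.
            byte[] head = in.readNBytes(maxBytesInMemory + 1);
            if (head.length <= maxBytesInMemory) {
                return new SpillingDocument(head, null);
            }
            Path temp = Files.createTempFile("datafile-", ".tmp");
            temp.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(temp)) {
                out.write(head);    // bytes already consumed from the stream
                in.transferTo(out); // remainder is copied in chunks, not buffered whole
            }
            return new SpillingDocument(null, temp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isOnDisk() {
        return tempFile != null;
    }

    long size() {
        try {
            return isOnDisk() ? Files.size(tempFile) : inMemory.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens a fresh stream over the content each time it is needed.
    InputStream openStream() throws IOException {
        return isOnDisk() ? Files.newInputStream(tempFile) : new ByteArrayInputStream(inMemory);
    }
}
```

The trade-off is extra disk I/O for large files in exchange for a bounded heap footprint; callers read the content through `openStream()` instead of holding a byte array.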

mbakhoff commented 2 years ago

Thanks for the quick response! I've created an issue with repro steps.