eu-digital-green-certificates / dgc-testdata

Repository for storing generated QR code data for testing.
Apache License 2.0

Useless compression / worse than doing nothing #284

Open jeroentrappers opened 3 years ago

jeroentrappers commented 3 years ago

Issue Description

In the processing pipeline for creating a DCC, the COSE structure is first compressed before being Base45 encoded. For many files, this compression step has an adverse effect on the final size of the data to be encoded. Whether it helps depends on the compressibility of the input and on the parameters used for the compression algorithm.

dgc-testdata/common/2DCode/raw/CO1.json Warning: useless compression. Compressed size (583) is larger than raw COSE (572).

dgc-testdata/common/2DCode/raw/CO2.json Warning: useless compression. Compressed size (711) is larger than raw COSE (700).

dgc-testdata/common/2DCode/raw/CO28.json Warning: useless compression. Compressed size (352) is larger than raw COSE (348).

dgc-testdata/common/2DCode/raw/CO3.json Warning: useless compression. Compressed size (384) is larger than raw COSE (378).

dgc-testdata/IT/2DCode/raw/1.json Warning: useless compression. Compressed size (348) is larger than raw COSE (343).

dgc-testdata/FR/2DCode/raw/vaccin_ok.json Warning: useless compression. Compressed size (331) is larger than raw COSE (320).

dgc-testdata/SI/2DCode/raw/4.json Warning: useless compression. Compressed size (356) is larger than raw COSE (353).

dgc-testdata/SI/2DCode/raw/5.json Warning: useless compression. Compressed size (360) is larger than raw COSE (356).

dgc-testdata/SI/2DCode/raw/3.json Warning: useless compression. Compressed size (559) is larger than raw COSE (548).

dgc-testdata/SI/2DCode/raw/6.json Warning: useless compression. Compressed size (359) is larger than raw COSE (354).

dgc-testdata/SI/2DCode/raw/1.json Warning: useless compression. Compressed size (358) is larger than raw COSE (354).

dgc-testdata/PT/2DCode/raw/2.json Warning: useless compression. Compressed size (341) is larger than raw COSE (337).

dgc-testdata/PT/2DCode/raw/1.json Warning: useless compression. Compressed size (342) is larger than raw COSE (337).

dgc-testdata/RO/2DCode/raw/1.json Warning: useless compression. Compressed size (326) is larger than raw COSE (317).

dgc-testdata/IS/2DCode/raw/5.json Warning: useless compression. Compressed size (568) is larger than raw COSE (563).

dgc-testdata/SE/2DCode/raw/4.json Warning: useless compression. Compressed size (379) is larger than raw COSE (368).

dgc-testdata/SE/2DCode/raw/2.json Warning: useless compression. Compressed size (403) is larger than raw COSE (394).

dgc-testdata/SE/2DCode/raw/3.json Warning: useless compression. Compressed size (401) is larger than raw COSE (393).

dgc-testdata/SE/2DCode/raw/1.json Warning: useless compression. Compressed size (346) is larger than raw COSE (340).

dgc-testdata/CY/2DCode/raw/5.json Warning: useless compression. Compressed size (356) is larger than raw COSE (345).

dgc-testdata/CY/2DCode/raw/6.json Warning: useless compression. Compressed size (356) is larger than raw COSE (345).

dgc-testdata/SK/2DCode/raw/4.json Warning: useless compression. Compressed size (406) is larger than raw COSE (405).

dgc-testdata/SK/2DCode/raw/2.json Warning: useless compression. Compressed size (390) is larger than raw COSE (386).

dgc-testdata/SK/2DCode/raw/5.json Warning: useless compression. Compressed size (421) is larger than raw COSE (418).

dgc-testdata/SK/2DCode/raw/1.json Warning: useless compression. Compressed size (411) is larger than raw COSE (406).

dgc-testdata/NL/2DCode/raw/043-NL-vaccination.json Warning: useless compression. Compressed size (378) is larger than raw COSE (377).

dgc-testdata/NL/2DCode/raw/142-NL-recovery.json Warning: useless compression. Compressed size (346) is larger than raw COSE (345).

dgc-testdata/NL/2DCode/raw/064-NL-vaccination.json Warning: useless compression. Compressed size (385) is larger than raw COSE (383).

dgc-testdata/NL/2DCode/raw/049-NL-vaccination.json Warning: useless compression. Compressed size (377) is larger than raw COSE (371).

dgc-testdata/NL/2DCode/raw/136-NL-recovery.json Warning: useless compression. Compressed size (336) is larger than raw COSE (334).

dgc-testdata/NL/2DCode/raw/083-NL-vaccination.json Warning: useless compression. Compressed size (379) is larger than raw COSE (370).

dgc-testdata/NL/2DCode/raw/135-NL-recovery.json Warning: useless compression. Compressed size (334) is larger than raw COSE (331).

dgc-testdata/NL/2DCode/raw/336-NL-test+wrong_key.json Warning: useless compression. Compressed size (365) is larger than raw COSE (362).

dgc-testdata/NL/2DCode/raw/003-NL-test.json Warning: useless compression. Compressed size (371) is larger than raw COSE (370).

dgc-testdata/NL/2DCode/raw/140-NL-recovery.json Warning: useless compression. Compressed size (339) is larger than raw COSE (338).

dgc-testdata/NL/2DCode/raw/050-NL-vaccination.json Warning: useless compression. Compressed size (345) is larger than raw COSE (342).

dgc-testdata/NL/2DCode/raw/042-NL-vaccination.json Warning: useless compression. Compressed size (369) is larger than raw COSE (366).

dgc-testdata/NL/2DCode/raw/137-NL-recovery.json Warning: useless compression. Compressed size (336) is larger than raw COSE (334).

dgc-testdata/NL/2DCode/raw/361-NL-test+wrong_key.json Warning: useless compression. Compressed size (343) is larger than raw COSE (337).

dgc-testdata/NL/2DCode/raw/126-NL-recovery.json Warning: useless compression. Compressed size (349) is larger than raw COSE (345).

dgc-testdata/NL/2DCode/raw/127-NL-recovery.json Warning: useless compression. Compressed size (350) is larger than raw COSE (348).

dgc-testdata/NL/2DCode/raw/131-NL-recovery.json Warning: useless compression. Compressed size (353) is larger than raw COSE (349).

dgc-testdata/NL/2DCode/raw/089-NL-vaccination.json Warning: useless compression. Compressed size (361) is larger than raw COSE (357).

dgc-testdata/NL/2DCode/raw/134-NL-recovery.json Warning: useless compression. Compressed size (333) is larger than raw COSE (331).

dgc-testdata/NL/2DCode/raw/079-NL-vaccination.json Warning: useless compression. Compressed size (379) is larger than raw COSE (368).

dgc-testdata/NL/2DCode/raw/031-NL-test.json Warning: useless compression. Compressed size (369) is larger than raw COSE (358).

dgc-testdata/NL/2DCode/raw/343-NL-test+wrong_key.json Warning: useless compression. Compressed size (374) is larger than raw COSE (371).

dgc-testdata/NL/2DCode/raw/091-NL-vaccination.json Warning: useless compression. Compressed size (342) is larger than raw COSE (335).

dgc-testdata/NL/2DCode/raw/129-NL-recovery.json Warning: useless compression. Compressed size (353) is larger than raw COSE (351).

dgc-testdata/NL/2DCode/raw/358-NL-test+wrong_key.json Warning: useless compression. Compressed size (382) is larger than raw COSE (371).

dgc-testdata/NL/2DCode/raw/357-NL-test+wrong_key.json Warning: useless compression. Compressed size (369) is larger than raw COSE (363).

dgc-testdata/NL/2DCode/raw/334-NL-test+wrong_key.json Warning: useless compression. Compressed size (366) is larger than raw COSE (359).

dgc-testdata/NL/2DCode/raw/070-NL-vaccination.json Warning: useless compression. Compressed size (430) is larger than raw COSE (429).

dgc-testdata/NL/2DCode/raw/138-NL-recovery.json Warning: useless compression. Compressed size (336) is larger than raw COSE (334).

dgc-testdata/NL/2DCode/raw/084-NL-vaccination.json Warning: useless compression. Compressed size (384) is larger than raw COSE (377).

dgc-testdata/NL/2DCode/raw/019-NL-test.json Warning: useless compression. Compressed size (454) is larger than raw COSE (449).

dgc-testdata/NL/2DCode/raw/133-NL-recovery.json Warning: useless compression. Compressed size (331) is larger than raw COSE (328).

dgc-testdata/NL/2DCode/raw/128-NL-recovery.json Warning: useless compression. Compressed size (351) is larger than raw COSE (348).

dgc-testdata/NL/2DCode/raw/056-NL-vaccination.json Warning: useless compression. Compressed size (419) is larger than raw COSE (418).

dgc-testdata/NL/2DCode/raw/332-NL-test+wrong_key.json Warning: useless compression. Compressed size (360) is larger than raw COSE (358).

dgc-testdata/NL/2DCode/raw/130-NL-recovery.json Warning: useless compression. Compressed size (354) is larger than raw COSE (351).

dgc-testdata/LU/2DCode/raw/INCERT_R_DCC_Vaccination.json Warning: useless compression. Compressed size (336) is larger than raw COSE (325).

dgc-testdata/LU/2DCode/raw/INCERT_R_DCC_NAAT.json Warning: useless compression. Compressed size (418) is larger than raw COSE (413).

dgc-testdata/LU/2DCode/raw/INCERT_R_DCC_RAT.json Warning: useless compression. Compressed size (358) is larger than raw COSE (351).

dgc-testdata/AT/2DCode/raw/1.json Warning: useless compression. Compressed size (383) is larger than raw COSE (378).

dgc-testdata/CZ/2DCode/raw/5.json Warning: useless compression. Compressed size (357) is larger than raw COSE (348).

dgc-testdata/CZ/2DCode/raw/1.json Warning: useless compression. Compressed size (371) is larger than raw COSE (364).

dgc-testdata/BE/2DCode/raw/3.json Warning: useless compression. Compressed size (362) is larger than raw COSE (357).

dgc-testdata/BE/2DCode/raw/1.json Warning: useless compression. Compressed size (348) is larger than raw COSE (337).

dgc-testdata/GR/2DCode/raw/1.json Warning: useless compression. Compressed size (349) is larger than raw COSE (340).

dgc-testdata/PL/2DCode/raw/2.json Warning: useless compression. Compressed size (381) is larger than raw COSE (372).

dgc-testdata/PL/2DCode/raw/6.json Warning: useless compression. Compressed size (383) is larger than raw COSE (372).

dgc-testdata/PL/2DCode/raw/1.json Warning: useless compression. Compressed size (381) is larger than raw COSE (372).

dgc-testdata/PL/2DCode/raw/9.json Warning: useless compression. Compressed size (374) is larger than raw COSE (371).

dgc-testdata/DE/2DCode/raw/4.json Warning: useless compression. Compressed size (366) is larger than raw COSE (355).

dgc-testdata/DE/2DCode/raw/3.json Warning: useless compression. Compressed size (319) is larger than raw COSE (318).

dgc-testdata/DE/2DCode/raw/1.json Warning: useless compression. Compressed size (366) is larger than raw COSE (355).

dgc-testdata/LI/2DCode/raw/3.json Warning: useless compression. Compressed size (363) is larger than raw COSE (355).

dgc-testdata/CH/2DCode/raw/2.json Warning: useless compression. Compressed size (602) is larger than raw COSE (591).

dgc-testdata/CH/2DCode/raw/3.json Warning: useless compression. Compressed size (537) is larger than raw COSE (526).

dgc-testdata/CH/2DCode/raw/1.json Warning: useless compression. Compressed size (565) is larger than raw COSE (554).

dgc-testdata/LV/2DCode/raw/1.json Warning: useless compression. Compressed size (374) is larger than raw COSE (366).
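A minimal sketch of the kind of check that could produce the warnings above, assuming the raw COSE bytes are already available as `bytes`. The function name `check_compression` and the choice of compression level 9 are assumptions for illustration; this is not the tool that generated the report.

```python
import zlib
from typing import Optional


def check_compression(name: str, cose: bytes, level: int = 9) -> Optional[str]:
    """Return a warning if zlib makes the COSE bytes larger, else None.

    `name` is only used to label the message (hypothetical helper).
    """
    compressed = zlib.compress(cose, level)
    if len(compressed) > len(cose):
        return (
            f"{name} Warning: useless compression. "
            f"Compressed size ({len(compressed)}) is larger than "
            f"raw COSE ({len(cose)})."
        )
    return None


if __name__ == "__main__":
    # A tiny input cannot be shrunk: the zlib framing alone outweighs it.
    print(check_compression("tiny.json", b"abc"))
    # A highly repetitive input compresses well, so no warning is returned.
    print(check_compression("repetitive.json", b"a" * 300))
```

When zlib cannot shrink a small input it can fall back to a stored block, which costs 11 bytes of framing (2-byte zlib header, 5-byte stored-block header, 4-byte Adler-32 checksum). That is consistent with the 11-byte overhead seen in many of the entries above.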

Proposed Solution

Use better compression, or do not compress when no size reduction can be gained.
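The second option could be sketched as follows. This only illustrates the proposal; it is not what the DCC pipeline currently does, and the name `maybe_compress` and the default level are assumptions.

```python
import zlib


def maybe_compress(cose: bytes, level: int = 9) -> bytes:
    """Return the zlib-compressed bytes only if they are strictly smaller.

    Otherwise return the input unchanged, so the step never adds size.
    """
    compressed = zlib.compress(cose, level)
    return compressed if len(compressed) < len(cose) else cose


if __name__ == "__main__":
    incompressible = b"abc"          # too short to benefit from zlib
    repetitive = b"a" * 300          # compresses very well
    print(len(maybe_compress(incompressible)), len(incompressible))
    print(len(maybe_compress(repetitive)), len(repetitive))
```

A decoder would then need a way to tell compressed and uncompressed payloads apart, which is the inconsistency the proposal trades for the size gain.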

psavva commented 3 years ago

We should either compress or not. Please do not adopt an inconsistent approach where we only compress sometimes...

Better to consider using a better compression algorithm.

martin-lindstrom commented 3 years ago

The thing is that when the hcert-spec was written it was said that more than one entry should be able to appear in the DCC.

JChrist commented 3 years ago

Just noting that it is possible to identify whether the payload is actually compressed by reading the zlib magic header (the first two bytes):

    Zlib magic headers (!! IN HEX !!):
    78 01 - No Compression/low
    78 9C - Default Compression
    78 DA - Best Compression

So, if the first byte is 78 hex (120 decimal), you can assume that it is compressed.
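A sketch of that header check in Python, with one extra safeguard: per the zlib format (RFC 1950), the first two bytes read as a big-endian 16-bit number must be a multiple of 31, which holds for all three pairs listed above. The function names are hypothetical, and the check is a heuristic that, like the comment, only considers the `78` first byte.

```python
import zlib


def looks_like_zlib(data: bytes) -> bool:
    """Heuristic: first byte is 0x78 and the 2-byte header passes FCHECK."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    # 0x78 = deflate with a 32 KiB window; (CMF * 256 + FLG) must divide by 31.
    return cmf == 0x78 and (cmf * 256 + flg) % 31 == 0


def decode_payload(data: bytes) -> bytes:
    """Decompress only if the data appears to carry a zlib header."""
    return zlib.decompress(data) if looks_like_zlib(data) else data


if __name__ == "__main__":
    compressed = zlib.compress(b"hello world")
    print(looks_like_zlib(compressed))         # header written by zlib itself
    print(looks_like_zlib(b"\xd2\x84rest"))    # 0xD2 is CBOR tag 18 (COSE_Sign1)
    print(decode_payload(compressed))
```

A tagged COSE_Sign1 message begins with `D2` and an untagged one with a CBOR array byte, so neither should collide with `78` in practice.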

vitorpamplona commented 3 years ago

Zlib over CBOR is only useful if there are several repeated strings in the payload.

It was more effective before it was decided that the QR may carry only one payload type. When we had 3+ credentials, most of them would come from the same jurisdiction and zlib would do its magic.

If the payload is a single event, zlib does very little to it.
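The effect of repeated strings can be illustrated with a small experiment. This is a rough sketch using JSON-like text instead of CBOR, and the field values are placeholders in the style of DCC entries, not real test data.

```python
import zlib


def ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by raw size (lower means better compression)."""
    return len(zlib.compress(data, level)) / len(data)


def make_entry(dose: int, date: str) -> bytes:
    """Build one illustrative vaccination entry (placeholder values)."""
    return (
        f'{{"tg":"840539006","vp":"1119349007","mp":"EU/1/20/1528",'
        f'"ma":"ORG-100030215","dn":{dose},"sd":2,"dt":"{date}",'
        f'"co":"XX","is":"Example Health Authority"}}'
    ).encode()


# One entry on its own versus three entries that share most of their strings.
single = make_entry(1, "2021-05-01")
triple = b"".join(
    make_entry(n, d)
    for n, d in [(1, "2021-05-01"), (2, "2021-06-01"), (3, "2021-12-01")]
)

if __name__ == "__main__":
    print(f"single entry : {ratio(single):.2f}")
    print(f"three entries: {ratio(triple):.2f}")
```

The three-entry payload compresses noticeably better because the second and third entries are mostly back-references to the first. A real COSE payload also carries a signature, which is essentially incompressible and pushes the single-entry ratio further toward 1.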

vitorpamplona commented 3 years ago

Now that we have test data from almost everyone, we can run a study on how much zlib is truly compressing and choose the threshold we want. For instance, if the maximum compression is 5%, do we even care about including zlib?

Including zlib or any other compression mechanism adds complexity, requires processing power, and slows things down for issuers and verifiers. The question is: how much compression is needed to justify those side effects?
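Such a study could start from something like the sketch below. The sample sizes are three of the (raw, compressed) pairs reported at the top of this issue; the function names and the 5% threshold constant are assumptions taken from the example figure above. Since those files were listed precisely because compression made them larger, a real study would have to cover every file in the repository.

```python
from statistics import mean


def saving_percent(raw: int, compressed: int) -> float:
    """Size saved by compression as a percentage of the raw size.

    Negative values mean compression made the payload larger.
    """
    return (raw - compressed) / raw * 100


def summarize(sizes: dict) -> dict:
    """Compute min, max and mean saving over {name: (raw, compressed)}."""
    savings = [saving_percent(raw, comp) for raw, comp in sizes.values()]
    return {"min": min(savings), "max": max(savings), "mean": mean(savings)}


# (raw COSE size, compressed size) as reported in this issue.
sample = {
    "common/CO1": (572, 583),
    "SK/4": (405, 406),
    "DE/3": (318, 319),
}

WORTH_IT_THRESHOLD = 5.0  # percent; hypothetical cut-off for keeping zlib

if __name__ == "__main__":
    stats = summarize(sample)
    print(stats)
    print("compression worth keeping:", stats["max"] >= WORTH_IT_THRESHOLD)
```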