quarkslab / irmacl

irma api command line client
Apache License 2.0

[irmacl.helpers.scan_data] Base64 decoding error #6

Open lukasmayer opened 5 years ago

lukasmayer commented 5 years ago

Some problems occur with the Base64 decoding of certain virus files. These strings are probably crafted to confuse Base64 decoders, since they arrive as Base64-encoded e-mail attachments.

Unlike when the Base64 string is decoded with Python's email or base64 modules, the string submitted to IRMA with irmacl.helpers.scan_data() is not decoded. As a result, the submitted files have different hashes, sizes, file types (ASCII text) and scan results than their binary equivalents.
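To illustrate the mismatch described above, here is a minimal sketch using only Python's standard library. The payload bytes and the `describe` helper are hypothetical stand-ins, not taken from the actual sample. It only shows why the Base64 text and its decoded binary differ in size and hash. It does not reproduce the decoding failure itself.

```python
import base64
import hashlib


def describe(data: bytes) -> dict:
    """Return the size and SHA-256 digest of a byte string (hypothetical helper)."""
    return {"size": len(data), "sha256": hashlib.sha256(data).hexdigest()}


# Hypothetical stand-in for a binary e-mail attachment.
original = b"MZ\x90\x00\x03\x00\x00\x00 example payload"

# Base64 text with line breaks, similar to how an attachment appears in a message.
encoded = base64.encodebytes(original)

# With the default validate=False, characters outside the Base64 alphabet
# (such as newlines) are discarded before decoding.
decoded = base64.b64decode(encoded)

as_text = describe(encoded)    # what is scanned if the Base64 text is sent as-is
as_binary = describe(decoded)  # what is scanned if the decoded binary is sent

print(as_text)
print(as_binary)
```

Under this assumption, decoding on the client and handing the decoded bytes, rather than the Base64 text, to irmacl.helpers.scan_data() should give the same hashes as the binary upload.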

Does anyone have an idea where to look for a solution that prevents such decoding errors? How does IRMA decode Base64 strings it receives via the API?

guillaumededrie commented 5 years ago

Hello @lukasmayer, can you tell me which versions you are experiencing the issue with (irmacl and irma-api)?

Normally, the base64 decoding when receiving an API response is performed by the json.loads function (https://github.com/quarkslab/irmacl/blob/v2.0/irmacl/apiclient.py#L168), and when using an HTTP GET call, by Python's encode function (https://github.com/quarkslab/irmacl/blob/v2.0/irmacl/apiclient.py#L116).

Can you provide us with a test case, including a file and a scenario that fails?

Regards, -- Guillaume

lukasmayer commented 5 years ago

Hello @guillaumededrie, sorry for the somewhat late response. I am using IRMA Release OSS v2.2.2 and irmacl 2.0.7.

The link below points to a ZIP archive with an example file. It contains the binary as decoded by Python 3.7's email module, which should behave the same as Python's base64 module, and the Base64 version as it was received via e-mail. The archive also contains two screenshots of the corresponding file info: one for the Base64 text submitted via irmacl.helpers.scan_data(), the other for the Base64-decoded binary submitted via Firefox.

Maybe using the base64 module could prevent these issues. Files like this one occur from time to time and are probably designed to break simple decoding.
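As a rough sketch of that idea, attachments could be decoded with Python's email module before submission. The `attachment_bytes` helper and the sample message below are hypothetical and only illustrate the decoding step. How the result is passed on to irmacl is left out.

```python
from email import message_from_bytes
from email.message import EmailMessage


def attachment_bytes(raw_message: bytes) -> list:
    """Return the decoded payload of every attachment in a raw e-mail message."""
    msg = message_from_bytes(raw_message)
    payloads = []
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            # decode=True applies the part's Content-Transfer-Encoding (e.g. base64).
            payload = part.get_payload(decode=True)
            if payload is not None:
                payloads.append(payload)
    return payloads


# Hypothetical sample message with one binary attachment.
sample = EmailMessage()
sample["Subject"] = "sample"
sample.set_content("body text")
sample.add_attachment(
    b"\x00\x01binary sample\xff",
    maintype="application",
    subtype="octet-stream",
    filename="sample.bin",
)
raw = sample.as_bytes()

decoded_attachments = attachment_bytes(raw)
print(decoded_attachments)
```

Each entry in `decoded_attachments` is the binary content, which is what would need to be submitted for the hashes to match those of the file uploaded through the browser.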

https://cryptpad.fr/file/#/2/file/dUk0vfkJQ+zWJXCivmq8RwYT/p/ (Since the archive contains malware, it is password-protected. You can extract the files with the password 'infected'.)