dpn-admin / EXANode-Acceptance-testing

EXANode Software and data flow testing

Failed to detect errors in deposit #18

Closed Pcolar closed 6 years ago

Pcolar commented 6 years ago

https://exanode-demo.libnova.com/deposit/details/00000000000000000079

This deposit was marked as archived despite repeated failures in the ingest log.

ERROR: Could not connect to clamd on (null): Connection refused
[2018-06-11 14:15:21.810017] [w] [libsafeCharacterizerService.exe::__scan] Can't complete file scan:
[*] File: "C:\exanode\ingestion\00000000000000000079\4ec0358b-6021-45fb-96c9-fe6c44a8c68d.tar"

What conditions constitute success, warnings, or failure status of an ingest?

acarrasco-libnova commented 6 years ago

This error can be confusing and needs some clarification. When the characterization module starts to process a deposit, it performs an initialization sequence:

Even when the antivirus service is reported as online (for example, by the Windows Services API), the antivirus engine still has to load the new definitions database into memory, and this can take some time. There is no reliable way to determine, with a direct request against the service, whether the antivirus engine has finished loading. The module therefore makes "dummy" calls against the antivirus engine until it returns a valid response, which indicates that the service is online AND the virus definitions are completely loaded into the process memory.
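The retry behaviour described above can be sketched roughly as follows. This is only an illustration, not the module's actual code: `wait_until_ready` and `clamd_ping` are hypothetical names, the timeout and interval values are arbitrary, and the host and port are assumptions (3310 is clamd's usual TCP port). The thread does not say which "dummy" call the module uses; clamd's `PING` command, which replies `PONG`, is one plausible choice.

```python
import socket
import time
from typing import Callable


def clamd_ping(host: str = "127.0.0.1", port: int = 3310, timeout: float = 5.0) -> bool:
    """One possible "dummy" call: send PING to clamd and expect PONG.

    Host and port are assumptions for this sketch. A refused connection
    raises ConnectionRefusedError, which is a subclass of OSError.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"zPING\0")
        return conn.recv(16).startswith(b"PONG")


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Call probe() until it returns True; return the number of failed attempts.

    Raises TimeoutError if the probe has not succeeded once `timeout`
    seconds have elapsed.
    """
    deadline = clock() + timeout
    failures = 0
    while True:
        try:
            if probe():
                return failures
        except OSError:
            # Typically "Connection refused" while the engine is still loading.
            pass
        failures += 1
        if clock() >= deadline:
            raise TimeoutError(f"scanner not ready after {failures} attempts")
        sleep(interval)
```

Called as `wait_until_ready(clamd_ping)`, a loop like this would produce a handful of failed attempts before the first valid response. If each failed attempt were logged as an error, the ingest log would show errors even though the scan itself later succeeds.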

As you can see later in the ingest log of that deposit, the file that could not be scanned at the beginning is processed later:

 [*] File ID:           1864779
 [*] Original filename: 4ec0358b-6021-45fb-96c9-fe6c44a8c68d.tar
 [*] Archive  filename: DEPOSIT-D01-00001-DPN-000000000000079-20180611-000000001864779-SHA256-F482F4114F16F22AF7F3D62237D03EBB82099FDB4D0159ED249AF2C18688B6E6.DPN
 [*] Pronom Format:     x-fmt/265
 [*] MIME Type:         application/x-tar
 [*] SHA1 Hash:         SHA256:f482f4114f16f22af7f3d62237d03ebb82099fdb4d0159ed249af2c18688b6e6

So the file was processed, meaning it was both characterized and virus scanned (it did not just fail and get skipped). I agree that these error messages can lead to misunderstanding the status of the deposit, but the deposit content (one single file) was undoubtedly processed without any problem.

We will upgrade the module to keep those messages out of the ingestion log, or at least to reduce their number.
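Until the log is cleaned up, the manual check used in this thread (an early "Can't complete file scan" message followed later by a processing record for the same file) could be automated along these lines. This is a heuristic sketch for reading the log, not the product's definition of ingest status: the function name `unresolved_scan_failures` is hypothetical, and the patterns assume the log lines look exactly like the excerpts quoted above.

```python
import re
from typing import Set

# Matches either a failed scan (capturing the quoted path) or a later
# per-file processing record (capturing the original filename).
EVENT = re.compile(
    r"Can't complete file scan:\s*\[\*\] File: \"(?P<failed>[^\"]+)\""
    r"|\[\*\] Original filename:\s*(?P<done>\S+)"
)


def unresolved_scan_failures(log_text: str) -> Set[str]:
    """Return file names whose scan failed and that never show up
    afterwards in a processing record."""
    pending: Set[str] = set()
    for match in EVENT.finditer(log_text):
        if match.group("failed"):
            # Log paths are Windows-style; keep only the base name.
            path = match.group("failed").replace("\\", "/")
            pending.add(path.rsplit("/", 1)[-1])
        else:
            pending.discard(match.group("done"))
    return pending
```

An empty result means every file with an early scan failure was processed later, which is the situation in deposit 79. A non-empty result would list files that deserve a closer look.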

Pcolar commented 6 years ago

Verified, based on report from LibNova.