Cisco-Talos / clamav

ClamAV - Documentation is here: https://docs.clamav.net
https://www.clamav.net/
GNU General Public License v2.0

"Can't parse data ERROR" message not helpful for routing decisions #575

Open Daniel-Nashed opened 2 years ago

Daniel-Nashed commented 2 years ago

Hi,

I know it is difficult to always return a meaningful status.
But getting the status "Can't parse data ERROR" for a normal attachment such as a PPTX file is not really helpful on a messaging gateway, where we have to make a routing decision based on the returned status.

What should we do when we get a status like this?
It potentially blocks legitimate messages if we treat this status as a potential virus.

I am not sure which file types mainly trigger this error, or which component of ClamAV fails to scan them.
In my first tests I have seen it with PowerPoint files and ZIP files containing PowerPoint files.

I used clamd over a TCP/IP connection to a Docker container. I also copied the file into the container and had clamd scan it there with a scan command, to make sure my transport is not part of the problem.
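For readers who want to reproduce the TCP path, here is a minimal sketch of a clamd client using the INSTREAM command. It assumes clamd is reachable on 127.0.0.1 at port 3310 (the port matches the TCPSocket setting in the configuration below; the host is an assumption), and the function names are made up for illustration. INSTREAM sends the data as chunks, each prefixed with a 4-byte big-endian length, and ends with a zero-length chunk.

```python
import socket
import struct


def frame_instream(data: bytes, chunk_size: int = 8192) -> bytes:
    """Build the byte stream for clamd's INSTREAM command (null-terminated form)."""
    parts = [b"zINSTREAM\0"]
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        # Each chunk is prefixed with its length as a 4-byte big-endian integer.
        parts.append(struct.pack("!I", len(chunk)) + chunk)
    # A zero-length chunk tells clamd that the stream is complete.
    parts.append(struct.pack("!I", 0))
    return b"".join(parts)


def scan_bytes(data: bytes, host: str = "127.0.0.1", port: int = 3310,
               timeout: float = 30.0) -> str:
    """Send data to clamd and return its raw reply line.

    host and port are assumptions; adjust them to your setup.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(frame_instream(data))
        reply = b""
        # The reply to a 'z' command is terminated by a null byte.
        while not reply.endswith(b"\0"):
            part = sock.recv(4096)
            if not part:
                break
            reply += part
    return reply.rstrip(b"\0").decode("utf-8", "replace")
```

Calling scan_bytes() on the contents of an attachment returns a single reply line such as "stream: OK". Very large inputs can be rejected by clamd's stream size limit, so the caller should handle that reply as well.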

As the caller, we don't know what type the attachment is (we can't just rely on the file extension).
So we can't make a proper decision on the status based on file type.

If only the parsing of the PPTX fails and the other checks are OK, a different status would make sense.
Even just reporting the detected file type that could not be parsed would already be more helpful.

Maybe there is a configuration option I am not aware of?

Thanks

Daniel

clamconf -n

Config file: clamd.conf
-----------------------
LogFile = "/var/log/clamav/clamd.log"
LogTime = "yes"
PidFile = "/run/lock/clamd.pid"
LocalSocket = "/run/clamav/clamd.sock"
TCPSocket = "3310"
User = "clamav"

Config file: freshclam.conf
---------------------------
PidFile = "/run/lock/freshclam.pid"
UpdateLogFile = "/var/log/clamav/freshclam.log"
DatabaseMirror = "database.clamav.net"

Config file: clamav-milter.conf
-------------------------------
LogFile = "/var/log/clamav/milter.log"
LogTime = "yes"
PidFile = "/run/lock/clamav-milter.pid"
User = "clamav"
ClamdSocket = "unix:/run/clamav/clamd.sock", "unix:/run/clamav/clamd.sock", "unix:/run/clamav/clamd.sock", "unix:/run/clamav/clamd.sock", "unix:/run/clamav/clamd.sock"
MilterSocket = "inet:7357"

Software settings
-----------------
Version: 0.105.0
Optional features supported: MEMPOOL AUTOIT_EA06 BZIP2 LIBXML2 PCRE2 ICONV JSON RAR

Database information
--------------------
Database directory: /var/lib/clamav
bytecode.cvd: version 333, sigs: 92, built on Mon Mar  8 15:21:51 2021
main.cvd: version 62, sigs: 6647427, built on Thu Sep 16 12:32:42 2021
daily.cld: version 26534, sigs: 1983823, built on Sat May  7 08:05:06 2022
Total number of signatures: 8631342

Platform information
--------------------
uname: Linux 5.10.60.1-microsoft-standard-WSL2 #1 SMP Wed Aug 25 23:20:18 UTC 2021 x86_64
OS: Linux, ARCH: x86_64, CPU: x86_64
zlib version: 1.2.12 (1.2.12), compile flags: a9
platform id: 0x0a21969608000000000a0301

Build information
-----------------
GNU C: 10.3.1 20211027 (10.3.1)
sizeof(void*) = 8
Engine flevel: 150, dconf: 150

micahsnyder commented 2 years ago

Hi @Daniel-Nashed, sorry you didn't get a response sooner. I agree it's not helpful. I would ignore it for the purposes of deciding whether to keep or drop a message.
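As an illustration of that advice on the gateway side, here is a minimal sketch that sorts a clamd reply line into clean, infected, or error, and then maps the result to a routing action. The function names and the action strings ("deliver", "quarantine") are hypothetical, and delivering on ERROR is a local policy choice, not something clamd prescribes. The sketch relies only on the reply suffixes OK, FOUND, and ERROR.

```python
from typing import Tuple


def classify_reply(reply: str) -> Tuple[str, str]:
    """Classify a clamd reply line as clean, infected, error, or unknown.

    Returns (kind, detail); detail is the signature name or the error text.
    """
    text = reply.strip("\0\r\n ")
    if text == "OK" or text.endswith(": OK"):
        return ("clean", "")
    for suffix, kind in ((" FOUND", "infected"), (" ERROR", "error")):
        if text.endswith(suffix):
            body = text[:-len(suffix)]
            # Drop the leading "stream: " or "/path: " prefix when present.
            detail = body.partition(": ")[2] or body
            return (kind, detail)
    return ("unknown", text)


def route(reply: str) -> str:
    """Map a reply to a routing action (hypothetical policy, adjust locally)."""
    kind, _detail = classify_reply(reply)
    if kind == "infected":
        return "quarantine"
    # Scan errors such as "Can't parse data" are not treated as detections;
    # a real gateway would probably also log them for later review.
    return "deliver"


print(route("stream: Can't parse data ERROR"))  # deliver
```

Keeping the classification separate from the policy makes it easy to switch to a stricter rule later, for example holding messages on ERROR once the underlying bug is fixed.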

I see that you're using ClamAV 0.105.0, so I suspect the issue you're seeing is related to #600, which is unfortunately very common and highly problematic. I am working on a fix for it. Sorry again about the trouble.

Regards, Micah

Daniel-Nashed commented 2 years ago

@micahsnyder, thanks for your reply! If there is anything you want me to test, I can do that any time.

A friend is using a different configuration outside Docker and cannot reproduce the issue. So it could also be caused by some default configuration that the Docker image ships with.

Let me know any time if you need more traces, a test file, etc.

Thanks

Daniel

micahsnyder commented 2 years ago

@Daniel-Nashed there are other files that sometimes report errors. But the most common one with 0.105.0 has to do with a bug in ClamAV's error handling when it fails to calculate a fuzzy hash for images. I have a fix in progress here: https://github.com/Cisco-Talos/clamav/pull/618

You're welcome to build with it and test it, or else if you have the file that is causing problems for you and can share it I can test it. If you can share it but need to share it privately, you could email it to me at micasnyd [ at ] cisco.com, or transfer it to me over Discord. I'm on this server with the micah_s username: https://discord.gg/6vNAqWnVgw