Cisco-Talos / clamav


statically linked (small-ish) elf-binary (ppc64le) leads to aborted scans (MaxScanSize) #725

Open ccwienk opened 2 years ago

ccwienk commented 2 years ago

Describe the bug

Scanning the /bin/node_exporter binary (ppc64le flavour) from node-exporter - v1.3.1 consistently leads to aborted scans (pseudo-virus "Heuristics.Limits.Exceeded.MaxScanSize" is emitted).

I tested this w/ both the current alpine version of clamav (ClamAV 0.104.3/26568/Fri Jun 10 08:06:23 2022) and archlinux (ClamAV 0.105.1/26693/Tue Oct 18 10:02:42 2022), in both cases w/ up-to-date signature-databases.

The file in question has 17861624 octets (roughly 17 MiB), which is a lot smaller than the configured max-scan-size (which I set to the allowed maximum of 4 GiB). I also successfully scanned other, larger files w/o observing the same behaviour, including, but not limited to, the other platform-flavours of the node-exporter binary.

How to reproduce the problem

Either retrieve OCI-Container-Image from location indicated above, or retrieve this issue's attachment:

node_exporter.gz

In case of retrieving the uploaded attachment, un-gzip it prior to scanning (the bug will not occur if scanning the gzipped file).

Run clamdscan <path/to/node_exporter>
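
The reproduction steps above can be scripted. This is a minimal sketch, assuming the attachment was saved locally as `node_exporter.gz`, that `clamdscan` is on the PATH, and that a clamd daemon is running; the helper names `gunzip` and `scan` are made up for illustration.

```python
import gzip
import shutil
import subprocess


def gunzip(src, dst):
    """Decompress the gzip file `src` into `dst` and return `dst`."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def scan(path):
    """Run clamdscan on `path` (assumes clamdscan is installed and clamd is up)."""
    return subprocess.run(["clamdscan", path], capture_output=True, text=True)
```

Calling `scan(gunzip("node_exporter.gz", "node_exporter"))` and printing the result's `stdout` should then show the scan report for the uncompressed binary.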

output of clamconf -n

$ clamconf -n
Checking configuration files in /etc/clamav

Config file: clamd.conf
-----------------------
AlertExceedsMax = "yes"
LogFile = "/var/log/clamav/clamd.log"
LogTime = "yes"
PidFile = "/run/clamav/clamd.pid"
TemporaryDirectory = "/tmp"
LocalSocket = "/run/clamav/clamd.ctl"
StreamMaxLength = "536870912"
User = "clamav"
MaxScanSize = "4294967295"
MaxFileSize = "4294967295"

Config file: freshclam.conf
---------------------------
PidFile = "/run/clamav/freshclam.pid"
UpdateLogFile = "/var/log/clamav/freshclam.log"
DatabaseMirror = "database.clamav.net"

Config file: clamav-milter.conf
-------------------------------
LogFile = "/var/log/clamav/clamav-milter.log"
LogTime = "yes"
PidFile = "/run/clamav/clamav-milter.pid"
TemporaryDirectory = "/tmp"
User = "clamav"

Software settings
-----------------
Version: 0.105.1
Optional features supported: MEMPOOL AUTOIT_EA06 BZIP2 LIBXML2 PCRE2 ICONV JSON RAR 

Database information
--------------------
Database directory: /var/lib/clamav
main.cvd: version 62, sigs: 6647427, built on Thu Sep 16 14:32:42 2021
daily.cvd: version 26693, sigs: 2008500, built on Tue Oct 18 10:02:42 2022
bytecode.cvd: version 333, sigs: 92, built on Mon Mar  8 16:21:51 2021
Total number of signatures: 8656019

Platform information
--------------------
uname: Linux 6.0.1-arch2-1 #1 SMP PREEMPT_DYNAMIC Thu, 13 Oct 2022 18:58:49 +0000 x86_64
OS: Linux, ARCH: x86_64, CPU: x86_64
Full OS version: "Arch Linux"
WARNING: zlib version mismatch: 1.2.12 (1.2.13)
zlib version: 1.2.12 (1.2.13), compile flags: a9
platform id: 0x0a21979708000000000c0101

Build information
-----------------
GNU C: 12.1.1 20220730 (12.1.1)
sizeof(void*) = 8
Engine flevel: 151, dconf: 151

micahsnyder commented 2 years ago

Interesting find!

What's happening here is that ClamAV attempts to find embedded archives. If it finds a match for the starting bytes of one, it attempts to extract the content and scan it. In this case, ClamAV thinks it has found an ARJ, a ZIP, and a RAR! The ARJ and ZIP parsers fail out pretty fast because the data is invalid, but the RAR parser gets far enough to try to extract what it thinks is an absolutely massive file. That fails because it would exceed the scan limits, and thus this alert occurs.
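
To illustrate why a large binary can look like it contains archives, here is a rough sketch of scanning a byte blob for archive "magic" prefixes. The magic values are the commonly documented ZIP, RAR 4.x and ARJ markers, not ClamAV's actual file-type signatures, and `find_embedded_magics` is a hypothetical helper, not a libclamav function.

```python
# Illustrative magic numbers only; NOT ClamAV's real file-type signatures.
MAGICS = {
    "ZIP": b"PK\x03\x04",        # ZIP local file header
    "RAR": b"Rar!\x1a\x07\x00",  # RAR 1.5-4.x archive marker
    "ARJ": b"\x60\xea",          # ARJ header id
}


def find_embedded_magics(data, magics=MAGICS):
    """Return sorted (offset, name) pairs for every magic prefix found in data."""
    hits = []
    for name, magic in magics.items():
        start = 0
        while True:
            offset = data.find(magic, start)
            if offset == -1:
                break
            hits.append((offset, name))
            start = offset + 1  # keep searching after this hit
    return sorted(hits)
```

The shorter the prefix, the more likely it is to appear by chance somewhere in many megabytes of machine code and data, which is why each candidate still has to be validated by the corresponding parser.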

I tested with a minimal signature set. This is what my debug log shows:


LibClamAV debug: Matched signature for file type HTML data
LibClamAV debug: Matched signature for file type HTML data
LibClamAV debug: Matched signature for file type HTML data
LibClamAV debug: Matched signature for file type HTML data at 7765293
LibClamAV debug: Matched signature for file type ARJ-SFX at 11439728
LibClamAV debug: Matched signature for file type ZIP-SFX at 12452156
LibClamAV debug: Matched signature for file type RAR-SFX at 12452796
LibClamAV debug: hashtab: Freeing hashset, elements: 0, capacity: 0
LibClamAV debug: CL_TYPE_ARJSFX signature found at 11439728
LibClamAV debug: in cli_scanarj()
LibClamAV debug: in cli_unarj_open
LibClamAV debug: Header Size: 27
LibClamAV debug: ARJ Main File Header
LibClamAV debug: First Header Size: 0
LibClamAV debug: Version: 0
LibClamAV debug: Min version: 0
LibClamAV debug: Host OS: 0
LibClamAV debug: Flags: 0x9e
LibClamAV debug: Security version: 0
LibClamAV debug: File type: 2
LibClamAV debug: Format error. First Header Size < 30
LibClamAV debug: Failed to read main header
LibClamAV debug: ARJ: Error: Bad format or broken data
LibClamAV debug: CL_TYPE_ZIPSFX signature found at 12452156
LibClamAV debug: in cli_unzip_single
LibClamAV debug: cli_basename: Provided path does not include a file name.
LibClamAV debug: cli_unzip: local header - ZMDNAME:1::4294901760:589823:ffff0a0d:29798:0:1
LibClamAV debug: CDBNAME:CL_TYPE_ZIP:589823::589823:4294901760:1:0:4294904333:(nil)
LibClamAV debug: cli_unzip: local header - header has got unusable masked data
LibClamAV debug: CL_TYPE_RARSFX signature found at 12452796
LibClamAV debug: fmap_dump_to_file: dumping fmap not backed by file...
LibClamAV debug: in scanrar()
unrar_open: Comments are not present in this archive.
unrar_open: Volume attribute (archive volume):              no
unrar_open: Archive comment present:                        no
unrar_open: Archive lock attribute:                         no
unrar_open: Solid attribute (solid archive):                no
unrar_open: New volume naming scheme ('volname.partN.rar'): no
unrar_open: Authenticity information present (obsolete):    no
unrar_open: Recovery record present:                        no
unrar_open: Block headers are encrypted:                    no
unrar_open: First volume (set only by RAR 3.0 and later):   no
unrar_open: Opened archive: /tmp/20221017_170830-scantemp.724eec9309/clamav-9b9b8a67f2af36d469896030552ef008.tmp
unrar_peek_file_header:   Name:          
unrar_peek_file_header:   Directory?:    0
unrar_peek_file_header:   Target Dir:    0
unrar_peek_file_header:   RAR Version:   1
unrar_peek_file_header:   Packed Size:   562954281943296
unrar_peek_file_header:   Unpacked Size: 844429342540032
LibClamAV debug: RAR: , crc32: 0x20001, encrypted: 0, compressed: 33554688, normal: 117440768, method: 0, ratio: 1
LibClamAV debug: CDBNAME:CL_TYPE_RAR:562954281943296::562954281943296:844429342540032:0:1:131073:(nil)
LibClamAV debug: RAR: scansize exceeded (initial: 4294967295, consumed: 17861624, needed: 844429342540032)
LibClamAV debug: FP SIGNATURE: 5cfe6676f14ea871e882b5b09d90d86f:5408828:Heuristics.Limits.Exceeded.MaxScanSize  # Name: n/a, Type: CL_TYPE_RAR
LibClamAV debug: FP SIGNATURE: b0bf027ed5fb5541f18a743d82074f14:17861624:Heuristics.Limits.Exceeded.MaxScanSize  # Name: node_exporter, Type: CL_TYPE_ELF
LibClamAV debug: Heuristics.Limits.Exceeded.MaxScanSize: scanning may be incomplete and additional analysis needed for this file.
LibClamAV debug: RAR: Next file is too large (844429342540032 bytes); it would exceed max scansize.  Skipping to next file.
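
The numbers in the "scansize exceeded" line above can be checked with a few lines of arithmetic. This is only a sketch of a budget check of that shape, not libclamav's implementation; `would_exceed_scan_size` is a hypothetical name, and the constants are copied from the log.

```python
# Values copied from the debug log above.
MAX_SCAN_SIZE = 4294967295  # "initial"
CONSUMED = 17861624         # "consumed" (size of the ELF itself)
NEEDED = 844429342540032    # "needed" (claimed unpacked size)


def would_exceed_scan_size(initial, consumed, needed):
    """Hypothetical check: True if scanning `needed` more bytes overruns the budget."""
    return consumed + needed > initial


exceeds = would_exceed_scan_size(MAX_SCAN_SIZE, CONSUMED, NEEDED)
times_over = NEEDED // MAX_SCAN_SIZE  # how many full budgets the claim covers
low_32_bits = NEEDED & 0xFFFFFFFF     # lower half of the claimed 64-bit size
```

The claimed unpacked size is several hundred thousand times the configured limit, and its low 32 bits equal the "normal: 117440768" value printed in the RAR debug line, which is consistent with the header fields being arbitrary bytes from the ELF rather than a real archive header.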

I don't have a fix in mind.

ccwienk commented 2 years ago

@micahsnyder : thanks for taking a look - I also figured there would be some heuristics that - erroneously - guessed it found some kind of archive.

A workaround that would be acceptable for my personal use-case might be to have an option to disable certain extractors (RAR in this case) - of course, this will likely not be a good option for everyone.

Maybe, extractors could be deactivated for certain filetypes (e.g. ELF)?