Closed stepan-romankov closed 2 years ago
We have a Java client using clamav-client 2.0.2 and ClamavClient.scan(java.io.InputStream)
to scan files from a blob store over HTTP. The input stream is an okhttp3 response body stream.
It seems like clamav-client
is only sending the first couple chunks of the input stream to clamav, causing false-negatives for e.g. an EICAR sample in the middle of the file.
This is apparent from using e.g. tshark
to look at the TCP streams while scanning a multi-MB PDF file. The TCP client is sending the 00 00 00 00
end-of-INSTREAM
marker after only several KB: tshark-1.log
$ sudo tshark -i ... -f 'tcp port 3310' -O data -z conv,tcp
... see attached tshark-1.log
================================================================================
TCP Conversations
Filter:<No Filter>
| <- | | -> | | Total | Relative | Duration |
| Frames Bytes | | Frames Bytes | | Frames Bytes | Start | |
10.x.y.2:39900 <-> 10.x.y.38:dyna-access 6 415 6 8123 12 8538 28.564316889 0.0353
I am fairly sure that this stream truncation is caused by the faulty while (chunkSize == CHUNK_SIZE) { ... }
logic. I think the InputStream.read(...)
will return as soon as it has at least one byte available, and there is no guarantee that it will wait for the entire given buffer to be filled before returning. The InputStream.read(...)
returning an incomplete buffer is thus not a sign of reaching EOF. The loop should continue until InputStream.read(...)
returns -1
for EOF.
Fixed in 2.1.0
The way of checking end of stream is not correct. The right way is to check if it's "-1". https://github.com/cdarras/clamav-client/blob/50c438cc8d0b91cb891df68d220c9fec47cd0d34/src/main/kotlin/xyz/capybara/clamav/commands/scan/InStream.kt#L32