solita / clamav-java

Simple ClamAV client for streaming data to clamd server
GNU Lesser General Public License v2.1
106 stars 46 forks source link

Fixed ping method not checking number of bytes successfully read #4

Closed drogin closed 8 years ago

drogin commented 8 years ago

Should fix the issue #3

lokori commented 8 years ago

I believe this is a better, more proper way: https://github.com/lokori/clamav-java/commit/1e2b1542a58c12065c2673598cb9aa6ee55f78ce

There are essentially four different error situations related to InputStream.read:

  1. it's a blocking operation.
  2. no bytes available, EOF (return code -1)
  3. less than expected four bytes available.
  4. more than expected four bytes available.

Compared to PR, both handle 3 properly. Neither handles 1. The commit handles 2 correctly, while PR seems to block indefinitely.

The commit doesn't care about extra bytes, and "PONGabijfiqoij" is accepted, while PR would seem to accept "aasdPONG" as a proper reply. Neither are perfectly correct, but the issue in the original code is mostly about situations 2 & 3.

drogin commented 8 years ago

Regarding case 3, the PR would not hang, becuase if no bytes available or EOF, -1 would be returned and stored in the variable readResult. Then the do-while check would run: (readResult > 0) which is equal to (-1 > 0), which would equal false, and the loop would end.

regarding 3 and 4, its not much of a problem in my opinion, as we store the reply of the server in its entirety and the compare will return false if the string isnt an exact match. Although stoping to read the server reply from the client side could arguably be a performance enhancement, I feel its cleaner to read the entire server reply(which should've been 4 in any case).

The suggested alternative to the PR might fail in some cases even when the server does reply "PONG". Consider a scenario where the server replies "PONG", but the client reads this in 2 separate read-calls, due to network-issues, buffering etc. Read1 = "PO" , read2 = "NG". The PR would handle this, while the alternative code would not. It's unlikely though.

lokori commented 8 years ago

Yes, I was wrong - PR handles the case 3 fine. And actually both behave similarly for case 4. A reply "PONGextrabonus" would be interpreted as a valid reply and only four bytes would be read from the inputstream in this PR too. After the first round in the loop, the "len" parameter passed to read is zero and nothing more will be read from the stream, though there are bytes available still.

I actually tested this, didn't rely anymore on my code reading skills as they are obviously lacking :-) Here's the test code: `
try {

String ss = "PONGpongo";
InputStream inputStream = new ByteArrayInputStream(ss.getBytes(StandardCharsets.UTF_8));
byte[] b = new byte[4];
int copyIndex = 0;
int readResult;
do {
  readResult = inputStream.read(b, copyIndex, Math.max(b.length - copyIndex, 0));
  copyIndex += readResult;
} while (readResult > 0);
System.out.println("yea " + new String(b));

} catch (Exception e) { e.printStackTrace(); }`

So in the end they behave identically except in the odd situation where PONG reply gets cut. I'm not sure if that can actually happen with TCP sockets, but I suppose it's theoretically possible if the packet size is forced artificially small. Given that the TCP packet headers take up 40 bytes, splitting four bytes of data to two packets seems very unlikely unless explicitly forced. But I'm not expert on network protocols.

drogin commented 8 years ago

Hehe, indeed! I'll admit the PR code isn't the most readable. However I couldn't find any less compelx way of reading the bytes while accounting for the case where PONG gets cut.

lokori commented 8 years ago

It's a matter of opinion whether this is "less complex", but I think this code is equivalent and uses just one variable:

` byte[] b = new byte[4];

        int copyIndex = 0;
        do {
            copyIndex += inputStream.read(b, copyIndex, b.length - copyIndex);
        } while ((copyIndex < 4) && (copyIndex > 0));` 
drogin commented 8 years ago

I agree that this one is a lot easier to read. However, imagine a scenario where 2 reads are done: First read gets "PO". Copy index is now 2 Then the second read returns -1, because for some reason the server didnt send the rest. Then copyIndex would now be 1, and the loop would continue even though -1(or 0) would be returned.

I have thought quite a bit about this, hehe ;)

lokori commented 8 years ago

True.. Anyway I merged your code, which seems to work properly 👍