showrav017 / jain-sip

Automatically exported from code.google.com/p/jain-sip
0 stars 0 forks source link

Continues Overflow in SSLStateMachine when unwrapping #163

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
NOTE: we have not been able to reproduce and as such we do not know if our 
suggested fix actually would work. For now, we really just want your input and 
let you guys be aware that there is an issue with how the engine is handling 
buffer overflows. So far, it has taken out one node in production but since 
this is on TLS, we do not have any pcaps.

Symptom: 100% CPU and TLS Port unresponsive. Log flooded with "Buffer overflow 
, must prepare the buffer again. Check for continious overflow here?"

A note on the log. In the version of the jain sip library we were running, this 
was logged on INFO but now seems to have changed to DEBUG. We obviously do not 
run with DEBUG log on in production.

The line where this comes from is in SSLStateMachine.unwrap(ByteBuffer src, 
ByteBuffer dst) and it really seems that the log line is a TODO since it 
actually says "Check for continious overflow here?". We suspect that this is 
what actually hit us. I.e., there were continuous overflows and that took the 
SSLStateMachine out of play and the machine too since it ended up in an 
infinite loop (just a theory based on the log message flooding and 100% CPU). A 
suggestion is to e.g. do:

1. Add a max overflow limit:

private static final int OVERFLOW_LIMIT = 10;

2. Have a counter for each instance of the state machine:

private int overflowCount = 0;

3. And then simply count:

if(result.getStatus().equals(Status.BUFFER_OVERFLOW)) { 
if(logger.isLoggingEnabled(LogWriter.TRACE_DEBUG)) { 
logger.logDebug("Buffer overflow , must prepare the buffer again. Check for 
continious overflow here?"); 
} 
if (++this.overflowCount > OVERFLOW_LIMIT) { 
throw new SSLException("SSL unwrap: too much overflow"); 
} 
dst = channel.prepareAppDataBuffer(); 
continue; 
} else { 
this.overflowCount = 0; 
}

or something...

Btw, just a suggestion as well:

result.getStatus().equals(Status.BUFFER_OVERFLOW)

could lead to a NPE if result.getStatus() would return null (bug or whatever)

Status.BUFFER_OVERFLOW.equals(result.getStats()) would never blow up.

Original issue reported on code.google.com by jean.deruelle on 19 Jun 2015 at 2:57

GoogleCodeExporter commented 8 years ago
I think the issue is that the overflow is not handled at all and keep looping 
on a buffer of the same size. So your fix will make sure that the CPU doesn't 
get to 100% as it will exit after a number of tries but will not solve the 
issue of accepting the big packet I think which will lead to lost calls or call 
setup.

I think a better fix would be to increase the buffer size and retry such as the 
code snippet below

if(result.getStatus().equals(Status.BUFFER_OVERFLOW)) {
if(logger.isLoggingEnabled(LogWriter.TRACE_DEBUG)) {
logger.logDebug("Buffer overflow , must prepare the buffer again."
+ " outNetBuffer remaining: " + dst.remaining()
+ " outNetBuffer postion: " + dst.position()
+ " Packet buffer size: " + sslEngine.getSession().getPacketBufferSize()
+ " new buffer size: " + sslEngine.getSession().getPacketBufferSize() + 
dst.position());
}
ByteBuffer newBuf = 
channel.prepareAppDataBuffer(sslEngine.getSession().getPacketBufferSize() + 
dst.position());
dst.flip();
newBuf.put(dst);
dst = newBuf;
if(logger.isLoggingEnabled(LogWriter.TRACE_DEBUG)) {
logger.logDebug(" new outNetBuffer remaining: " + dst.remaining()
+ " new outNetBuffer postion: " + dst.position());
}
continue;
}

Original comment by jean.deruelle on 19 Jun 2015 at 2:58

GoogleCodeExporter commented 8 years ago
https://telestax.atlassian.net/browse/JSIP-27 & 
https://java.net/jira/browse/JSIP-502

Original comment by jean.deruelle on 19 Jun 2015 at 3:00

GoogleCodeExporter commented 8 years ago
This issue was updated by revision 5229e474fec7.

JSIP-502

Fix + Fix for JSIP-483
(cherry picked from commit 181e3ea87ca39cc9a840276e3beade7bc8055c84)

Original comment by jean.der...@telestax.com on 19 Jun 2015 at 3:05

GoogleCodeExporter commented 8 years ago

Original comment by jean.deruelle on 19 Jun 2015 at 3:06