Closed satyamshekhar closed 11 years ago
dang! We should sanitize all user sent attributes before we even start processing them.
:).. totally.. however when I fixed this a lot reports were being sent continuously.. since two response are sent simultaneously (initially) and the client is never able to recover. We should do this with a timeout.
didn't get this bit...
actually there was another bug.. https://github.com/dhruvbird/node-xmpp-bosh/blob/master/src/session.js#L975 -- I changed line from if (node.attrs.ack < this.max_rid_sent && this.unacked_responses[node.attrs.ack]) to if (node.attrs.ack < this.max_rid_sent && this.unacked_responses[node.attrs.ack + 1]), which I know realise is not correct either. response corresponding to rid=node.attrs.ack will always be deleted in handle acks. So this will never evaluate to true. We need to decide when should we send a report and without how much timeout.
I don't see what the problem is??
hey, so I am not able to understand what the problem is! could you please elaborate for the benefit of the challenged!!
On Thu, Feb 9, 2012 at 10:51 AM, satyamshekhar reply@reply.github.com wrote:
typo: to if (node.attrs.ack < this.max_rid_sent && this.unacked_responses[node.attrs.ack + 1])
Reply to this email directly or view it on GitHub: https://github.com/dhruvbird/node-xmpp-bosh/issues/2#issuecomment-3889678
-Dhruv Matani. http://dhruvbird.com/
"What's the simplest thing that could possibly work?" -- Ward Cunningham
I am really sorry for not being clear. I have this tendency.
Check out the change I made here https://github.com/dhruvbird/node-xmpp-bosh/commit/9f0b359c4349d301a15e67d5fefc58a371b120e2#L0L977 (though this is a mistake). Lets consider the code as it was before this change. We have three issues -
This line is wrong since we are checking for this.unacked_responses[node.attrs.ack]. This will always be false since while handling acks we delete from unacked responses the response for which we have received an ack (all response for rid <= node.attrs.ack).
-- Also, we should see if there is a certain amount of time has pass before we enqueue a report. For example -
Probably its an extreme edge case - and also the cost is only one extra response being sent from the server in case it has nothing to send.
Let's try to reason this out.
Does that make sense?
On Thu, Feb 9, 2012 at 3:16 PM, satyamshekhar reply@reply.github.com wrote:
I am really sorry for not being clear. I have this tendency.
Check out the change I made here https://github.com/dhruvbird/node-xmpp-bosh/commit/9f0b359c4349d301a15e67d5fefc58a371b120e2#L0L977 (though this is a mistake). Lets consider the code as it was before this change. We have three issues -
- First is sanitation of all the input attrs of a request, since we never make node.attrs.ack an integer. (wondering how/why does acking work correctly)
- bug in the code: https://github.com/dhruvbird/node-xmpp-bosh/commit/9f0b359c4349d301a15e67d5fefc58a371b120e2#L0L973
This line is wrong since we are checking for this.unacked_responses[node.attrs.ack]. This will always be false since while handling acks we delete from unacked responses the response for which we have received an ack (all response for rid <= node.attrs.ack).
- Even if we correct the above two issues, it still wont work properly for clients that dont have acks, since for such clients we assume the ack to rid - window. Due to this assumption we will end up sending a report everytime (since ack < max_rid_sent).
-- Also, we should see if there is a certain amount of time has pass before we enqueue a report. For example - 1. server is holding r1. 2. server sends something on r1. 3. before receiving r1 the client sends something on r2. 4. server sends a report for r1 (r1 wont be acked).
Probably its an extreme edge case - and also the cost is only one extra response being sent from the server in case it has nothing to send.
Reply to this email directly or view it on GitHub: https://github.com/dhruvbird/node-xmpp-bosh/issues/2#issuecomment-3894767
-Dhruv Matani. http://dhruvbird.com/
"What's the simplest thing that could possibly work?" -- Ward Cunningham
Right, so what I had done was a mistake - will correct.
That means for a client with no acks we will end up sending a lot of reports, since we assume that for such a client ack = rid - window. Maybe we should disable acks for such clients.
Its just that at times there will an extra RTT incurred since we sent a report when it was not required. Shouldn't we wait for some time out (~5 secs: > RTT/2) before sending a report?
hmm... yes, for non-ack clients, we'll land up sending a report every time.
It makes sense to just have a flag which check if acks are enabled for this client and not do anything if acks aren't enabled (session.ack should reflect this).
As far as sending a report is concerned, we can do it immediately. To save traffic, we can wait for 1 rid instead of a difference of 0. i.e. If the client has ACKed 1 less than the current, we can live with it. Introducing a delayed event would probably complicate things since a lot of state would need to be updated if the ACK does come in within that window. (trying to keep it simple) - what do you think?
I was thinking about it differently - Cant we send a report only if the time we sent the prev response is greater than a threshold (~5secs)?
Dunno if that's a good idea since potentially many packets can be exchanged within that time frame. Plus, the point of a report is to let the client know sooner than later about a potentially missed response... Basically, I'm trying to stay away from anything that involves "guessing" a good constant/time value since that is where things tend to go - esp. when there is a decent deterministic solution.
I know what you mean.. Cool..
Also, the timestamp we send in the report is the timestamp for the acked response. Shouldn't it be timestamp for the least rid unacked response?
"In this case it SHOULD include a 'report' attribute set to one greater than the 'ack' attribute it received from the client, and a 'time' attribute set to the number of milliseconds since it sent the response associated with the 'report' attribute."
I am guessing that "the response associated with the 'report' attribute" is the same as ack from client + 1.
Ya.. so I think we were sending timestamp for node.attrs.ack - the change currently in the master is the change I made thinking we should check for node.attrs.ack + 1 instead of node.attrs.ack -- which you corrected.
The situation is like this: i) We check if we have node.attrs.ack in our unacked_responses - so that we dont keep sending the same report again and again. ii) We check if we have node.attrs.ack + 1 in our unacked_responses - so that we can to send time stamp for that.
So, now this brings us to the case where we dont have node.attrs.ack + 1 in our unacked responses(the request got lost) but max_rid_sent > node.attrs.ack. How do we handle this?
Something doesn't seem right here.
Suppose we just responded on rid=11. The previous rid=10. Before our response (rid=11) reached the client, it ACKed rid=10 on (request) rid=12.
The check node.attrs.rid(10) < max_rid_sent(11) will succeed and a report will be sent out. This is because we are processing acknowledgments after the reports are sent out. Doesn't this seem wasteful?
Also, do you think it's right to process ACKs ''after'' handling the 'report sending?
After giving this some more thoughts, these are my observations:
Do you think this makes more sense?
Consider the following situation:
If we have a check for rid < max_rid_sent - 1
I think the check (acked) rid < max_rid_sent was correct. The only problem with this is that in an ongoing chat, we might end up enqueing a lot of reports - and thus degrading our performance -- since a lot requests will be exchanged,
The check unacked_responses[rid + 1] or unacked_response[rid] seemed similar to me initially but they are not. Ideally, we should have these methods(enqueue_report and handle_acks) independent of each other -- their order should not matter. We achieve this by using node.attrs.ack + 1. Another, consequence of using rid + 1 is that we might send multiple reports for the same rid, if rid + 1 is not acked. However, if we use the check unacked_responses[rid] it forces us to send only one report.
The ideal solution to the problem would be to immediately send a report if we receive a request, which doesn't ack a response that we had sent after even after RTT secs (its median) -- this can be many times. Otherwise, we dont enqueue any report. We need the check for unacked_responses for misbehaving clients(in case it has already acked it before).
max_rid_sent is 11, ack is 10. -- we should send a report for this.
According to the spec, we should. Do we have any numbers on how often this happens for well-behaved clients with a good network? (false negatives?)
I agree with you about the fact that their order should not matter. However, this is true if we use the node.attrs.ack+1 logic instead of keying on node.attrs.ack since as you mentioned if we use the latter, and we delete ACKed responses before handling 'report', we shall never find the unacked response.
I guess it is "okay" to send out multiple reports (barring performance). I mean it's not a correctness issue. In fact, it is "correct" to send multiple reports. The reason being that the first report might be lost.
According to the spec, we should. Do we have any numbers on how often this happens for well-behaved clients with a good network? (false negatives?)
No numbers yet, but are reports meant for good networks. Shouldn't they optimize bad networks without hindering performance for good networks?
I am still biased towards the timeout solution :). Though, I also get why you think its not at all clean.
Maybe we need to create another "report queue" to queue up all report-type messages. Reports can be clubbed with normal responses. Send a report along with a normal response if one is to be sent?
We en-queue reports with normal responses right now as well. Why do you think a separate queue is required?
We can (potentially) club:
Currently, we don't distinguish between [2] & [3].
@satyamshekhar This is almost fossilized. Is there anything actionable that came out of this? Closing for now. Re-open if needed.
I found a bug yesterday, https://github.com/dhruvbird/node-xmpp-bosh/blob/master/src/session.js#L975 always evaluates to false. No, reports are ever sent -- since node.attrs.ack is not an integer.