Open: BrianPeek opened this issue 9 years ago
Hello Achal, Brian, I just found the same issue using Desktop. I have observed that the 'invalid' key fits into the list of keys and would be expected near the end. So I'm guessing a timing problem: the report arrives before the response item is added to the collection. Is that possible? I can also confirm that the Responses dictionary is huge and growing. It could be a mistake of mine, of course.
As a result of this exception, replies from the brick are no longer processed, since the exception breaks out of the while-loop in UsbCommunication.ConnectAsync(). The brick still executes commands, but Brick.PollSensorsAsync() always returns with an error because of the timeout in WaitForResponseAsync. A disconnect/reconnect is needed to recover.
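For illustration, here is a minimal sketch of a lookup that does not throw when a report carries an unknown sequence number, so the read loop could log the report and keep going. PendingReply and TolerantDispatch are hypothetical names, not types from the library, and the sketch does not by itself make the dictionary safe for concurrent access.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's response type.
public sealed class PendingReply
{
    public byte[] Data;
}

public static class TolerantDispatch
{
    // Returns true if the report matched a pending reply, false if the
    // sequence number was unknown. Uses TryGetValue instead of the indexer,
    // so a missing key never raises KeyNotFoundException.
    public static bool TryHandle(
        IDictionary<ushort, PendingReply> pending, ushort sequence, byte[] report)
    {
        PendingReply reply;
        if (!pending.TryGetValue(sequence, out reply))
        {
            return false; // late or unexpected report: caller can log and keep reading
        }
        reply.Data = report;
        return true;
    }
}
```

With this shape, a report that arrives before its entry is added would be dropped instead of ending the loop, so it treats the symptom and not the underlying timing problem.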
This issue was imported from CodePlex
achalshah wrote 2014-09-01 at 21:41 I've been getting exceptions of type System.Collections.Generic.KeyNotFoundException from ResponseManager.HandleResponse. From my reading of the code, this means that a response with a specific sequence number was not found in the Responses collection. Have you seen anything like this - any idea how this could happen?
Also, while I was trying to understand this code, one example of the path looks like this:
The questions:
Is the behavior in 4 above normal in the error case, i.e. the response is not removed?
Is it possible for there to be a race condition between WaitForResponseAsync timing out on WaitOne and the response being handled?
I still can't explain the exception that I'm seeing, but is it possible that there is a race between the polling task and some other command (like one to apply power to the motors), such that both threads try to call ResponseManager.CreateResponse at the same time? Is incrementing the sequence number atomic or exclusive?
Thanks,
-Achal
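On the question of whether the increment is atomic: a plain `++` on a shared field is not. A minimal sketch of an atomic counter follows, assuming a 16-bit sequence number. SequenceCounter is a hypothetical name and is not taken from the library.

```csharp
using System.Threading;

// Hypothetical helper, not part of the library.
public sealed class SequenceCounter
{
    private int _last; // last value handed out

    // Safe to call from several threads at once. Interlocked.Increment
    // performs the read-modify-write as one atomic operation; the cast
    // keeps the low 16 bits, so the value wraps around after 65535.
    public ushort Next()
    {
        int value = Interlocked.Increment(ref _last);
        return unchecked((ushort)value);
    }
}
```

Two tasks calling Next() concurrently always receive different values, at least until the counter wraps.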
peekb wrote 2014-09-10 at 13:15 Is this happening on a specific platform? (Phone, Desktop, WinRT?)
achalshah wrote 2014-09-11 at 00:39 This is on Phone. Let me know if you want the stacks.
achalshah wrote 2014-09-13 at 00:46 Here is a possible scenario resulting in non-sequential write access to the Responses collection (in ResponseManager):
PollSensorsAsync() in Brick.cs sends commands and waits on the response being handled (eventually by calling WaitForResponseAsync on the ResponseManager). WaitForResponseAsync waits on the response event.
PollInput in BTCommunication runs in a separate thread and calls HandleResponse in ResponseManager when new data comes in.
HandleResponse gets the sequence number, looks up the matching response in the Responses collection, populates it and sets its event, which WaitForResponseAsync is waiting for. Once the event is signalled, WaitForResponseAsync removes the response from the Responses collection.
Even though these 2 threads are pretty much synchronized by virtue of the fact that each command has a response which is waited on, a 3rd thread could send a command (e.g. from the UI), and now we could have two tasks waiting for responses.
The two responses could come back to back, and HandleResponse could set both events before the two tasks waiting on the events are able to run. Eventually both run and try to remove items from the collection simultaneously, which could corrupt it since access is not synchronized.
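One way to rule out this scenario is to guard every access to the collection with a single lock. The sketch below assumes a 16-bit sequence number; ResponseRegistry and PendingResponse are hypothetical names and do not reflect how the library's ResponseManager is actually written.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the library's response type.
public sealed class PendingResponse
{
    public readonly ManualResetEvent Done = new ManualResetEvent(false);
    public byte[] Data;
}

// Hypothetical registry: all reads and writes of the dictionary hold _gate.
public sealed class ResponseRegistry
{
    private readonly object _gate = new object();
    private readonly Dictionary<ushort, PendingResponse> _pending =
        new Dictionary<ushort, PendingResponse>();

    // Called by the sender before the command is written to the brick,
    // so a fast reply cannot arrive ahead of its entry.
    public PendingResponse Register(ushort sequence)
    {
        var response = new PendingResponse();
        lock (_gate) { _pending[sequence] = response; }
        return response;
    }

    // Called by the reader thread. Looks up and removes the entry in one
    // critical section, then signals the waiter outside the lock.
    public bool Complete(ushort sequence, byte[] report)
    {
        PendingResponse response;
        lock (_gate)
        {
            if (!_pending.TryGetValue(sequence, out response)) return false;
            _pending.Remove(sequence);
        }
        response.Data = report;
        response.Done.Set();
        return true;
    }

    // Called by a waiter that timed out, so entries do not pile up.
    public void Abandon(ushort sequence)
    {
        lock (_gate) { _pending.Remove(sequence); }
    }

    public int Count
    {
        get { lock (_gate) { return _pending.Count; } }
    }
}
```

Because the reader removes the entry itself, the waiting tasks never touch the dictionary after being signalled, and a timed-out waiter cleans up through Abandon.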