WICG / serial

Serial ports API for the platform.
https://wicg.github.io/serial/

Buffer issue #164

Open james-portman opened 2 years ago

james-portman commented 2 years ago

Hi,

I have seen similar issues reported, but I am having strange problems with a USB CDC device.

If I do not set bufferSize when opening the port, it works for a while but then fails very consistently once a certain amount of data has been transferred back and forth: data simply stops being received. This seems to be related to the total amount of data transferred, as it consistently happens at the same point of a "conversation" between the two devices.

I have tried setting bufferSize to 1,000 and to 10,000, and both seem to resolve the issue.

The thing that doesn't make sense is that only very small amounts of data are pushed to or from the device at a time, 64 bytes at most. After sending data I wait for the reply, so nothing should be building up and filling even a small buffer.
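For readers following along, the workaround described above could look roughly like the sketch below. `openWithBuffer` and `PortLike` are hypothetical names introduced here, and the 115200 baud rate is an assumption because the thread does not state one. Only `open()` with the `baudRate` and `bufferSize` options is taken from the Web Serial API.

```typescript
// Minimal structural type covering only what this sketch uses; the real
// SerialPort object in the browser has more members.
interface PortLike {
  open(options: { baudRate: number; bufferSize?: number }): Promise<void>;
}

// Hypothetical helper: opens the port with an explicit buffer size instead of
// relying on the default. The baud rate default is an assumed value.
async function openWithBuffer(
  port: PortLike,
  bufferSize: number = 10000,
  baudRate: number = 115200,
): Promise<void> {
  await port.open({ baudRate, bufferSize });
}

// In a page this would be called after the user picks a port, for example:
//   const port = await navigator.serial.requestPort();
//   await openWithBuffer(port, 10000);
```

Passing the port in as a parameter keeps the helper independent of `navigator.serial`, so it can be exercised with a stand-in object outside the browser.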

Is there any way to debug this further?

The same "conversation" works perfectly in Python using pyserial. With Web Serial in Chrome, it seems to work fine with any bufferSize of 1,000 or more.

Very confused. Thanks

reillyeon commented 2 years ago

The default buffer size (255 bytes) is small enough to cause problems. If the device sends too much data and the page doesn't read it quickly enough, the buffer fills with data that has been read from the operating system but not yet read by the page. The browser then stops reading from the operating system, and the USB CDC driver throws away the data it can't deliver. It is the size of this buffer, between the OS and the page, that the bufferSize option controls.

The first useful data point to collect is which operating system and USB CDC driver are being used. I've been able to reliably reproduce this kind of issue only on macOS, where I think a driver quirk that makes dropping data particularly easy combines with a process scheduling issue that causes the browser to stop trying to read for long enough for that to happen. I believe the reason libraries like pyserial don't have this problem is that they read data from the operating system synchronously, while the Web Serial API has to integrate with the rest of the browser's asynchronous I/O framework, and that introduces a lot of opportunities for jank.

The other useful data point is to look at the data that is received and compare it to the data sent by the device to see if a pattern emerges. You can use a tool like Wireshark to collect traces at the USB level to see what data got from the device to the USB driver but failed to get to your application.
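As a starting point for that comparison, the sketch below finds the first byte offset where the data the page received differs from what the device is known to have sent (for example, from a USB trace). `firstDivergence` is a hypothetical helper, not part of any API, and it assumes both byte sequences have already been captured into arrays.

```typescript
// Hypothetical helper: returns the index of the first byte where `actual`
// differs from `expected`, or -1 if the two sequences are identical.
// If one sequence is a strict prefix of the other, the index returned is the
// length of the shorter one, which is where data went missing or was added.
function firstDivergence(expected: Uint8Array, actual: Uint8Array): number {
  const shared = Math.min(expected.length, actual.length);
  for (let i = 0; i < shared; i++) {
    if (expected[i] !== actual[i]) {
      return i;
    }
  }
  return expected.length === actual.length ? -1 : shared;
}
```

If the offset it reports is the same on every run, that points toward a byte-count threshold rather than a timing problem.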

It is interesting that this issue appears to manifest after a period of time rather than manifesting when a particularly large amount of data is being received. That behavior reminds me of an issue we used to have on Windows where a race condition in the code that waited for notification of new data being available to read caused us not to notice the state transition and so the connection could lock up completely.

If it is possible to build a minimized reproduction case (for example, an Arduino sketch which responds in a similar way to the device you are communicating with) that would be incredibly helpful.

james-portman commented 2 years ago

Hi Reilly,

I always forget Wireshark can do USB. Sure, that's a good idea.

It is strange because this is definitely not filling even the default buffer at any point. There is something like 64 bytes at the absolute maximum going in either direction at a time, and all comms are done as a conversation, so either end sends data and then waits for the reply or the next command.

I think that rather than taking a certain amount of time, it happens after a certain number of bytes have been transferred: roughly 1 KB in total had gone back and forth.

This is on Linux, so it may have similar issues to macOS in terms of the driver.

I have been held up by this so I am behind on my work - I need to catch up with that. I will try to come back with more solid examples and a very specific, repeatable issue.
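One way to pin down the exact byte count at which the stall occurs is to keep a running tally of everything written and read. The sketch below is purely illustrative: `TransferTally` is a hypothetical class, and it assumes the page calls `recordSent` and `recordReceived` around its own writes and reads.

```typescript
// Hypothetical bookkeeping class: counts the bytes moved in each direction so
// the total at the moment of a stall can be logged and compared across runs.
class TransferTally {
  private sent = 0;
  private received = 0;

  recordSent(chunk: Uint8Array): void {
    this.sent += chunk.length;
  }

  recordReceived(chunk: Uint8Array): void {
    this.received += chunk.length;
  }

  get total(): number {
    return this.sent + this.received;
  }

  summary(): string {
    return `sent=${this.sent} received=${this.received} total=${this.total}`;
  }
}
```

Logging `summary()` when a reply fails to arrive shows whether the failure always lands on the same total.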

If nothing else, I think it would be good if there were some way of checking the state of the buffer, or if it could raise an alarm when full.
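Nothing in this thread describes a way to inspect the buffer from the page, so as a stand-in, a page could at least notice when an expected reply never arrives. The sketch below races a read against a timer. `readWithTimeout` and `ReaderLike` are hypothetical names, and the timeout value is up to the caller. This only detects the stall; it does not explain or fix it.

```typescript
// Minimal structural type for the part of a stream reader this sketch uses.
interface ReaderLike {
  read(): Promise<{ value?: Uint8Array; done: boolean }>;
}

// Hypothetical helper: resolves with the next chunk, or rejects if nothing
// arrives within `timeoutMs`. On timeout the underlying read() is still
// pending; a real page would likely need to cancel the reader before retrying.
async function readWithTimeout(
  reader: ReaderLike,
  timeoutMs: number,
): Promise<Uint8Array> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`no data within ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  try {
    const { value, done } = await Promise.race([reader.read(), timeout]);
    if (done || value === undefined) {
      throw new Error("stream closed before a reply arrived");
    }
    return value;
  } finally {
    // Always stop the timer so it cannot fire after a successful read.
    clearTimeout(timer);
  }
}
```

In a request/reply protocol like the one described here, a rejection from this helper marks the exact exchange at which data stopped being delivered.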

reillyeon commented 2 years ago

From your description it almost seems like the buffer isn't actually filling up. Instead, something like the scenario I mentioned that used to happen on Windows is happening, and we stop noticing that there is more data available to read. Linux and macOS share (mostly) the same userspace code for reading from a serial port, but the drivers are of course completely different.

I'm sorry this has caused so much trouble. If you can describe the steps to reproduce this then hopefully the issue will turn out to be obvious when I get a debugger attached to the browser. If this is a device we can purchase somewhere (for a reasonable price) and you can share the code then we might not need a totally minimized test case.