Verify Baud Rate - Githubissues

rsaxvc commented 3 years ago

http://oscopetutorial.com/cricut/index.php?title=Explored says that the baud rate isn't 200k, it's 198.347k - if so, fixing this may address some of our comm link problems. Currently we do a 170ms delay on startup, 1ms delay per byte, and 170ms delay for shutdown. But if this wiki is right, that may not be needed. Need to get inside a unit and measure the TX coming out of the Cricut CPU towards the FTDI RX.

citelao commented 3 years ago

I've been using 198347 in my own (currently private) TypeScript/Node.JS driver, and was having even more reliability issues (I could only send about one move command every 40ms).

I was writing this up and I decided to actually read your writeup—1ms delay per byte. On a whim I added that delay to my driver and—wow. All the errors went away. I've been able to run 100 move commands in a row, with ~600 units traversal in both X and Y, with no sleeps between commands (just waiting for the 5 byte response from the cutter)—with no errors.

So 198347 may very well be the secret sauce :)

rsaxvc commented 3 years ago

Oh wow! I'm currently trying to pick up another 1st gen unit off Craigslist so I can take a look at the timings.

rsaxvc commented 3 years ago

200000/198347 is only a .8% difference, which usually seems good enough, but how important that is is up to the microcontroller on the device. And depends on the ATMega128 clock source - if the microcontroller isn't doing 200k exactly, the mismatch might actually be higher.

citelao commented 3 years ago

I spent yesterday messing with this—I'm not versed enough in the nitty-gritty of serial port communication to fully grok what's happening, but it indeed looks like the microcontroller is getting overwhelmed when I send it too much.

Specifically, I noticed some things that may be helpful:

Most notably, the 1ms sleep doesn't seem to be enough for real-world work.
- (On both Mac and Windows), I can send 1000 "version" instructions, while simply waiting 1ms between bytes and waiting for a response from the machine before sending another message.
- I can also send 2000 "move" commands, back and forth, with similar success.
- But (at least on my Mac, which has faster timings; see below), when I try to interpret some GCode (with a simple program I wrote), the cutter reliably chokes when I do my first or second "big" moves (3 inches or so, rather than millimeters at a time). Interestingly, it seems to fail after the second cut after moving—so it moves, makes its first cut, and then chokes when I send the second command.
- I cannot distill this down into a reliable repro, not for lack of trying :D
Running on my (Node) code on Windows with a 1ms sleep between bytes ran significantly slower than on Mac; I think the kernel is just slower to relinquish control on Windows. I don't have a measurement, but the difference was night-and-day: for cutting 1000s of lines of tiny GCode moves, between usable on my Mac to agonizing on my Windows.
Sending 2 bytes at a time works OK on Windows (haven't tried on Mac).
Sending 4 bytes at a time starts to choke the machine—I'll send a bunch and eventually the device will stop responding. I haven't narrowed it down.
- I verified that my Windows computer at least thinks it's sending data by attaching Wireshark to the port. It gets sent, but the cutter just stops responding eventually.
I have no real way of verifying that my device is actually attempting a 198,347 Baud.
I have to wait considerably longer after running the start command—I'm holding at 1s now but haven't super-tested.
- I now understand those commands, BTW: if the user presses "STOP" on the cutter while a transaction has started (e.g. we sent a start command), the machine will stop performing any sent commands it receives (but it will still respond!) until the machine gets the stop command.
When the machine chokes, I can typically unfreeze it by sending ~15 0x00 bytes, then a few stop commands. Sometimes it unfreezes after a few stop commands only; sometimes the null bytes are helpful; sometimes pressing the "STOP" button on the device helps.
- The cutter often starts responding at the same time that it spits out a 5-byte response, which looks similar to the same response it gives to move commands—suggesting to me that the machine is dropping a byte somewhere, and is waiting for a "complete" command to be sent.

rsaxvc commented 3 years ago

I think the FTDI chip uses a 3MHz clock with some(maybe down to 1/4) fractional dividers? I picked up a unit this weekend and plan to measure at some point. If we knew the crystal / clock source on the other end we could work out possible dividers on the microcontroller too.

On Sun, Apr 4, 2021, 6:07 PM Ben Stolovitz @.***> wrote:

I spent yesterday messing with this—I'm not versed enough in the nitty-gritty of serial port communication to fully grok what's happening, but it indeed looks like the microcontroller is getting overwhelmed when I send it too much.

Specifically, I noticed some things that may be helpful:

Most notably, the 1ms sleep doesn't seem to be enough for real-world work.

(On both Mac and Windows), I can send 1000 "version" instructions, while simply waiting 1ms between bytes and waiting for a response from the machine before sending another message.

I can also send 2000 "move" commands, back and forth, with similar success.

But (at least on my Mac, which has faster timings; see below), when I try to interpret some GCode (with a simple program I wrote), the cutter reliably chokes when I do my first or second "big" moves (3 inches or so, rather than millimeters at a time). Interestingly, it seems to fail after the second cut after moving—so it moves, makes its first cut, and then chokes when I send the second command.

I cannot distill this down into a reliable repro, not for lack of trying :D

Running on my (Node) code on Windows with a 1ms sleep between bytes ran significantly slower than on Mac; I think the kernel is just slower to relinquish control on Windows. I don't have a measurement, but the difference was night-and-day: for cutting 1000s of lines of tiny GCode moves, between usable on my Mac to agonizing on my Windows.

Sending 2 bytes at a time works OK on Windows (haven't tried on Mac).

Sending 4 bytes at a time starts to choke the machine—I'll send a bunch and eventually the device will stop responding. I haven't narrowed it down.

I verified that my Windows computer at least thinks it's sending data by attaching Wireshark to the port. It gets sent, but the cutter just stops responding eventually.

I have no real way of verifying that my device is actually attempting a 198,347 Baud.

I have to wait considerably longer after running the start command—I'm holding at 1s now but haven't super-tested.

I now understand those commands, BTW: if the user presses "STOP" on the cutter while a transaction has started (e.g. we sent a start command), the machine will stop performing any sent commands it receives (but it will still respond!) until the machine gets the stop command.

When the machine chokes, I can typically unfreeze it by sending ~15 0x00 bytes, then a few stop commands. Sometimes it unfreezes after a few stop commands only; sometimes the null bytes are helpful; sometimes pressing the "STOP" button on the device helps.

The cutter often starts responding at the same time that it spits out a 5-byte response, which looks similar to the same response it gives to move commands—suggesting to me that the machine is dropping a byte somewhere, and is waiting for a "complete" command to be sent.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vangdfang/libcutter/issues/20#issuecomment-813113230, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACVFYCNCONRDXO6GWLEEALTHDWL7ANCNFSM42DU52NA .

rsaxvc commented 3 years ago

I also remember that windows had a goofy, goofy way to get nonstandard baud rates. Something vendor specific like d2xx? On Linux there's a cleaner way, some bit field perhaps that says, "please use the following number instead of an enum of older, standard rates". Not sure about OSX.

On Sun, Apr 4, 2021, 6:07 PM Ben Stolovitz @.***> wrote:

I spent yesterday messing with this—I'm not versed enough in the nitty-gritty of serial port communication to fully grok what's happening, but it indeed looks like the microcontroller is getting overwhelmed when I send it too much.

Specifically, I noticed some things that may be helpful:

Most notably, the 1ms sleep doesn't seem to be enough for real-world work.

(On both Mac and Windows), I can send 1000 "version" instructions, while simply waiting 1ms between bytes and waiting for a response from the machine before sending another message.

I can also send 2000 "move" commands, back and forth, with similar success.

But (at least on my Mac, which has faster timings; see below), when I try to interpret some GCode (with a simple program I wrote), the cutter reliably chokes when I do my first or second "big" moves (3 inches or so, rather than millimeters at a time). Interestingly, it seems to fail after the second cut after moving—so it moves, makes its first cut, and then chokes when I send the second command.

I cannot distill this down into a reliable repro, not for lack of trying :D

Running on my (Node) code on Windows with a 1ms sleep between bytes ran significantly slower than on Mac; I think the kernel is just slower to relinquish control on Windows. I don't have a measurement, but the difference was night-and-day: for cutting 1000s of lines of tiny GCode moves, between usable on my Mac to agonizing on my Windows.

Sending 2 bytes at a time works OK on Windows (haven't tried on Mac).

Sending 4 bytes at a time starts to choke the machine—I'll send a bunch and eventually the device will stop responding. I haven't narrowed it down.

I verified that my Windows computer at least thinks it's sending data by attaching Wireshark to the port. It gets sent, but the cutter just stops responding eventually.

I have no real way of verifying that my device is actually attempting a 198,347 Baud.

I have to wait considerably longer after running the start command—I'm holding at 1s now but haven't super-tested.

I now understand those commands, BTW: if the user presses "STOP" on the cutter while a transaction has started (e.g. we sent a start command), the machine will stop performing any sent commands it receives (but it will still respond!) until the machine gets the stop command.

When the machine chokes, I can typically unfreeze it by sending ~15 0x00 bytes, then a few stop commands. Sometimes it unfreezes after a few stop commands only; sometimes the null bytes are helpful; sometimes pressing the "STOP" button on the device helps.

The cutter often starts responding at the same time that it spits out a 5-byte response, which looks similar to the same response it gives to move commands—suggesting to me that the machine is dropping a byte somewhere, and is waiting for a "complete" command to be sent.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/vangdfang/libcutter/issues/20#issuecomment-813113230, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACVFYCNCONRDXO6GWLEEALTHDWL7ANCNFSM42DU52NA .

rsaxvc commented 3 years ago

We may also need to handle device disconnection better. I bet if a program dies halfway through sending a request, the next request will probably not be understood. We might need to send a few(2-3) version requests until we hear something back.

rsaxvc commented 3 years ago

I haven't played with it in a while(XP?), but IIRC, Windows used to delay up to the nearest OS scheduler tick. Ex: Sleep(1) became Sleep(1 to 15).

I think you can disable that by cooking off some battery life with https://docs.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod

rsaxvc commented 3 years ago

Crud, looks like mingw's libc doesn't have those functions. Not sure if we could extern declare them or if linking the EXE would fail still.

citelao commented 3 years ago

I don't have any experience with mingw. I wonder how hard it would be to switch to standard WinRT, though—I think libcutters UNIX dependencies are mostly just fopen?

rsaxvc commented 3 years ago

That might do it.

The only hokey thing is that libcutter needs a variable baud rate serial port. Currently the windows build uses Win32 for that.

I wonder if WinRT supports the OS timer rate adjustment APIs above?

citelao commented 3 years ago

I thought I responded to this, but:

I'm pretty sure you can interop Win32 and WinRT code. The days of UWP-only are over.

Unfortunately all of the overviews I can find are very VS-focused, but I'm pretty sure you can just add some headers and go. I think you may need to link a lib?

https://docs.microsoft.com/en-us/windows/uwp/cpp-and-winrt-apis/get-started#modify-a-windows-desktop-application-project-to-add-cwinrt-support https://kennykerr.ca/2019/01/25/getting-started-with-xlang-and-cppwinrt/

Rust manages to make it one-shot: https://blogs.windows.com/windowsdeveloper/2020/04/30/rust-winrt-public-preview/

citelao commented 3 years ago

Yeah, we probably need to link to winmm.lib and/or consume winmm.dll.

rsaxvc commented 3 years ago

Kate's expression running 2.00 doesn't seem to work with either 200000 or 198347 - at most each returns a few null bytes, so I'm going to guess that older cricuts don't have the needed commands available.

rsaxvc commented 3 years ago

Ok, from http://www.built-to-spec.com/blog/wp-content/uploads/2010/02/Bottom-off.jpg Cricut Expression appears to have a 16MHz crystal running the ATMega128, which supports two possible async modes controlled by U2X bit:

Async normal mode, U2X = 0, where baud = Fosc / 16 / (UBRR+1)
Async double speed mode, U2X = 1, where baud = Fosc / 8 / (UBRR+1)

Assuming double-speed mode, I think these are the nearby options

UBRR	BAUD
8	222222
9	200000
10	181818
11	166667

Not sure what this means. I believe it means the device could not speak 198347 baud, though the FTDI transceiver could. If the FTDI chip is configured for 200kbaud operation and the device is speaking 181818 baud, it would be a 10% error, which would be quite significant, but maybe it could work with inter-byte delays, but just barely.

vangdfang / libcutter

Verify Baud Rate #20