gnea / grbl

An open source, embedded, high performance g-code-parser and CNC milling controller written in optimized C that will run on a straight Arduino
https://github.com/gnea/grbl/wiki

Recovering from EMI #1238

Closed sandmanRO closed 9 months ago

sandmanRO commented 9 months ago

I'm working on a setup that works OK in every respect, except that occasionally, most likely due to EMI, the grbl 1.1-based controller stops sending any feedback. It still executes commands but sends nothing back: a status query ('?') returns nothing, although a jog ($J), a home ($H), or any other command for that matter is still executed just fine. The only way out of this is to reopen the port. The problem with reopening the serial port is that it automatically resets the controller, which means you lose position and have to start all over again with homing and so on. I know, I know, better shielding for the USB cable and so on, but still, is there anything that could be done to recover from this other than resetting the controller? I looked over grbl's serial code but could not find anything that could lead to this behavior, where grbl's TX apparently goes dead. Any input on this would be greatly appreciated.

lalo-uy commented 9 months ago

It looks to me like the RX side of the serial-to-USB converter got blocked. Can you change it?

sandmanRO commented 9 months ago

Hello @lalo-uy, thank you for the input. I'm using a Woodpecker controller, so it would probably be easier to change the whole controller than the serial-to-USB chipset embedded on it :-). The issue is that there is no way of knowing for sure whether the TX part of that chipset got locked somehow (when these chipsets lock up, they usually lock on both TX and RX simultaneously, not only on TX) or whether the issue resides on grbl's TX side. I was hoping there is something in the grbl code I could try first.

lalo-uy commented 9 months ago

You could look at the TX pin of the Atmel CPU to see whether it still transmits when locked.

sandmanRO commented 9 months ago

True, sort of... An inactive Atmel TX pin could mean either that the grbl serial code is stuck or that, for some reason, the data register empty interrupt is not firing. Either way, knowing that precisely would not help me much, as it would not fix the issue.

sandmanRO commented 9 months ago

All right. I managed to get this going by modifying serial_write() and ISR(SERIAL_UDRE) in serial.c and adding a custom real-time command that "resets" grbl's serial TX. This was kind of tedious, as you have to make changes in at least four files: config.h, system.c, protocol.c and serial.c. It's not perfect, since I lose some characters in the process (usually the first message that comes in after recovery is truncated or corrupted, e.g. a status message would look like ": :|FS:0,0> etc.), but I can live with that, as the very next message and all the ones after it are OK.

There appears to be a weird deadlock condition in grbl's TX code. If for whatever reason (EMI or otherwise) the tail remains exactly one TX buffer behind the head, you enter a deadlock. ISR(SERIAL_UDRE) has turned off the data register empty interrupt, so the tail is no longer advancing, and at the same time serial_write() is waiting indefinitely for the tail to advance before it updates the TX buffer, increments the head and re-enables the data register empty interrupt, which is what would allow the tail to advance again (see the while condition in serial_write()). With the standard grbl 1.1 code, the only way out of this deadlock is a reset. As far as I am concerned the issue is closed.
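
For anyone who wants to reason about the deadlock without hardware, here is a minimal host-side sketch in plain C. It is not grbl's code and not the patch described above: `tx_model_t`, `model_isr()`, `model_write()`, `model_tx_recover()` and the buffer size of 8 are all hypothetical names and values chosen for illustration. It only models the relationship described in the thread: the writer waits while the buffer is full, and the tail only advances while the data register empty interrupt is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TX_RING_BUFFER 8 /* hypothetical size, small on purpose */

/* Host-side model of a TX ring buffer plus the interrupt-enable bit. */
typedef struct {
    uint8_t buf[TX_RING_BUFFER];
    uint8_t head;  /* next slot the writer fills */
    uint8_t tail;  /* next slot the ISR sends */
    bool    udrie; /* models the "data register empty" interrupt enable */
} tx_model_t;

static uint8_t next_index(uint8_t i) {
    assert(i < TX_RING_BUFFER);
    i++;
    return (i == TX_RING_BUFFER) ? 0 : i;
}

/* Models one firing of the data-register-empty ISR.
   Returns true if a byte was "sent", false if the interrupt is masked. */
bool model_isr(tx_model_t *m) {
    if (!m->udrie) return false;              /* masked: tail cannot move */
    m->tail = next_index(m->tail);            /* byte at old tail goes out */
    if (m->tail == m->head) m->udrie = false; /* drained: stop streaming */
    return true;
}

/* Models the writer. A real firmware loop would spin forever while the
   buffer is full; max_spins bounds the wait so the model can report the
   deadlock instead of hanging. Each spin gives the ISR a chance to run. */
bool model_write(tx_model_t *m, uint8_t data, unsigned max_spins) {
    uint8_t next_head = next_index(m->head);
    while (next_head == m->tail) {            /* buffer full: wait for ISR */
        if (max_spins-- == 0) return false;   /* would wait indefinitely */
        model_isr(m);
    }
    m->buf[m->head] = data;
    m->head = next_head;
    m->udrie = true;                          /* (re)start streaming */
    return true;
}

/* Hypothetical recovery hook, e.g. triggered by a custom real-time
   command: re-arm the interrupt if unsent data is still queued. */
void model_tx_recover(tx_model_t *m) {
    if (m->head != m->tail) m->udrie = true;
}
```

To reproduce the stuck state in the model, force `head = TX_RING_BUFFER - 1`, `tail = 0` and `udrie = false`: `model_write()` then returns false no matter how large `max_spins` is, because nothing can move the tail. Calling `model_tx_recover()` first lets the same write succeed. Re-arming the interrupt is only one possible recovery; a real patch might instead discard the queued bytes, which would be consistent with the truncated first message mentioned above.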