teemuatlut / TMCStepper

MIT License
479 stars 188 forks source link

TMC2208 connection issues after v0.3.2 #24

Open teemuatlut opened 5 years ago

teemuatlut commented 5 years ago

Documenting here for others to see.

There seems to be some connection issues with TMC2208 with at least AVR and DUE platforms. The likely commit to cause the issues is d27257900a14ee8cd8df7d5c9b01f1f60c3c4d0b. @gloomyandy Any thoughts? I'll try reverting things bit by bit and see what the offending line is but I'll have to revert at least some of your PR and make a bugfix release.

Current solution is to use v0.3.1 until the issue is fixed.

gloomyandy commented 5 years ago

Can you provide details of the problem? I've not seen any reports of issues on the Marlin github. The thing is that reverting the changes will almost certainly cause problems for users of some LPC176x boards. I was certainly regularly seeing issues before I made that change. I'd like to understand what the actual problem is, because as far as I can see that code should work at least as well as the older code.

teemuatlut commented 5 years ago

Unfortunately not yet 😄 I'll have to scope the communication lines to see what's going on. Yesterday I was having issues with SW UART on AVR and I couldn't figure out what was wrong. Which was kinda embarrassing for me to have comms issues -_- But reverting to v0.3.1 fixed and the registers where reported with sane data.

It sometimes is that when fixing something you break something else. But I have to prioritize AVR over LPC. The end goal absolutely is having them both work properly.

gloomyandy commented 5 years ago

Sure I understand that, but there are now a lot of folks using the SKR boards with TMC2208s and I'd hate to see all of them start hitting problems. I suspect that the issue will be that there can be spurious characters picked up when the TMC2208 switches from input to output (and output to input). The old code in effect used the last 8 characters read as a response, the new code will use the first 8 characters. From what I remember I only saw the character very infrequently so it is tricky to test and even harder to capture on a scope.

A setup I found very useful was to dump the contents of the reply packet if there was a checksum error. It is then pretty easy to spot if a character has been inserted or dropped.

Does the problem only show up with software serial or does hardware UART have a problem as well?

gloomyandy commented 5 years ago

One thought on the AVR how big is the RX buffer? With the new code it will need to be able to hold at least 13 bytes (5 from the echo of the command and 8 in the response packet). I think with the old code you may have been able to get away with a smaller buffer.

teemuatlut commented 5 years ago

Can you tell me why the flush was moved to after replyDelay?

This seems to fix the comms issue on AVR: Or at least the response is not 0

--- a/src/source/TMC2208Stepper.cpp
+++ b/src/source/TMC2208Stepper.cpp
@@ -83,10 +83,9 @@ uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, uin

        while (serPtr.available() > 0) serPtr.read(); // Flush
        for(int i=0; i<=len; i++) serPtr.write(datagram[i]);
-       // allow time for a response
-       delay(replyDelay);
        if (full_duplex)
                for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written
+       delay(replyDelay);

        // read 8 byte response packet
        for(int byte = 0; byte < 8; byte++) {
gloomyandy commented 5 years ago

Because if the uart has a buffer associated with it the characters may not have actually been written by the time that the flush operates (they are still in the buffer), so obviously the "echo" will not be there to be flushed at that stage. What this means is that depending how fast the cpu is, and the baudrate etc. there is a good chance that the flush will not read all of the "echoed" characters from the command packet. With the old code you will probably be ok because the read loop will keep reading characters and will just keep the last 8 bytes read.

What is really needed is a call to the uart code to wait until the tx buffer is empty, but unfortunately we don't have one of those. Though that is another story it turns out that there is a lot of confusion around exactly what the flush method should do. On early versions of the Arduino libs it discarded input data, but now it does exactly what we want and waits for the output to drain. Unfortunately Marlin seems to be using the old definition for flush and instead has added a flushTX (which does what we need), but flushTX is not available via the "Stream" interface/class so can not easily be called. All a bit of a mess really!

ghost commented 5 years ago

I actually tried your updated TMCstepper driver at the time when you updated it @teemuatlut on my AVR board with the TMC2208's and found it didn't work, it mentioned something about a zero byte at the end, but not sure ? .. so I went back to the one that normally gets used.

I do use my own modified SoftwareSerial for AVR though as that allows for single pin UART on my AVR board (MKS Gen 1.4).

As is, your current driver and @gloomyandy modified SoftwareSerial drive (single pin UART) works good on my SKR 1.3 board. Although I'd much prefer to use the hardware UART and a multiplexer for the 2208 comms (much more reliable due to the int problem).

gloomyandy commented 5 years ago

I have a nasty feeling that if you just move the delay, then you may still have problems with anything that is using a hardware UART (as that is likely to have a TX buffer).

gloomyandy commented 5 years ago

The thing I'm really puzzled about is why it doesn't work with the delay before the flush. Any thoughts as to what is going on? As far as I can tell the Arduino SoftwareSerial should have a 64 byte rx buffer and so all of the data should be in the buffer waiting to be read after the delay. I don't think Marlin changes the value of _SS_MAX_RX_BUFF or does it?

@teemuatlut I assume that the TMC code is just using the standard Arduino SoftwareSerial library https://github.com/arduino/ArduinoCore-avr/tree/master/libraries/SoftwareSerial or does it use something else?

teemuatlut commented 5 years ago

That's the one. Whatever library the platform provides.

teemuatlut commented 5 years ago

Changing the order helps SW Serial but doesn't fix HW Serial. This is still on AVR.

gloomyandy commented 5 years ago

Is there any chance you can dump out the bytes that the flush is reading in both cases? I'm wondering if for some reason the AVR is not seeing one or more of the echoed bytes. If that is the case then delaying the read would mean that some of the actual reply data would end up being read and discarded.

The AVR SoftwareSerial write routine disables interrupts for the entire write of all of the bits. I wonder if that is screwing up reception of the echoed characters.

Yes I'm not surprised that hardware serial is still broken. I think you will be hitting the buffering issues I described above.

ghost commented 5 years ago

The thing I'm really puzzled about is why it doesn't work with the delay before the flush. Any thoughts as to what is going on? As far as I can tell the Arduino SoftwareSerial should have a 64 byte rx buffer and so all of the data should be in the buffer waiting to be read after the delay. I don't think Marlin changes the value of _SS_MAX_RX_BUFF or does it?

oooo I don't know, I've not delved into the Marlin side of the code yet, only the AVR SoftwareSerial to make it single-pin UART.

I'll have to have a nosey to try and see what might be happening, being as I have an AVR board etc. I've been learning the Marlin code as I go (as you've seen ;)), their is a lot of it to get ones head around.

gloomyandy commented 5 years ago

I think ideally the code should look like this....

    uint64_t out = 0x00000000UL;

    while (serPtr.available() > 0) serPtr.read(); // Flush
    for(int i=0; i<=len; i++) serPtr.write(datagram[i]);
    serPtr.flush(); // Make sure all TX data has been written
    if (full_duplex)
        for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written
    // allow time for a response
    delay(replyDelay);

    // read 8 byte response packet
    for(int byte = 0; byte < 8; byte++) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res&0xFF;
        }
    }
    return out;

But this will not work with Marlin because we have the definition of flush screwed up.

teemuatlut commented 5 years ago

I'm testing outside of Marlin.

#include <TMCStepper.h>

#define EN_PIN    38  // Enable
#define DIR_PIN   55  // Direction
#define STEP_PIN  54  // Step
#define SW_RX     63  // TMC2208/TMC2224 SoftwareSerial receive pin
#define SW_TX     40  // TMC2208/TMC2224 SoftwareSerial transmit pin

#define R_SENSE 0.11

// Select your stepper driver type
//TMC2208Stepper driver = TMC2208Stepper(&Serial1, R_SENSE); // Hardware Serial0
TMC2208Stepper driver = TMC2208Stepper(SW_RX, SW_TX, R_SENSE); // Software serial

void setup() {
  Serial.begin(9600);
  //Serial1.begin(115200);
  driver.beginSerial(115200);
  pinMode(EN_PIN, OUTPUT);
  pinMode(STEP_PIN, OUTPUT);
  pinMode(DIR_PIN, OUTPUT);
  digitalWrite(EN_PIN, HIGH);

  Serial.println(driver.DRV_STATUS(), HEX);
}

void loop() {}

The libraries have no access to Marlin code, for better or worse. That's why it uses Stream instead of HardwareSerial.

gloomyandy commented 5 years ago

Yep but unfortunately when part of Marlin the Hardware serial case will be using a MarlinSerial class which has the "wrong" flush in it. It does have flushTX but that can not be accessed from Stream. All so frustrating! We could easily change the SoftawareSerial flush (as that is part of the framework), but getting the flush in Marlin changed might be a little more tricky.

But anyway is there any chance we can at least confirm that adding the call to flush in your test case means that both AVR hardware and software works?

If possible I'd also like to confirm that the AVR SoftwareSerial case is dropping characters. That way at least we understand what is going on.

ghost commented 5 years ago

is the replyDelay the number of milliseconds before the TMC2008 begins to transmit it's reply ?

I'd do something like this myself if so ..

    while (serPtr.available() > 0) serPtr.read(); // Flush
    for(int i=0; i<=len; i++) serPtr.write(datagram[i]);
/*  // allow time for a response
    delay(replyDelay);
    if (full_duplex)
        for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written
*/
        uint32_t ms = getMS() + replyDelay;
        while (getMS() < ms)
        serPtr.read(); // Flush bytes written

    // read 8 byte response packet
    for(int byte = 0; byte < 8; byte++) {

I don't know how to get a current system ms tick value so I put getMS() in it's place.

teemuatlut commented 5 years ago

Yes. Here's what the datasheet says about it

In order to ensure a clean bus transition from the master to the slave, the TMC22xx does not immediately send the reply to a read access, but it uses a programmable delay time after which the first reply byte becomes sent following a read request.

gloomyandy commented 5 years ago

replyDelay is a period of time that gives the TMC2208 long enough to both switch to output mode and send the response.

The actual delay between end of the command and the response is way smaller than replyDelay.

ghost commented 5 years ago

Going to look at the TMC2008 datasheet.

gloomyandy commented 5 years ago

@doggyfan the code you posted would discard all of the response as well as the echoed command.

From the data sheet...

 This delay time can be set in multiples of
eight bit times using SENDDELAY time setting (default=8 bit times) according to the needs of the
master.

So by default the delay is 8 bit times which at 115000 baud is 69uS (I think). replyDelay is set to 5mS.

ghost commented 5 years ago

oh sorry, okey doke.

So the replyDelay value is just a maximum time to hang around for for a reply from the TMC2208 ?

gloomyandy commented 5 years ago

That period is pretty small to time accurately in normal code (due to possible interrupts etc.) and to make matters worse we don't know when to actually start the timing because we have no way of knowing when then last byte of the command was actually sent (because the sent data may be in a TX buffer and we can't call flush).

ghost commented 5 years ago

OK, and with you.

ghost commented 5 years ago

As a start, if it were me, those 4 don't care bits in the TMC SYNC byte I'd set to all 1's for sending frames to the TMC chip, because the TMC chip zero's those 4 bits which you can use to detect and differenciate a Tx SYNC byte from an Rx SYNC byte.

And on receive, I'd scan and ignore any and all Rx bytes until that first SYNC byte with those don't care bits zero'd .. you then know you're synch'ed properly.

So my code would look something like this ..

/*  // allow time for a response
    delay(replyDelay);
    if (full_duplex)
        for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written
*/
    uint32_t ms = getMS() + replyDelay;
    for (int byte = 0, sync = 0; byte <= len && getMS() < ms; )
    {
        int16_t res = serPtr.read();
        if (res < 0)
            continue;
        if (!sync)  // wait for sync
            if ((res & 0xff) == TMC2208_SYNC)
                sync = 1;   // found the sync byte
        if (sync)
        {   // sync found .. save bytes
            out <<= 8;
            out |= res & 0xFF;
            byte++;
        }
    }
/*
    // read 8 byte response packet
    for(int byte = 0; byte < 8; byte++) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res&0xFF;
        }
    }
*/

But anyway ..

teemuatlut commented 5 years ago

Finally got the logic analyzer set up with these. Here's setting toff = 2. Meaning one write to CHOPCONF (0x6C) image

gloomyandy commented 5 years ago

Yes that's my understanding of replyDelay. After that period the data should be sat there ready to read.

In the full duplex case there should be five bytes of echoed command data and then 8 bytes of response. In the half duplex case we will not have the 5 echoed bytes.

But it is looking to me as if in the case of the AVR SoftwareSerial we can not be certain that all of the echoed bytes will have been read (I would really like to have this confirmed), so we can't just read all of the data and discard the first 5 bytes. I think that is why the old code used to in effect read all of the bytes available and just use the last 8 bytes as the reply. In effect it synced on the end of the data.

Now we could go back to doing something like that again, but we have another issue. When using half duplex we sometimes get an extra character at the end of the 8 bytes of response data. This is the caused by the TMC2208 switching back to input mode. The good news is that we only get that glitch when in half duplex (because in full duplex we have the extra pull up from the TX line via the TX resistor).

So I think that is the problem we are dealing with.

ghost commented 5 years ago

oops, took too long in editing my last message (added some example code).

gloomyandy commented 5 years ago

At the moment I think there are a few possible solutions....

  1. Sort out the flush mess and use the code I posted above. I think that will handle all cases.
  2. Change the code to basically do what it used to do in the full duplex case, but have it basically delay and then read the fixed length response in the half duplex case.
  3. Change the code to discard bytes looking for the sync byte and then read it and the next 7 bytes. As mentioned above we may need to modify the transmitted data to avoid mistaking an echoed byte for the sync byte. We can actually look for the byte after the sync byte to check that it is 0xff as well which should make it pretty robust.
ghost commented 5 years ago

I'd personally go with your 3. @gloomyandy

So a Tx SYNC byte would be 'TMC2208_SYNC | 0xf0' and an Rx SYNC byte would be 'TMC2208_SYNC'

ghost commented 5 years ago

add "| 0xf0" to the TX sync byte

void TMC2208Stepper::write(uint8_t addr, uint32_t regVal) { uint8_t len = 7; addr |= TMC_WRITE; uint8_t datagram[] = {TMC2208_SYNC | 0xf0, TMC2208_SLAVE_ADDR, addr, (uint8_t)(regVal>>24), (uint8_t)(regVal>>16), (uint8_t)(regVal>>8), (uint8_t)(regVal>>0), 0x00};

and

uint32_t TMC2208Stepper::read(uint8_t addr) { uint8_t len = 3; addr |= TMC_READ; uint8_t datagram[] = {TMC2208_SYNC | 0xf0, TMC2208_SLAVE_ADDR, addr, 0x00}; datagram[len] = calcCRC(datagram, len); uint64_t out = 0x00000000UL;

ghost commented 5 years ago

It will at least give you a good Rx SYNC byte to scan for and lock too with the ability to ignore your Tx frames, and for sure as you say look for that 0xff byte in the Rx frame too.

ghost commented 5 years ago

plus you don't have to hang around for the 5ms .. once you find your Rx frame, exit the function with the Rx'ed bytes.

gloomyandy commented 5 years ago

Yes though changing that timing may have other consequences! OK I'm going to give this code a try and see how it looks. I'm sure there will be some corner cases to sort out!

ghost commented 5 years ago

just shift each Rx'ed byte into the 64-bit 'out' variable looking for '(out & 0xffff) == (TMC2208_SYNC | 0xff) .. that will always and easily lock you onto any Rx frame and ignore all Tx frames.

ghost commented 5 years ago

This would be my version ..

template<typename SERIAL_TYPE>
uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, uint16_t replyDelay, bool full_duplex) {
    uint64_t out = 0x00000000UL;
/*
    while (serPtr.available() > 0) serPtr.read(); // Flush
    for(int i=0; i<=len; i++) serPtr.write(datagram[i]);

    // allow time for a response
    delay(replyDelay);
    if (full_duplex)
        for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written

    // read 8 byte response packet
    for(int byte = 0; byte < 8; byte++) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res&0xFF;
        }
    }

    return out;
*/

    constexpr uint8_t  TMC2208_SYNC = 0x05;

    // flush the Rx buffer
    while (serPtr.available() > 0)
        serPtr.read();

    // send the Tx frame
    for (int i = 0; i <= len; i++)
        serPtr.write(datagram[i]);

    // scan for the rx frame
    uint32_t ms = millis();
    int byte = -1;
    while (byte < 8)
    {
        uint32_t ms2 = millis();
        if (ms2 != ms)
        {   // 1ms tick
            ms = ms2;
            if (--replyDelay <= 0)
                break;  // out of time
        }

        int16_t res = serPtr.read();
        if (res < 0)
            continue;
        out = (out << 8) | (res & 0xff);
        if (byte < 0)
        {   // waiting for the Rx sync pattern
            if ((out & 0xffff) == (((uint16_t)TMC2208_SYNC << 8) | 0xff))
                byte = 2;   // found the Rx sync pattern
        }
        else
            byte++;
    }

    return (byte >= 8) ? out : 0;
}
teemuatlut commented 5 years ago

(AVR HW UART) Reading DRV_STATUS with v0.3.1. Result = 0xC0100000 image

Same sketch code with master. Result = 0x0 image

So at least now I can be sure the problem with reading the data out of Serial buffer.

Btw, your calculated delay was quite accurate. The scoped is approximately 63us.

teemuatlut commented 5 years ago

This works with both HW and SW serial:

@@ -85,11 +85,8 @@ uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, uin
        for(int i=0; i<=len; i++) serPtr.write(datagram[i]);
        // allow time for a response
        delay(replyDelay);
-       if (full_duplex)
-               for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written

-       // read 8 byte response packet
-       for(int byte = 0; byte < 8; byte++) {
+       while(serPtr.available() > 0) {
                int16_t res = serPtr.read();
                if (res >= 0) {
                        out <<= 8;
ghost commented 5 years ago

is off to try it :)

gloomyandy commented 5 years ago

Ok so I have a version of @doggyfan code working but there is a bit of a problem.

ghost commented 5 years ago

I'm not suprised @gloomyandy, I've not yet tried it.

Do tell :)

ghost commented 5 years ago

I've not used my AVR board for a sometime, it's been hidden away all on it's own, having trouble at the moment with it, just trying to find out what the problem is.

ghost commented 5 years ago

(AVR HW UART) Reading DRV_STATUS with v0.3.1. Result = 0xC0100000 image

Same sketch code with master. Result = 0x0 image

So at least now I can be sure the problem with reading the data out of Serial buffer.

Btw, your calculated delay was quite accurate. The scoped is approximately 63us.

That looks like a very good serial communication @teemuatlut !

gloomyandy commented 5 years ago

OK so here is my version of the code (I had to make a few changes to get things to build etc.). https://github.com/gloomyandy/TMCStepper/tree/TMC2208_PR It basically seems to work on my SKR V1.3 board. However the problem is that on boot I always get a "TMC Connection error". But if I run M122 later on everything is fine. The problem is that during boot there is all sorts of things happening and the timed loop is basically not running for long enough to read all of the data (basically it is getting interrupted while running), so it will often only read say 6 or 7 bytes.

ghost commented 5 years ago

OK, so it's working, but not at boot-up ... hmmmm

I don't yet know Marlin well enough to know what that could be @gloomyandy

Any ideas ?

I'm just having a play with you're version on my MKS Gen 1.4 (AVR).

gloomyandy commented 5 years ago

In the old code we called delay and if that got interrupted it gave us longer to read the data (sort of anyway), but now if we get an interrupt that takes a while after we sample the clock then we basically stop too soon.

@doggyfan I noticed you changed the timeout mechanism after I grabbed your code, I suppose that version may be more resilient to this sort of problem? What do you think? I'll give it a try!

ghost commented 5 years ago

The code I wrote isn't yet working on my AVR board, your shorter change was though. Just investigating.

ghost commented 5 years ago

hmmm, as you say, a boot-up problem at the moment. But got my code to work too and the AVR :) .. and yours :)

ghost commented 5 years ago

So, no matter what, the function must not exit until the 5ms is up, even if it gets the reply much quicker ?

gloomyandy commented 5 years ago

Ah OK the problem is that basically the 5mS in replyDelay is just not long enough for the LPC1768 software serial. We force the baud rate to be 19.2K to reduce the interrupt load and to give us the best chance to detect the bits. But at 19.2K 12 bytes with 10bits per byte needs 6mS to read and we only have 5! I guess we have been getting away with that without problems with the old code because of extra delays etc. If I increase the delay I think it will be fine.

I'm surprised there is an error on the AVR I thought it used a much faster baud rate even with software serial. I'm using your modified timing routine, if anything it makes things worse (though it may be more accurate!).

I'm just going to try it with a 10mS period.