teemuatlut / TMCStepper

MIT License

TMC2208 connection issues after v0.3.2 #24

Open teemuatlut opened 5 years ago

teemuatlut commented 5 years ago

Documenting here for others to see.

There seem to be some connection issues with the TMC2208 on at least the AVR and DUE platforms. The commit most likely to have caused them is d27257900a14ee8cd8df7d5c9b01f1f60c3c4d0b. @gloomyandy Any thoughts? I'll try reverting things bit by bit to find the offending line, but I'll have to revert at least some of your PR and make a bugfix release.

The current workaround is to use v0.3.1 until the issue is fixed.

ghost commented 5 years ago

I seem to have it working now on the AVR.

I fixed an error in my code. I changed ..

    if ((out & 0xffff) == ((uint16_t)(TMC2208_SYNC << 8) | 0xff))

to

    if ((out & 0xffff) == (((uint16_t)TMC2208_SYNC << 8) | 0xff))

ghost commented 5 years ago

This is now what I have ..

    constexpr uint8_t  TMC2208_SYNC = 0x05;

    // flush the Rx buffer
    while (serPtr.available() > 0)
        serPtr.read();

    // send the Tx frame
    for (int i = 0; i <= len; i++)
        serPtr.write(datagram[i]);

    // scan for the rx frame
    uint32_t ms = millis();
    int byte = -1;
    while (byte < 8 && replyDelay > 0)
    {
        uint32_t ms2 = millis();
        if (ms2 != ms)
        {   // 1ms tick
            ms = ms2;
            replyDelay--;
        }

        int16_t res = serPtr.read();
        if (res < 0)
            continue;
        out = (out << 8) | (res & 0xff);
        if (byte < 0)
        {   // waiting for the Rx sync pattern
            if ((out & 0xffff) == (((uint16_t)TMC2208_SYNC << 8) | 0xff))
                byte = 2;   // found the Rx sync pattern
        }
        else
            byte++;
    }

    // hang around until we're expected to leave
    while (replyDelay > 0)
    {
        uint32_t ms2 = millis();
        if (ms2 != ms)
        {   // 1ms tick
            ms = ms2;
            replyDelay--;
        }
    }

    return (byte >= 8) ? out : 0;

ghost commented 5 years ago

mmm you're right, it's a problem at boot-up but if I do an M122 afterwards it reports back as OK.

hmmm

gloomyandy commented 5 years ago

I've just updated my code; with 10mS it seems to work OK.

No, exiting early is fine if you have the data. The problem is this: imagine that, every time round the loop, there have been enough interrupts and delays that at least 1mS has passed each time your code tests the time. With a timeout of 5 you only get to read 5 bytes!

BTW I'm a little confused that your code is changing replyDelay. I thought that was a constant class variable?
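The timing hazard described above can be reproduced off-target. The sketch below is a hypothetical simulation, not TMCStepper code: the function name and the fake millisecond counter are made up, and it assumes the worst case from the comment, where every pass round the loop costs 1 ms and finds one byte waiting.

```cpp
#include <cstdint>

// Hypothetical simulation, not TMCStepper code: every pass round the loop is
// assumed to take 1 ms (interrupts, slow software serial) and to find exactly
// one received byte waiting.
int bytesReadBeforeTimeout(int timeoutMs, int frameBytes) {
    uint32_t fakeMillis = 0;          // stands in for millis()
    uint32_t lastTick = fakeMillis;
    int msLeft = timeoutMs;
    int bytesRead = 0;

    while (bytesRead < frameBytes && msLeft > 0) {
        fakeMillis++;                 // the pass itself costs 1 ms
        if (fakeMillis != lastTick) { // a 1 ms tick was seen
            lastTick = fakeMillis;
            msLeft--;
        }
        bytesRead++;                  // one byte consumed per pass
    }
    return bytesRead;
}
```

Under these assumptions `bytesReadBeforeTimeout(5, 8)` returns 5, so the 8-byte reply is never completed, while `bytesReadBeforeTimeout(10, 8)` returns 8.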

ghost commented 5 years ago

No, it's not a constant on entering the function @gloomyandy, which lets me count it down.

It's just a local variable in the function.

gloomyandy commented 5 years ago

My original code is probably even worse. If between setting the timeout and testing it at the start of the loop there is an interrupt that takes 5mS then no bytes get read at all!

ghost commented 5 years ago

10ms then it is :) just exit once you've grabbed the Rx frame.

Just trying it on my AVR and SKR board.

gloomyandy commented 5 years ago

Hmm that's confusing because there is a class variable with the same name that is a constant...

    static constexpr uint8_t replyDelay = 10;

In TMCStepper.h. Oh well.

ghost commented 5 years ago

Yes, but you pass that to the function we've been editing, and that makes it a local variable just in the function.

You need to specify it as 'const' in the function, like this ..

    uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, const uint16_t replyDelay, bool full_duplex)
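The naming clash discussed here can be shown in isolation. In this minimal sketch the `Driver` struct and both functions are hypothetical stand-ins, not the library's code; it only illustrates that a by-value parameter is a private copy, and that a `const` parameter forces the countdown into a separately named local.

```cpp
#include <cstdint>

struct Driver {
    // Class-level constant, as in the TMCStepper.h line quoted in the thread.
    static constexpr uint8_t replyDelay = 10;
};

// Hypothetical stand-in for _sendDatagram: the parameter is a by-value copy,
// so counting it down never touches Driver::replyDelay.
uint16_t countDown(uint16_t replyDelay) {
    while (replyDelay > 0) replyDelay--;   // legal: modifies the local copy only
    return replyDelay;
}

// With a const parameter the same countdown needs its own local variable,
// which keeps the two meanings of "replyDelay" apart.
uint16_t countDownConst(const uint16_t replyDelay) {
    uint16_t msLeft = replyDelay;
    while (msLeft > 0) msLeft--;
    return msLeft;
}
```

Both functions return 0 when called with `Driver::replyDelay`, and the class constant still reads 10 afterwards.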

gloomyandy commented 5 years ago

Yes I know, I just don't think it is very good practice...

@teemuatlut What do you think to this modification? I'd like to test things some more and clean up the code a little.

ghost commented 5 years ago

Yes you are right, it's not good practice. I didn't notice you have it as a const in the .h file.

I can't get my code to work on my AVR at the moment. If yours is good to go then go for it :)

gloomyandy commented 5 years ago

Hmm that's not good, any idea what is causing the problem? You can add debug output if you need to, it is not pretty but it works. My version has some, sorry about the absolute path to the header file!

gloomyandy commented 5 years ago

Does my version work on the AVR? I can only easily test on 32bit boards.

teemuatlut commented 5 years ago

I think you're getting a bit too excited :P The code I posted earlier seemed to restore functionality to the same level as before, and at least appeared to work on AVR. I did read your point about the LPC SW Serial needing a longer delay, so perhaps that should be addressed as well. Something I may try is removing the delay and comparing the bytes read against the sync byte, basically trying to find the start of the response. This should work with different platforms and different delays. Maybe something like this, but with a timer compare to abort out of the while loops.

template<typename SERIAL_TYPE>
uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, uint16_t replyDelay, bool full_duplex) {
    uint64_t out = 0x00000000UL;

    while (serPtr.available() > 0) serPtr.read(); // Flush
    for(int i=0; i<=len; i++) serPtr.write(datagram[i]);

    int16_t res = 0;

    // Find SYNC nibble
    do { res = serPtr.read(); } while (res != TMC2208_SYNC);
    out = res;
    out <<= 8;
    // Find master address
    do { res = serPtr.read(); } while (res != 0xFF);
    out |= res;
    out <<= 8;

    while(serPtr.available() > 0) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res&0xFF;
        }
    }
    return out;
}

ghost commented 5 years ago

Right then, I just compiled and uploaded your code (as per here) to my AVR board. It fails at boot-up, but if I do an M122 it's OK. Though, like my code, it shows the correct values in the LCD TMC config menu.

    uint64_t out = 0x00000000UL;

    while (serPtr.available() > 0)
        serPtr.read(); // Flush

    for (int i = 0; i <= len; i++)
        serPtr.write(datagram[i]);

    // allow time for a response
    delay(replyDelay);

    while (serPtr.available() > 0)
    {
        int16_t res = serPtr.read();
        if (res >= 0)
        {
            out <<= 8;
            out |= res&0xFF;
        }
    }

    return out;

teemuatlut commented 5 years ago

FYI; there are currently issues with the upstream TMC LCD section and the fix hasn't been merged in.

gloomyandy commented 5 years ago

@teemuatlut the problem with that approach (I started off looking at something very similar) is what happens when you get something like the sync byte followed by some other value, then another sync byte and 0xff. I don't think your code will cope with that correctly. It also starts to get very messy when you add in the timeout stuff.

Anyway, I've updated my version of the code (which is really pretty much @doggyfan's code) and pushed it to GitHub. For me, on an SKR V1.3, it is working fine with the longer timeout.

I think it is a pretty good solution, but will need testing on AVR with both hardware and software serial.

teemuatlut commented 5 years ago

Actually you could tweak it a bit and have a 24b pattern that needs to match SYNC - 0xFF - REG_ADDRESS. After that I think you can be pretty sure you're looking at the right response from your target.
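The 24-bit match suggested here could look roughly like the sketch below. `findReplyStart` is a hypothetical helper working on an already-captured byte buffer, not part of TMCStepper; a real version would read from the serial port and carry a timeout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of TMCStepper: slide a 24-bit window over the
// received bytes and report the index of the first byte of a reply that
// starts with SYNC, 0xFF, regAddr. Returns -1 if no such reply is found.
int findReplyStart(const std::vector<uint8_t>& rx, uint8_t regAddr) {
    constexpr uint8_t kSync = 0x05;   // value of TMC2208_SYNC in the thread
    const uint32_t pattern = (uint32_t(kSync) << 16) | (0xFFu << 8) | regAddr;

    uint32_t window = 0;
    for (std::size_t i = 0; i < rx.size(); i++) {
        window = ((window << 8) | rx[i]) & 0xFFFFFFu;   // keep the last 3 bytes
        if (i >= 2 && window == pattern)
            return static_cast<int>(i) - 2;
    }
    return -1;
}
```

Because the window slides one byte at a time, a stray sync byte followed by some other value (for example `05 12 05 FF 6F ...`) does not derail the search: the match is found at index 2, where the real reply starts.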

gloomyandy commented 5 years ago

@teemuatlut the code you posted earlier will almost certainly have occasional errors on an LPC176x board using half duplex (so the SKR V1.1 and others). It will probably pick up an extra character every now and again due to a spurious byte generated when the TMC2208 switches from output to input. I explained it a bit more above.

If you want to go that route can I suggest the following...

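    for (int i = 0; i <= len; i++) serPtr.write(datagram[i]);
    // allow time for a response
    delay(replyDelay);
    if (full_duplex) {
        // full duplex: read whatever has arrived
        while (serPtr.available() > 0) {
            int16_t res = serPtr.read();
            if (res >= 0) {
                out <<= 8;
                out |= res & 0xFF;
            }
        }
    } else {
        // half duplex: read exactly 8 bytes so a trailing spurious byte is never consumed
        for (int i = 0; i < 8; i++) {
            int16_t res = serPtr.read();
            if (res >= 0) {
                out <<= 8;
                out |= res & 0xFF;
            }
        }
    }

ghost commented 5 years ago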

That's what I was doing @teemuatlut .. I was looking for a 16-bit sync pattern - (SYNC << 8) | 0xff, and then save that and the following 6 bytes with a timeout.

teemuatlut commented 5 years ago

Does the "spurious byte" typically occur before the driver starts sending valid data (sync nibble)? Then I think this byte would just get discarded when the valid data pushes it out of the 24b space.

gloomyandy commented 5 years ago

@doggyfan do you have code that works at boot time on the AVR? I'm not sure if the version of my code you tried had the increased timeout in it. With the updated timeout (10mS) and code I no longer see problems with the SKR V1.3.

ghost commented 5 years ago

I changed the 5ms to 10ms in the .h file @gloomyandy. So should be at 10ms timeout now.

Just playing with the code to see what works etc on the AVR.

gloomyandy commented 5 years ago

@teemuatlut no it happens after the end of the response data when the TMC2208 switches from output to input. Normally it will get discarded by the flush performed before a write. But with code like this...

    while (serPtr.available() > 0) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res & 0xFF;
        }
    }

every now and again (it is timing dependent, because there is a small gap between the end of the valid data and the switch to input mode) the above code will read an extra byte and shuffle the good data down by 8 bits. In the code I posted above I only read 8 bytes when in half duplex mode, so there is no chance of reading the spurious one.

The problem is only seen when using half duplex. When using full duplex the output line and resistor act to pull up the signal, so you do not get a glitch.

This should not be a problem when using the approach we have been looking at here to detect the sync byte, only with the solution you posted earlier.
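The effect of the spurious trailing byte can be seen with plain arithmetic. In this sketch `packBytes` is a hypothetical stand-in for the read loop, and the value of the glitch byte (0x00 here) is an assumption; the thread does not say what value is actually received.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the read loop: pack up to maxBytes received bytes
// into a 64-bit value, most significant byte first.
uint64_t packBytes(const std::vector<uint8_t>& rx, std::size_t maxBytes) {
    uint64_t out = 0;
    for (std::size_t i = 0; i < rx.size() && i < maxBytes; i++)
        out = (out << 8) | rx[i];   // a ninth byte pushes the first one out
    return out;
}
```

With a valid reply `05 FF 6F C0 08 00 00 5E` followed by one glitch byte, reading at most 8 bytes gives `0x05FF6FC00800005E`, while reading everything available gives `0xFF6FC00800005E00`: the sync byte has been pushed out and every field sits one byte away from where the caller expects it.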

ghost commented 5 years ago

I'm using software serial on the AVR full duplex at the moment.

ghost commented 5 years ago

I'm going to have to come back to this tomorrow as it's getting late here in the UK, time for bed.

Nighty night. Back tomorrow.

teemuatlut commented 5 years ago

Ah I see. Thanks for explaining. So what you could then do is find the sync+0xFF+addr pattern, then read the 32b datagram + 8b CRC and that would be treated as the valid data. Then finally flush the serial with .available() in case there is any garbage.

Anyway, 1AM. I'm out ->
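The frame layout mentioned above (sync, 0xFF, register address, 32 data bits, 8-bit CRC) maps onto the 64-bit return value as sketched below. The `Reply` struct and `unpackReply` are names made up for this illustration and are not the library's API.

```cpp
#include <cstdint>

// Fields of the 8-byte reply as discussed in the thread. The struct and
// function names are made up for this sketch.
struct Reply {
    uint8_t  sync;     // expected to be 0x05
    uint8_t  master;   // expected to be 0xFF
    uint8_t  reg;      // register address echoed by the driver
    uint32_t data;     // 32-bit register value
    uint8_t  crc;      // CRC over the first 7 bytes
};

Reply unpackReply(uint64_t frame) {
    Reply r;
    r.sync   = uint8_t(frame >> 56);
    r.master = uint8_t(frame >> 48);
    r.reg    = uint8_t(frame >> 40);
    r.data   = uint32_t(frame >> 8);   // bits 39..8
    r.crc    = uint8_t(frame);
    return r;
}
```

For the reply bytes `5 255 6 32 0 3 65 26` that show up in the debug output later in the thread, this gives register 0x06, data 0x20000341 and CRC 0x1A.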

gloomyandy commented 5 years ago

Also in the UK and getting late. Final update is that I've just been testing software serial in half duplex mode on the SKR V1.3 with the code I pushed earlier and it seems to be working fine. I will try and get my SKR V1.1 board into a working state (I need to modify some drivers for it), so that I can test any code on that. That configuration is probably a better test of half duplex mode.

Yes, that set of operations is pretty close to what the code I've been testing does, but with only a 16 bit sync packet and without the flush at the end. It's probably a good idea to do that flush, just in case, though the flush before we do a write should take care of anything left over.

Night all.

ghost commented 5 years ago

Well, this scans for the 16-bit Rx sync pattern (SYNC + 0xff) and saves that along with the following 6 bytes with the timeout in the loop.

It's 100% on my SKR 1.3 board, but I'm having trouble on my MKS Gen 1.4 board (AVR). I don't yet know why, but no TMC UART routine (mine or @gloomyandy's) is working properly on my AVR board at the moment. It was fine a couple of weeks ago, but Marlin has changed a lot since.

edit: oops, posted old version, updated now

template<typename SERIAL_TYPE>
uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, uint16_t replyDelay, bool full_duplex)
{
/*
    uint64_t out = 0ul;

    while (serPtr.available() > 0) serPtr.read(); // Flush
    for(int i=0; i<=len; i++) serPtr.write(datagram[i]);
    // allow time for a response
    delay(replyDelay);
    if (full_duplex)
        for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written

    // read 8 byte response packet
    for(int byte = 0; byte < 8; byte++) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res&0xFF;
        }
    }
    return out;
*/

    // two byte Rx sync pattern
    constexpr uint16_t rx_sync = 0x05ff;

    // 8-byte Rx'ed and returned frame
    uint64_t out = 0;

    // flush the Rx buffer
    while (serPtr.available() > 0)
        serPtr.read();

    // send the Tx frame
    for (int i = 0; i <= len; i++)
        serPtr.write(datagram[i]);

    // scan for the Rx frame
    uint32_t tick_ms = millis();
    int16_t ms_left = replyDelay;   // timeout time in ms
    int rx_byte_count = -1;
    while (rx_byte_count < 8 && ms_left >= 0)
    {
        if (tick_ms != millis())    // unaffected by a millis() roll-over
        {   // 1ms tick
            tick_ms++;
            ms_left--;  // timer count down
        }

        int16_t res = serPtr.read();
        if (res < 0)
            continue;
        out = (out << 8) | (res & 0xff);
        if (rx_byte_count < 0)
        {   // waiting for the Rx sync pattern
            if ((uint16_t)out == rx_sync)
                rx_byte_count = 2;  // found the Rx sync pattern
        }
        else
            rx_byte_count++;
    }

//  SERIAL_ECHOLNPAIR("TMC_RX: ", out);

    return (rx_byte_count >= 8) ? out : 0;
}
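The loop above keeps time by counting 1 ms ticks. A different way to get a roll-over-safe timeout, shown here only as a sketch and not as what the thread settled on, is to compare elapsed time using unsigned subtraction. `timedOut` is a hypothetical helper; on the target the `nowMs` argument would come from `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper: true once timeoutMs milliseconds have passed since
// startMs. Unsigned subtraction wraps modulo 2^32, so the result stays correct
// when the millisecond counter rolls over between startMs and nowMs.
bool timedOut(uint32_t startMs, uint32_t nowMs, uint32_t timeoutMs) {
    return uint32_t(nowMs - startMs) >= timeoutMs;
}
```

For example, with a start of 0xFFFFFFFB and a 10 ms timeout, a counter value of 4 (9 ms later, after the wrap) is not yet a timeout, and a value of 5 is.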

ghost commented 5 years ago

I'll take it over to the scope shortly and see exactly what's happening on the serial line itself and post the capture here.

gloomyandy commented 5 years ago

@doggyfan it would be very useful if you could add debug code to your AVR version and dump out exactly what is being read as well as having any hardware trace. We really need to see that to understand what is going on.

This version https://github.com/gloomyandy/TMCStepper/blob/b22c6bd2a07001ae9d8b6c70805935db6bc3b240/src/source/TMC2208Stepper.cpp of my code includes an example of adding debug prints to the code. I'm not sure how easy it is on the AVR to arrange to have a terminal connected at boot time though.

What problem are you seeing on the AVR? Is it just at boot time you get a fail or do all M122s fail?

gloomyandy commented 5 years ago

I'm not around for the next few hours, but will hopefully be able to help later on today. Good luck!

ghost commented 5 years ago

The AVR doesn't have SERIAL_PRINTF(), it's been a headache learning what the AVR has and hasn't .. grrrr

I currently have this ..

// debug
#include "/Projects/3D Printer/Marlin 2/mine/Marlin/src/core/serial.h"

template<typename SERIAL_TYPE>
uint64_t _sendDatagram(SERIAL_TYPE &serPtr, uint8_t datagram[], uint8_t len, uint16_t replyDelay, bool full_duplex)
{
/*
    uint64_t out = 0x00000000UL;

    while (serPtr.available() > 0) serPtr.read(); // Flush
    for(int i=0; i<=len; i++) serPtr.write(datagram[i]);
    // allow time for a response
    delay(replyDelay);
    if (full_duplex)
        for(int byte=0; byte<=len; byte++) serPtr.read(); // Flush bytes written

    // read 8 byte response packet
    for(int byte = 0; byte < 8; byte++) {
        int16_t res = serPtr.read();
        if (res >= 0) {
            out <<= 8;
            out |= res&0xFF;
        }
    }
    return out;
*/

    // two byte Rx sync pattern
    constexpr uint16_t rx_sync = 0x05ff;

    // 8-byte Rx'ed and returned frame
    uint64_t out = 0;

    // flush the Rx buffer
    while (serPtr.available() > 0)
        serPtr.read();

    // send the Tx frame
    for (int i = 0; i <= len; i++)
        serPtr.write(datagram[i]);

    // scan for the Rx frame
    uint32_t tick_ms = millis();
    int16_t ms_left = replyDelay;   // timeout time in ms
    int rx_byte_count = -1;
    while (rx_byte_count < 8 && ms_left >= 0)
    {
        if (tick_ms != millis())    // unaffected by a millis() roll-over
        {   // 1ms tick
            tick_ms++;
            ms_left--;  // timer count down
        }

        int16_t res = serPtr.read();
        if (res < 0)
            continue;
        out = (out << 8) | (res & 0xff);
        if (rx_byte_count < 0)
        {   // waiting for the Rx sync pattern
            if ((out & 0xffff) == rx_sync)
                rx_byte_count = 2;  // found the Rx sync pattern
        }
        else
            rx_byte_count++;
    }

    // debug
    SERIAL_EOL();
    SERIAL_ECHOPGM("TMC_RX ");
    SERIAL_ECHOPAIR(" ms_left:", ms_left);
    SERIAL_ECHOPAIR("  ", rx_byte_count);
    SERIAL_ECHOPAIR(", ", (uint8_t)(out >> 56));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >> 48));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >> 40));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >> 32));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >> 24));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >> 16));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >>  8));
    SERIAL_ECHOPAIR(" ", (uint8_t)(out >>  0));
    SERIAL_EOL();

    return (rx_byte_count >= 8) ? out : 0;
}

ghost commented 5 years ago

It's so very close to working; I just get the odd error in the odd Rx frame. The routine itself is faultless now: it's totally unaffected by any leading and/or trailing spurious bytes, and if a valid frame is received within the 10ms time limit it will see it. By the looks of it, it's the software serial routine on the AVR that's at fault, supplying bad data.

I added a bit more debugging where the CRC is checked to show CRC-ERROR or CRC-OK.

On the SKR 1.3 board everything is 100% fine.

ghost commented 5 years ago

Interestingly, your CRC checking routine clears the frame as being valid if it's all zeros.
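The all-zeros observation follows from how this kind of CRC works. The sketch below uses a CRC-8 with polynomial 0x07, initial value 0 and bits fed least significant first, which is my understanding of the TMC2208 UART CRC; treat that as an assumption and check it against the datasheet. `frameLooksValid` is a hypothetical stricter check, not something the library provides.

```cpp
#include <cstddef>
#include <cstdint>

// CRC-8, polynomial 0x07, initial value 0, each byte fed least significant
// bit first. Assumed to match the TMC2208 UART CRC; verify before relying on it.
uint8_t crc8(const uint8_t* bytes, std::size_t len) {
    uint8_t crc = 0;
    for (std::size_t i = 0; i < len; i++) {
        uint8_t current = bytes[i];
        for (int bit = 0; bit < 8; bit++) {
            if (((crc >> 7) ^ (current & 0x01)) != 0)
                crc = uint8_t((crc << 1) ^ 0x07);
            else
                crc = uint8_t(crc << 1);
            current >>= 1;
        }
    }
    return crc;
}

// Hypothetical stricter check: besides the CRC, require the sync byte and
// master address, so an all-zero "frame" from a timed-out read is rejected.
bool frameLooksValid(const uint8_t frame[8]) {
    return frame[0] == 0x05 && frame[1] == 0xFF && crc8(frame, 7) == frame[7];
}
```

With an initial value of 0, seven zero bytes produce a CRC of 0, which equals the eighth zero byte, so a CRC-only check passes an empty result. Checking the two header bytes as well rejects it, while a real reply such as `5 255 0 0 0 0 192 141` from the boot log later in the thread still passes.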

ghost commented 5 years ago

This is from my SKR 1.3 board (perfect) ..

Sometimes the Rx frame takes 5ms to finish, sometimes 6ms.

M122 output ``` M122 X Y Z Z2 Enabled TMC_RX ms_left:5 8, 5 255 6 32 0 3 65 26 CRC-OK false TMC_RX ms_left:5 8, 5 255 6 32 0 3 65 26 CRC-OK false TMC_RX ms_left:4 8, 5 255 6 32 0 3 65 26 CRC-OK false TMC_RX ms_left:4 8, 5 255 6 32 0 3 65 26 CRC-OK false Set current 1000 1000 1000 1000 RMS current TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 994 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 994 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 994 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 994 MAX current TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 1402 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 1402 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 1402 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 1402 Run current 17/31 17/31 17/31 17/31 Hold current 8/31 8/31 8/31 8/31 CS actual TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 8/31 TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 8/31 TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 8/31 TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 8/31 PWM scale TMC_RX ms_left:5 8, 5 255 113 0 0 0 10 246 CRC-OK 10 TMC_RX ms_left:5 8, 5 255 113 0 0 0 10 246 CRC-OK 10 TMC_RX ms_left:5 8, 5 255 113 0 0 0 10 246 CRC-OK 10 TMC_RX ms_left:5 8, 5 255 113 0 0 0 10 246 CRC-OK 10 vsense TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 0=.325 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 0=.325 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 0=.325 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 0=.325 stealthChop TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK true TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK true TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK true TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK true msteps TMC_RX ms_left:4 8, 5 255 108 20 0 130 132 78 CRC-OK 16 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 16 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 16 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 
16 tstep TMC_RX ms_left:5 8, 5 255 18 0 15 255 255 29 CRC-OK max TMC_RX ms_left:5 8, 5 255 18 0 15 255 255 29 CRC-OK max TMC_RX ms_left:5 8, 5 255 18 0 15 255 255 29 CRC-OK max TMC_RX ms_left:5 8, 5 255 18 0 15 255 255 29 CRC-OK max pwm threshold 29 29 38 38 [mm/s] TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 136.31 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 136.31 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 13.00 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 13.00 OT prewarn TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK false TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK false TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK false TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK false off time TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 4 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 4 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 4 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 4 blank time TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 24 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 24 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 24 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 24 hysteresis -end TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 2 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 2 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 2 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 2 -start TMC_RX ms_left:4 8, 5 255 108 20 0 130 132 78 CRC-OK 1 TMC_RX ms_left:4 8, 5 255 108 20 0 130 132 78 CRC-OK 1 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 1 TMC_RX ms_left:5 8, 5 255 108 20 0 130 132 78 CRC-OK 1 Stallguard thrs DRVSTATUS X Y Z Z2 stst TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK X TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK X TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK X TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK X olb TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 
192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK ola TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK s2gb TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK s2ga TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK otpw TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK ot TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 157C TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 150C TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK 143C TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK 120C TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:4 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK s2vsa s2vsb Driver registers: TMC_RX 
ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK X 0xC0:08:00:00 TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK Y 0xC0:08:00:00 TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK Z 0xC0:08:00:00 TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK Z2 0xC0:08:00:00 Testing X connection... TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK OK Testing Y connection... TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK OK Testing Z connection... TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK OK Testing Z2 connection... TMC_RX ms_left:5 8, 5 255 111 192 8 0 0 94 CRC-OK OK ok ```
ghost commented 5 years ago

Right, I've added a simple retry loop (retries up to 3 times on this run) to the read function. This is what I get on my AVR ..

M122 output

```
09:42: M122
X
Enabled
TMC_RX ms_left:9 8, 5 255 6 32 0 3 65 26 Retry-1 CRC-OK false
Set current 1000
RMS current
TMC_RX ms_left:9 8, 5 255 108 22 0 130 130 245 Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 5 255 108 22 0 130 132 245 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 19 0 131 132 212 Retry-3 CRC-ERROR 994
MAX current
TMC_RX ms_left:9 8, 5 255 108 22 0 129 194 245 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 19 0 131 132 213 Retry-2 CRC-ERROR
TMC_RX ms_left:10 8, 5 255 108 22 0 130 132 213 Retry-3 CRC-OK 1402
Run current 17/31
Hold current 8/31
CS actual
TMC_RX ms_left:10 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK 8/31
PWM scale
TMC_RX ms_left:9 8, 5 255 113 0 0 0 13 243 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 113 0 0 0 5 243 Retry-2 CRC-ERROR
TMC_RX ms_left:8 8, 5 255 113 0 0 0 10 246 Retry-3 CRC-OK 10
vsense
TMC_RX ms_left:9 8, 5 255 108 22 0 130 132 213 Retry-1 CRC-OK 0=.325
stealthChop
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK true
msteps
TMC_RX ms_left:9 8, 5 255 108 19 0 130 132 213 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 27 0 129 132 213 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 22 0 129 130 210 Retry-3 CRC-ERROR 256
tstep
TMC_RX ms_left:-1 7, 0 5 255 25 128 232 255 29 Retry-1 CRC-OK
TMC_RX ms_left:-1 7, 0 5 255 17 0 232 255 29 Retry-2 CRC-OK
TMC_RX ms_left:-1 7, 0 5 255 18 0 232 255 29 Retry-3 CRC-OK 0
pwm threshold 52
[mm/s]
TMC_RX ms_left:10 8, 5 255 108 22 0 130 132 213 Retry-1 CRC-OK 76.02
OT prewarn
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 95 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 12 0 0 95 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 95 Retry-3 CRC-ERROR false
off time
TMC_RX ms_left:9 8, 5 255 110 19 0 130 132 213 Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 5 255 108 22 0 130 132 213 Retry-2 CRC-OK 4
blank time
TMC_RX ms_left:9 8, 5 255 108 27 0 129 132 213 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 22 0 129 130 210 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 110 11 0 131 132 213 Retry-3 CRC-ERROR 16
hysteresis
-end
TMC_RX ms_left:8 8, 5 255 108 22 0 130 132 245 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 22 0 130 132 210 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 22 0 130 132 213 Retry-3 CRC-OK 2
-start
TMC_RX ms_left:9 8, 5 255 108 139 0 130 130 212 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 110 11 0 130 132 213 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 22 0 129 132 212 Retry-3 CRC-ERROR 1
Stallguard thrs
DRVSTATUS X
stst
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK X
olb
TMC_RX ms_left:10 8, 5 255 111 192 4 0 0 95 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 12 0 0 95 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-3 CRC-OK
ola
TMC_RX ms_left:9 8, 5 255 103 192 12 0 0 94 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 95 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-3 CRC-OK
s2gb
TMC_RX ms_left:9 8, 5 255 111 192 12 0 0 94 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 4 0 0 79 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 4 0 0 95 Retry-3 CRC-ERROR
s2ga
TMC_RX ms_left:9 8, 5 255 111 192 4 0 0 95 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-2 CRC-OK
otpw
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK
ot
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK
157C
TMC_RX ms_left:9 8, 5 255 111 192 4 0 0 94 Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 5 255 111 192 8 0 0 95 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 12 0 0 95 Retry-3 CRC-ERROR
150C
TMC_RX ms_left:9 8, 5 255 111 192 4 0 0 94 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-2 CRC-OK
143C
TMC_RX ms_left:10 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK
120C
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK
s2vsa
s2vsb
Driver registers:
TMC_RX ms_left:10 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK X 0xC0:08:00:00
Testing X connection...
TMC_RX ms_left:9 8, 5 255 111 192 8 0 0 94 Retry-1 CRC-OK OK
ok
```

ghost commented 5 years ago

The retries certainly help on the AVR, but they don't fix the source of the problem - something interfering with the software serial RX routine.

ghost commented 5 years ago

This is the read routine that I modified to do up to 10 retries. It's a very simple addition ..

uint32_t TMC2208Stepper::read(uint8_t addr) {
    uint8_t len = 3;
    addr |= TMC_READ;
    uint8_t datagram[] = {TMC2208_SYNC | 0xf0, TMC2208_SLAVE_ADDR, addr, 0x00};

    datagram[len] = calcCRC(datagram, len);

    uint64_t out = 0;

    for (uint8_t retry = 1; retry <= 10; retry++)
    {
        #if SW_CAPABLE_PLATFORM
            if (SWSerial != NULL) {
                    SWSerial->listen();
                    out = _sendDatagram(*SWSerial, datagram, len, replyDelay, full_duplex);
                    SWSerial->stopListening();
            } else
        #endif
            {
                out = _sendDatagram(*HWSerial, datagram, len, replyDelay, false);
            }

        uint8_t out_datagram[] = {(uint8_t)(out>>56), (uint8_t)(out>>48), (uint8_t)(out>>40), (uint8_t)(out>>32), (uint8_t)(out>>24), (uint8_t)(out>>16), (uint8_t)(out>>8), (uint8_t)(out>>0)};
        if (calcCRC(out_datagram, 7) == (uint8_t)(out&0xFF)) {
            CRCerror = false;
        } else {
            CRCerror = true;
            // probably better to return nothing rather than random bad data.
            out = 0;
        }

        // debug
        SERIAL_ECHOPAIR("Retry-", retry);
        if (CRCerror)
            SERIAL_ECHOPGM("  CRC-ERROR\n");
        else
            SERIAL_ECHOPGM("  CRC-OK\n");

        if (out != 0 && !CRCerror)
            break;
    }

    return out>>8;
}
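The retry idea in the routine above can also be separated from the transport. `readWithRetry` below is a hypothetical, generic form of it, not code from TMCStepper: the caller passes in whatever performs one read attempt and reports whether the CRC was clean.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical generic form of the retry idea: call readOnce until it reports
// a good frame or maxTries is used up. readOnce returns true on a CRC-clean
// read and stores the value in *value. Returns the number of tries used, or 0
// if every try failed.
int readWithRetry(const std::function<bool(uint32_t*)>& readOnce,
                  int maxTries, uint32_t* value) {
    for (int attempt = 1; attempt <= maxTries; attempt++) {
        if (readOnce(value))
            return attempt;
    }
    *value = 0;   // nothing trustworthy was read
    return 0;
}
```

One thing to keep in mind with any such loop is that the worst-case blocking time of a single read becomes `maxTries` times the reply timeout.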

ghost commented 5 years ago

Here's the AVR start-up result (the retries mask the problem but don't solve it) ..

Off topic, but I notice that the first thing Marlin does at boot-up is to interrogate the TMC2208 before doing anything else. I guess that's the intention?

Boot up

```
22:29: start
echo:Marlin bugfix-2.0.x
echo: Last Updated: 2018-01-20 | Author: (doggyfan, AVR)
echo:Compiled: May 15 2019
echo: Free Memory: 1758 PlannerBufferBytes: 1456
TMC_RX ms_left:9 8, 5 255 108 22 0 130 132 213 Retry-1 CRC-OK
echo:Hardcoded Defau 22:30: lt Settings Loaded
echo: G21 ; Units in mm (mm)
echo: M149 C ; Units in Celsius
echo:Filament settings: Disabled
echo: M200 D1.75
echo: M200 D0
echo:Steps per unit:
echo: M92 X50.00 Y50.00 Z400.00 E84.88
echo:Maximum feedrates (units/s):
echo: M203 X225.00 Y225.00 Z22.00 E400.00
echo:Maximum Acceleration (units/s2):
echo: M201 X1800.00 Y1800.00 Z176.00 E3200.00
echo:Acceleration (units/s2): P R T
echo: M204 P1800.00 R3200.00 T1800.00
echo:Advanced: B S T X Y Z E
echo: M205 B20000.00 S0.00 T0.00 X10.00 Y10.00 Z1.00 E5.00
echo:Home offset:
echo: M206 X0.00 Y0.00 Z0.00
echo:Filament Runout Sensor:
echo: M412 S1
echo:Mesh Bed Leveling:
echo: M420 S0 Z0.00
echo:Material heatup parameters:
echo: M145 S0 H180 B70 F0
echo: M145 S1 H240 B110 F0
echo:PID settings:
echo: M301 P16.25 I2.67 D24.68
echo: M304 P57.48 I1.36 D608.58
echo:Power-Loss Recovery:
echo: M413 S1
echo:Stepper driver current:
echo: M906 X1000
echo:Hybrid Threshold:
echo: M913
TMC_RX ms_left:9 8, 5 255 110 22 0 130 132 213 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 5 255 108 22 0 130 132 213 Retry-2 CRC-OK X76
echo:Driver stepping mode:
echo:
TMC_RX ms_left:10 8, 5 255 0 0 0 0 192 141 Retry-1 CRC-OK M569 S1 X
echo:Filament load/unload lengths:
echo: M603 L0.00 U100.00
echo:Filament runout sensor:
echo: M412 S1
echo:SD card ok
Testing X connection...
TMC_RX ms_left:9 8, 5 255 111 192 17 0 0 211 Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 5 255 111 192 17 0 0 166 Retry-2 CRC-OK OK
echo:SD card ok
```

gloomyandy commented 5 years ago

That error rate for the AVR seems very high. Is that using the standard Software Serial or your modified half duplex version? Though I suspect that both will have the same error rate (unless your code is very different in how it handles read operations?).

Good to see exactly what is going on though! There doesn't seem to be much of a pattern to the errors (at least I can't spot one). I suspect all it takes to cause problems is for another interrupt to run just as a byte starts. I suppose the good news is that this code is not used while the steppers are running (or at least I hope it isn't), as I suspect things would be much worse then.

@doggyfan "interestingly, your CRC checking routine clears the frame as being valid if it's all zero's."

Yes I noticed that when I was adding the code to support half duplex. There is also nothing that ever checks for a CRC error.
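
To illustrate the all-zero case, here is a minimal sketch of the check. It assumes the CRC-8 variant described in the TMC2208 datasheet (polynomial 0x07, zero initial value, each byte fed least significant bit first); `tmcCrc8` and `replyLooksValid` are hypothetical names, not functions from the library. With a zero initial value, seven zero bytes give a CRC of zero, so an all-zero frame passes unless it is rejected explicitly.

```cpp
#include <stdint.h>
#include <stddef.h>

// CRC-8 over `len` bytes: polynomial 0x07, initial value 0,
// each byte processed least significant bit first (assumed datasheet variant).
uint8_t tmcCrc8(const uint8_t* data, size_t len) {
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t current = data[i];
        for (uint8_t bit = 0; bit < 8; bit++) {
            if (((crc >> 7) ^ (current & 0x01)) != 0)
                crc = (uint8_t)((crc << 1) ^ 0x07);
            else
                crc = (uint8_t)(crc << 1);
            current >>= 1;
        }
    }
    return crc;
}

// Hypothetical helper: accept an 8-byte reply only if the CRC matches
// AND the frame is not all zeros (which would otherwise pass the CRC).
bool replyLooksValid(const uint8_t frame[8]) {
    bool anyNonZero = false;
    for (size_t i = 0; i < 8; i++) {
        if (frame[i] != 0) anyNonZero = true;
    }
    return anyNonZero && tmcCrc8(frame, 7) == frame[7];
}
```

Fed the frame `05 FF 6F C0 08 00 00`, this routine produces `5E`, which matches the frames reported as CRC-OK in the logs in this thread.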

There is also the interesting question of potential errors when writing to the TMC2208. If there is an error the TMC2208 will just ignore the frame (which if the command is important could be bad). There is a mechanism to allow the host to check for errors but it would need the host to read a register after each command (or at least after a series of commands).

I did wonder about adding retries. My main concern is that it potentially makes the polling of all the steppers take a relatively long time. But I'm not sure if that is a problem or not.

The current AVR Software Serial is tricky to improve. With the code on the LPC176x we can use a slower baud rate, which makes it easier to detect the RX frames correctly even when other interrupts are active. This also spreads the RX interrupts out, which is probably a good thing for the rest of the system. But on the AVR the read routine captures the entire 10 bits inside the interrupt routine, so a lower baud rate (which would probably reduce the errors) means other interrupts are potentially blocked for longer (which is probably not good). Unfortunately I don't think the AVR is fast enough to run the same oversampling mechanism used for the LPC176x. Finally, the Arm processors typically allow multiple interrupt priorities, whereas the AVR has pretty much just the one.
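
To put rough numbers on that trade-off: one character is 10 bits (start, 8 data, stop), so the time spent capturing it inside the interrupt scales as 10 / baud. The sketch below is only that arithmetic; `charTimeMicros` is a hypothetical helper and the baud rates in the comments are illustrative assumptions, not values taken from this thread.

```cpp
#include <stdint.h>

// Microseconds needed to receive one 10-bit character at the given baud rate
// (integer division, so the result is rounded down).
constexpr uint32_t charTimeMicros(uint32_t baud) {
    return (10UL * 1000000UL) / baud;
}

// Illustrative values only:
//   charTimeMicros(115200) ->   86 us per character
//   charTimeMicros(19200)  ->  520 us per character
//   charTimeMicros(9600)   -> 1041 us per character
```

Dropping the baud rate by a factor of twelve therefore keeps other interrupts waiting roughly twelve times longer per received byte, which is the cost being weighed against the lower error rate.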

ghost commented 5 years ago

I'm using the standard software serial routine not mine (I only made a tiny change anyway for single pin operation).

I changed the debug output to display hex, as that makes the problem bits easier to spot ..

M122 output

```
17: M122
X
Enabled
TMC_RX ms_left:9 8, 05 FF 06 20 00 01 41 1A Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 06 20 00 01 41 19 Retry-2 CRC-ERROR
TMC_RX ms_left:-1 -1, 00 05 06 20 00 03 41 1A Retry-3 CRC-OK
TMC_RX ms_left:10 8, 05 FF 03 20 00 03 40 1A Retry-4 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 06 20 00 03 41 1A Retry-5 CRC-OK
false
Set current 1000
RMS current
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 82 F5 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 81 84 D4 Retry-2 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6C 1B 00 81 82 D2 Retry-3 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 81 82 D2 Retry-4 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 84 D5 Retry-5 CRC-OK
994
MAX current
TMC_RX ms_left:10 8, 05 FF 6C 16 00 81 84 D5 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 81 82 D2 Retry-2 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6C 16 00 82 84 D5 Retry-3 CRC-OK
1402
Run current 17/31
Hold current 8/31
CS actual
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
8/31
PWM scale
TMC_RX ms_left:9 8, 05 FF 71 00 00 00 0A F6 Retry-1 CRC-OK
10
vsense
TMC_RX ms_left:9 8, 05 FF 6E 17 00 81 82 D5 Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6C 16 00 82 84 D5 Retry-2 CRC-OK
0=.325
stealthChop
TMC_RX ms_left:10 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
true
msteps
TMC_RX ms_left:9 8, 05 FF 6C 16 00 81 82 F5 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6E 16 00 83 82 D2 Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 C2 F5 Retry-3 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6E 0B 00 81 84 D5 Retry-4 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 13 00 81 82 F5 Retry-5 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 1B 00 83 84 D5 Retry-6 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6E 17 00 83 86 D5 Retry-7 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6C 16 00 82 84 D5 Retry-8 CRC-OK
4
tstep
TMC_RX ms_left:-1 7, 00 05 FF 12 00 0F FF 1D Retry-1 CRC-OK
TMC_RX ms_left:-1 -1, 00 05 12 00 0F FF FF 1D Retry-2 CRC-OK
TMC_RX ms_left:10 8, 05 FF 12 00 0F FF FF 1D Retry-3 CRC-OK
max
pwm threshold 52
[mm/s]
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 84 D5 Retry-1 CRC-OK
76.02
OT prewarn
TMC_RX ms_left:9 8, 05 FF 6F C0 04 00 00 5E Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-3 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 04 00 00 4F Retry-4 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 04 00 00 5E Retry-5 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-6 CRC-OK
false
off time
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 84 D5 Retry-1 CRC-OK
4
blank time
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 82 F5 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 84 D5 Retry-2 CRC-OK
24
hysteresis
-end
TMC_RX ms_left:9 8, 05 FF 6E 16 00 82 84 F5 Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6C 16 00 82 84 D5 Retry-2 CRC-OK
2
-start
TMC_RX ms_left:9 8, 05 FF 6E 16 00 82 84 D5 Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6C 16 00 82 84 D5 Retry-2 CRC-OK
1
Stallguard thrs
DRVSTATUS X
stst
TMC_RX ms_left:9 8, 05 FF 6F C0 84 00 00 5F Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-2 CRC-OK
X
olb
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-2 CRC-OK
ola
TMC_RX ms_left:-1 -1, 00 00 05 0B 04 00 00 5E Retry-1 CRC-OK
TMC_RX ms_left:10 8, 05 FF 6F C0 08 00 00 5E Retry-2 CRC-OK
s2gb
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
s2ga
TMC_RX ms_left:10 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
otpw
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
ot
TMC_RX ms_left:9 8, 05 FF 6F C0 0C 00 00 5E Retry-1 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6F C0 08 00 00 5F Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 6F Retry-3 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-4 CRC-OK
157C
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
150C
TMC_RX ms_left:10 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
143C
TMC_RX ms_left:8 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
120C
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-1 CRC-OK
s2vsa
s2vsb
Driver registers:
TMC_RX ms_left:9 8, 05 FF 6F C0 84 00 00 5F Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-3 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-4 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-5 CRC-OK
X 0xC0:08:00:00
Testing X connection...
TMC_RX ms_left:9 8, 05 FF 6F C0 84 00 00 5E Retry-1 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 04 00 00 5E Retry-2 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-3 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5F Retry-4 CRC-ERROR
TMC_RX ms_left:10 8, 05 FF 6F C0 08 00 00 5F Retry-5 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 80 00 5F Retry-6 CRC-ERROR
TMC_RX ms_left:9 8, 05 FF 6F C0 08 00 00 5E Retry-7 CRC-OK
OK
ok
```

ghost commented 5 years ago

Something seems to have changed though, because the serial comms to the TMC chips used to be mostly OK. Maybe a change to Marlin's code at some point has made this problem worse?

ghost commented 5 years ago

Yes, I notice you're not checking the TMC reply when you write to the registers. I also see that after writing to a TMC register the code hangs around for 10ms doing nothing. It could be reading the reply during that time, and maybe retrying the write if need be.

But the little retry code really ought to be added. You can set the number of retries to a smaller number than 10 (which I currently use for debugging).
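
A minimal sketch of such a retry wrapper, assuming a caller-supplied `readOnce` callable that performs one request/reply exchange and returns `true` only when the reply passes its checks. `RetryResult` and `readWithRetries` are hypothetical names, not part of the library.

```cpp
#include <stdint.h>

// Outcome of a bounded retry loop.
struct RetryResult {
    bool ok;           // true if some attempt succeeded
    uint8_t attempts;  // number of attempts actually made
};

// Calls readOnce() until it returns true or maxAttempts is reached.
template <typename ReadOnce>
RetryResult readWithRetries(ReadOnce readOnce, uint8_t maxAttempts) {
    RetryResult result = {false, 0};
    while (result.attempts < maxAttempts) {
        result.attempts++;
        if (readOnce()) {
            result.ok = true;
            break;
        }
    }
    return result;
}
```

Because every failed attempt can cost up to one full reply timeout, the worst-case time to poll a driver grows linearly with `maxAttempts`, which is the reason to keep the limit small outside of debugging.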

I tend to think that configuring and checking the stepper config registers is very important, as the machine as a whole can't really function if even a single stepper is misconfigured.

gloomyandy commented 5 years ago

@doggyfan Just a note: this code is not "my" code. I've made very little contribution to it. It is what it is, and I'm sure it has had many contributors in the past (all of whom have probably done their best to improve it). I have no more control over the code than you do (that unenviable job currently falls to @teemuatlut).

I'd also noticed the delay in the write. I have no idea why it is there or whether it is required for some reason. @teemuatlut do you know if there is a reason for this? Is there any sort of issue with sending commands to the TMC2208 at a higher rate or anything?

gloomyandy commented 5 years ago

I've updated my git repo version of the code to include checking the register address field of the reply (so we now search for 24 bits of data). I've also removed the code for half/full duplex modes as it is no longer needed. I've tested it on the SKR V1.3 in full and half duplex modes and it seems fine.
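
For readers following along, a 24-bit sync search can be sketched as below. This is not the code from the repo, just an illustration assuming the reply starts with the sync byte 0x05, the master address 0xFF, and then the register address that was requested; `ReplySyncDetector` is a hypothetical name.

```cpp
#include <stdint.h>

// Slides a 24-bit window over the incoming bytes and reports when the last
// three bytes equal: sync (0x05), master address (0xFF), register address.
class ReplySyncDetector {
public:
    explicit ReplySyncDetector(uint8_t regAddr)
        : pattern_((0x05UL << 16) | (0xFFUL << 8) | regAddr), window_(0) {}

    // Feed one received byte; returns true when the window matches the pattern.
    bool feed(uint8_t b) {
        window_ = ((window_ << 8) | b) & 0xFFFFFFUL;
        return window_ == pattern_;
    }

private:
    uint32_t pattern_;
    uint32_t window_;
};
```

With the logged capture `00 05 FF 12 00 0F FF 1D` and a request for register 0x12, the detector fires on the fourth byte, so a stray leading byte no longer shifts the frame. A capture such as `00 05 06 20 00 03 41 1A` never matches for register 0x06 because the 0xFF address byte is missing.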

@teemuatlut any thoughts on where you want to go with this? I'm happy to add in the retry code from above if we want to include that? Just let me know.

gloomyandy commented 5 years ago

@doggyfan this is the mechanism that can be used to ensure that write operations have been received correctly....

Each accepted write datagram becomes acknowledged by the receiver by incrementing an internal
cyclic datagram counter (8 bit). Reading out the datagram counter allows the master to check the
success of an initialization sequence or single write accesses. Read accesses do not modify the
counter.

As you mentioned, adding such a test after every write in place of the current delay would not add much (if any) overhead, except I suppose in the case of retries. It uses register IFCNT at address 2. I'm not sure what happens if you reset the printer rather than power cycling it. I suppose the startup code would need to read the counter first (or perhaps just issue a write that "fails", as I assume that when an error is detected we will need to set the shadow register to match the value in the driver). The register is read only, so it can't be reset (unless some sort of software reset command does that?).
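
A rough sketch of that check: read IFCNT, write, read IFCNT again, and accept the write only if the 8-bit counter advanced by exactly one (modulo 256). The `Driver` type with `readRegister` and `writeRegister` members is an assumed interface for illustration, not the library's actual API, and `writeVerified` is a hypothetical name.

```cpp
#include <stdint.h>

constexpr uint8_t IFCNT_ADDR = 0x02;  // interface transmission counter

// Writes `value` to `reg` and confirms it via IFCNT, retrying up to
// maxAttempts times. Driver is any type providing (assumed interface):
//   uint32_t readRegister(uint8_t addr);
//   void     writeRegister(uint8_t addr, uint32_t value);
template <typename Driver>
bool writeVerified(Driver& drv, uint8_t reg, uint32_t value, uint8_t maxAttempts) {
    for (uint8_t attempt = 0; attempt < maxAttempts; attempt++) {
        const uint8_t before = (uint8_t)drv.readRegister(IFCNT_ADDR);
        drv.writeRegister(reg, value);
        const uint8_t after = (uint8_t)drv.readRegister(IFCNT_ADDR);
        // The counter is cyclic, so compare the 8-bit difference.
        if ((uint8_t)(after - before) == 1) return true;
    }
    return false;
}
```

Reading the counter before each write sidesteps the question of what value it holds after a printer reset, at the cost of one extra read per write. The IFCNT reads themselves travel over the same unreliable link, so in practice they would need the same CRC checking and retries as any other read.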

teemuatlut commented 5 years ago

I very rarely see PRs against my work, so it's actually not that many people. The delay at the end of the write was more for safety, but I don't think it's required.

I'll pull your repo and try it. I just got my SKR v1.3 a few days ago, so I'll see how it works on that.

ghost commented 5 years ago

Oops, sorry, I thought a lot of the code was yours @gloomyandy, my bad :)

24-bit sync check now - cool !

Seems the SKR 1.3 board is becoming extremely popular.

So, we haven't really solved the AVR problem (though the retries help a lot in my case), but we have made the Rx frame reading/detection bombproof. It's now immune to spurious leading/trailing bits and bytes.

teemuatlut commented 5 years ago

When you guys prepared the driver boards, what did you do with the TX 1kOhm resistor? Does the SKR board have it pre-installed?