Klipper3d / klipper

Klipper is a 3d-printer firmware
GNU General Public License v3.0

TMC5160 w/SKR v1.3 using 24v causes driver shutdown during print - timing issues? #2516

Closed Nandox7 closed 4 years ago

Nandox7 commented 4 years ago

Hey,

So I was running the TMC5160 over SPI on a SKR v1.3 for quite some time without any issue. Due to a heatbed upgrade I moved to a 24v PSU and problems started.

I'm using the exact same Klipper settings and the exact same gcode that printed well before. A few minutes into the print (1 or 2), one of the axis drivers shuts down with an error, either Y or X.

Example from the DUMP_TMC command for the Y driver. [After ERROR]
11:55:47.966: // ========== Write-only registers ==========
11:55:47.966: // COOLCONF: 00020000 sgt=2
11:55:47.966: // IHOLD_IRUN: 00060303 IHOLD=3 IRUN=3 IHOLDDELAY=6
11:55:47.967: // PWMCONF: c40d001e PWM_OFS=30 pwm_freq=1 pwm_autoscale=1 pwm_autograd=1 PWM_REG=4 PWM_LIM=12
11:55:47.967: // TPOWERDOWN: 0000000a TPOWERDOWN=10
11:55:47.967: // ========== Queried registers ==========
11:55:47.971: // GCONF: 00000004 en_pwm_mode=1
11:55:47.974: // CHOPCONF: 14410153 toff=3 hstrt=5 hend=2 tbl=2 tpfd=4 MRES=4(16usteps) intpol=1
11:55:47.980: // GSTAT: 00000007 reset=1(reset) drv_err=1(ErrorShutdown!) uv_cp=1(Undervoltage!)
11:55:47.986: // DRV_STATUS: 81036000 s2vsb=1 stealth=1 CSACTUAL=3 stallGuard=1 stst=1
11:55:47.991: // FACTORY_CONF: 0000000c FACTORY_CONF=12
11:55:47.995: // IOIN: 300000c0 SD_MODE=1 SWCOMP_IN=1 VERSION=0x30
11:55:47.999: // LOST_STEPS: 00000000
11:55:48.003: // MSCNT: 000000e8 MSCNT=232
11:55:48.007: // MSCURACT: 002300f4 CUR_A=244 CUR_B=35
11:55:48.011: // OTP_READ: 0000000c OTP_FCLKTRIM=12
11:55:48.015: // PWM_SCALE: 00000014 PWM_SCALE_SUM=20
11:55:48.019: // PWM_AUTO: 000d009f PWM_OFS_AUTO=159 PWM_GRAD_AUTO=13
11:55:48.023: // TSTEP: 000fffff TSTEP=1048575
11:55:48.023: ok

GSTAT: 00000007 reset=1(reset) drv_err=1(ErrorShutdown!) uv_cp=1(Undervoltage!)
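For anyone reading the raw value rather than the decoded text, here is a minimal Python sketch of how the three GSTAT flags map onto the hex number. The bit positions (reset = bit 0, drv_err = bit 1, uv_cp = bit 2) are my reading of the TMC5160 register map and match the order Klipper prints them in; the helper name `decode_gstat` is made up for this example.

```python
# Assumed bit layout of GSTAT (reset, drv_err, uv_cp in bits 0..2).
GSTAT_BITS = {"reset": 0, "drv_err": 1, "uv_cp": 2}

def decode_gstat(value):
    """Return each GSTAT flag as 0 or 1 for a raw register value."""
    return {name: (value >> bit) & 1 for name, bit in GSTAT_BITS.items()}

# 0x07 is the value from the dump above: all three flags are set.
print(decode_gstat(0x00000007))
# 0x05 would mean reset and uv_cp are set, but no driver error.
print(decode_gstat(0x00000005))
```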

I could assume it was a hardware issue, but after testing many things I ended up installing Marlin and it works fine with it. So it has to be a software issue (or driver configuration, more about that below).

I did notice when setting up Marlin that it includes settings for Trinamic drivers that depend on the voltage, which leads me to conclude that either the drivers or the controller work differently depending on the supply voltage. I'd expect that for the currents, but from briefly scanning the code it also changes other parameters.

/**

So it seems there are some timing issues that should be accounted for when using a higher-voltage supply; so far I have not found such settings in Klipper.

alanbithell commented 4 years ago

I'm having the same issue with my 24v Rumba board, TMC5160 (Bigtreetech v1.2) and I cannot put more than 1A of run current into the steppers before they cut out. When I initialize them at 1.0A or less they work, but DUMP_TMC reports this

Recv: // ========== Write-only registers ==========
Recv: // COOLCONF:   00000000
Recv: // GLOBALSCALER: 00000054 GLOBALSCALER=84
Recv: // IHOLD_IRUN: 00061f05 IHOLD=5 IRUN=31 IHOLDDELAY=6
Recv: // PWMCONF:    c40d001e PWM_OFS=30 pwm_freq=1 pwm_autoscale=1 pwm_autograd=1 PWM_REG=4 PWM_LIM=12
Recv: // TPOWERDOWN: 0000000a TPOWERDOWN=10
Recv: // ========== Queried registers ==========
Recv: // GCONF:      00000000
Recv: // CHOPCONF:   14410153 toff=3 hstrt=5 hend=2 tbl=2 tpfd=4 MRES=4(16usteps) intpol=1
Recv: // GSTAT:      00000005 reset=1(reset) uv_cp=1(Undervoltage!)
Recv: // DRV_STATUS: 011f0000 CSACTUAL=31 stallGuard=1
Recv: // FACTORY_CONF: 0000000b FACTORY_CONF=11
Recv: // IOIN:       30000042 REFR_DIR=1 SD_MODE=1 VERSION=0x30
Recv: // LOST_STEPS: 00000000
Recv: // MSCNT:      000000fa MSCNT=250
Recv: // MSCURACT:   002100f5 CUR_A=245 CUR_B=33
Recv: // OTP_READ:   0000000b OTP_FCLKTRIM=11
Recv: // PWM_SCALE:  0000001d PWM_SCALE_SUM=29
Recv: // PWM_AUTO:   0000001d PWM_OFS_AUTO=29
Recv: // TSTEP:      00000832 TSTEP=2098

Why would Undervoltage be set?

After they stop moving I get

Recv: // ========== Write-only registers ==========
Recv: // COOLCONF:   00000000
Recv: // GLOBALSCALER: 00000054 GLOBALSCALER=84
Recv: // IHOLD_IRUN: 00061f05 IHOLD=5 IRUN=31 IHOLDDELAY=6
Recv: // PWMCONF:    c40d001e PWM_OFS=30 pwm_freq=1 pwm_autoscale=1 pwm_autograd=1 PWM_REG=4 PWM_LIM=12
Recv: // TPOWERDOWN: 0000000a TPOWERDOWN=10
Recv: // ========== Queried registers ==========
Recv: // GCONF:      00000000
Recv: // CHOPCONF:   14410153 toff=3 hstrt=5 hend=2 tbl=2 tpfd=4 MRES=4(16usteps) intpol=1
Recv: // GSTAT:      00000005 reset=1(reset) uv_cp=1(Undervoltage!)
Recv: // DRV_STATUS: e0050000 CSACTUAL=5 ola=1(OpenLoad_A!) olb=1(OpenLoad_B!) stst=1
Recv: // FACTORY_CONF: 0000000b FACTORY_CONF=11
Recv: // IOIN:       30000052 REFR_DIR=1 DRV_ENN=1 SD_MODE=1 VERSION=0x30
Recv: // LOST_STEPS: 00000000
Recv: // MSCNT:      00000218 MSCNT=536
Recv: // MSCURACT:   010b01db CUR_A=-37 CUR_B=-245
Recv: // OTP_READ:   0000000b OTP_FCLKTRIM=11
Recv: // PWM_SCALE:  00000005 PWM_SCALE_SUM=5
Recv: // PWM_AUTO:   0000001d PWM_OFS_AUTO=29
Recv: // TSTEP:      000fffff TSTEP=1048575
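
Comparing two dumps by eye is error-prone, so here is a small Python sketch that parses DUMP_TMC output and lists the registers whose raw value changed. It assumes each register line looks like the ones above (`// NAME: 8 hex digits, then field=value pairs`); the function names are invented for this example and are not part of Klipper.

```python
import re

# One register per line: "// NAME: <8 hex digits> field=value ...".
LINE_RE = re.compile(r"//\s*(\w+):\s+([0-9a-fA-F]{8})\b(.*)$")
FIELD_RE = re.compile(r"(\w+)=(-?\w+)")

def parse_dump(text):
    """Return {register: (raw_value, {field: value_string})}."""
    regs = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            name, raw, rest = match.groups()
            regs[name] = (int(raw, 16), dict(FIELD_RE.findall(rest)))
    return regs

def changed_registers(before, after):
    """Names of registers present in both dumps whose raw value differs."""
    a, b = parse_dump(before), parse_dump(after)
    return sorted(n for n in a.keys() & b.keys() if a[n][0] != b[n][0])

before = """Recv: // GSTAT:      00000005 reset=1(reset) uv_cp=1(Undervoltage!)
Recv: // DRV_STATUS: 011f0000 CSACTUAL=31 stallGuard=1"""
after = """Recv: // GSTAT:      00000005 reset=1(reset) uv_cp=1(Undervoltage!)
Recv: // DRV_STATUS: e0050000 CSACTUAL=5 ola=1(OpenLoad_A!) olb=1(OpenLoad_B!) stst=1"""

print(changed_registers(before, after))  # ['DRV_STATUS']
```

On the two excerpts above, GSTAT is unchanged and only DRV_STATUS differs, which is where the ola/olb open-load flags show up.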

eehusky commented 4 years ago

The UV flag gets tripped when the charge pump starts getting out of regulation, and that can be caused by any number of fun little layout issues that pop up when larger AC currents get involved. There are some notes in the datasheet regarding the layout. It could also be caused by the VMOT supply fluctuating if there isn't enough bulk capacitance on the base board.

For what it's worth, I am having the same issue with the same setup (but using an SKR Pro). I just ordered some of the official Trinamic step sticks from Digi-Key, and if they have the same problem I'm going to ask them about it. The schematics look the same, but I can't find the artwork for the Bigtreetech version to see how they laid everything out. They also haven't uploaded schematics for their 1.2 revision yet.

I've gone through their [Trinamic] spreadsheet to calculate the nominal chopper values and copied the register values Marlin uses but can't seem to get past this. Guessing I will need to bust out the old OScope before all is said and done.

I'm curious: if you clear the reset and uv_cp flags after startup, does uv_cp show up again when your stepper stops working? (The register is W1C, i.e. write 1 to clear.) Having a UV flag show up on startup isn't necessarily a problem.

SET_TMC_FIELD STEPPER=stepper_x FIELD=reset VALUE=1
SET_TMC_FIELD STEPPER=stepper_x FIELD=uv_cp VALUE=1
SET_TMC_FIELD STEPPER=stepper_y FIELD=reset VALUE=1
SET_TMC_FIELD STEPPER=stepper_y FIELD=uv_cp VALUE=1

Should now report no flags in the GSTAT field until it is tripped again.
DUMP_TMC STEPPER=stepper_x
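
In case W1C is unfamiliar, here is a toy Python model of a write-1-to-clear register. It is only an illustration of the semantics, not driver or Klipper code, and the class and method names are made up: writing a 1 to a bit clears that flag, writing a 0 leaves it alone, and only the hardware can set a flag.

```python
class W1CRegister:
    """Toy model of a write-1-to-clear status register."""

    def __init__(self, value=0):
        self.value = value

    def hw_set(self, mask):
        # The hardware raises flags; software cannot set them directly.
        self.value |= mask

    def write(self, mask):
        # Each 1 bit in the written value clears the matching flag.
        self.value &= ~mask

RESET, DRV_ERR, UV_CP = 0b001, 0b010, 0b100

gstat = W1CRegister(RESET | UV_CP)  # 0x05, as in the dumps in this thread
gstat.write(0)                      # writing zeros changes nothing
print(hex(gstat.value))             # 0x5
gstat.write(RESET)                  # clears only the reset flag
print(hex(gstat.value))             # 0x4
gstat.write(UV_CP)                  # clears uv_cp
print(hex(gstat.value))             # 0x0
gstat.hw_set(UV_CP)                 # flag reappears if the fault recurs
print(hex(gstat.value))             # 0x4
```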

jffmichi commented 4 years ago

Did anyone get the TMC5160s to work reliably? I'm currently trying to get the Bigtreetech v1.2 ones working on a small test board with an Arduino before I put them into anything more complicated, and I'm starting to lose my mind.

For currents of up to about 1.0A, or when in StealthChop mode, they seem to work just fine during testing. I tested both a NEMA 17 and a NEMA 23 stepper motor. As soon as the driver switches into SpreadCycle (either because the en_pwm_mode bit is not set or because of TPWMTHRS) with more than 1.0A, the problem described above starts to occur. The driver shuts down after a few seconds and the uv_cp, ola and olb bits get set. When that happens I can't reset the driver in software. Setting the ENN pin HIGH, setting TOFF=0, and trying to write-clear the uv_cp flag all fail. I have to cut 24V power to the VM pin. I can, however, drop VM to 5V (Schottky diode between VIO and VM) and then give 24V back to the driver, and it starts working again for a short time. The problem does not seem to occur when running the motor at higher rpm, i.e. well above 120. When accelerating fast, the driver mostly manages to reach that rpm region, but not always. Also, writing anything to DRV_CONF (even the default values) seems to make the problem worse, and the driver then doesn't even work at 1.0A or below any more. I double and triple checked that the values are correct.

I tried to figure out what's going on and did some measurements of the charge pump voltage during operation. As you can see in the screenshots the charge pump voltage drops a little when running at higher rpms and in SpreadCycle mode but this doesn't seem to be a problem for the driver. When running at lower rpm in SpreadCycle mode the charge pump voltage suddenly drops down to 24V and the described symptoms arise. However, I somehow doubt that the charge pump voltage dropping is the cause of the issue as the issue happens when the charge pump voltage seems to be at a reasonable value.

StealthChop (120rpm, working correctly): 01_stealthchop_120rpm
SpreadCycle (120rpm, driver shuts down): 02_spreadcycle_120rpm
SpreadCycle (480rpm, accelerate and decelerate, driver shuts down): 03_spreadcycle_480rpm (accelerate and decelerate)
SpreadCycle (480rpm, fast accelerate and hold for a while, working correctly but sometimes shuts down during acceleration/deceleration): 04_spreadcycle_480rpm (fast accelerate and hold)

Also no other settings seem to really influence the described problem. I tested a lot of different values for the registers that seem to make sense, e.g. different chopper config timings from the Marlin firmware or default values from the datasheet. I also tried disabling the short-to-ground and short-to-vs detection and quite a few other things but nothing worked so far... I'm not sure if it's even a software or setting issue at this point or if the drivers from Bigtreetech are just bad in this case...

@eehusky: did you test the official step stick drivers and did it make a difference?

langwadt commented 4 years ago

Might be obvious, but is the clock pin connected to ground?

jffmichi commented 4 years ago

@langwadt: unfortunately that doesn't seem to be the issue. I have it manually tied to ground, and it should also be tied to ground through a resistor according to some text from Bigtreetech I remember reading a few days ago. I measured 10 kOhm to ground, so that seems to be correct. For me it didn't make a difference what I did with it anyway: I tried leaving it open or even pulling it to VIO.

I also tried three TMC5160 v1.2 from Bigtreetech and all show the same symptoms so as far as I can tell it doesn't seem to be a single bad part.

alanbithell commented 4 years ago

To get around the issue I ended up setting run current to 1.0A in the config and then in my gcode start script I put SET_TMC_CURRENT STEPPER=stepper_x CURRENT=2.0 for each axis. The only time the steppers have tripped since is when I left my board cooling fan off and they must have overheated.

eehusky commented 4 years ago

@jffmichi Thanks for the scope shots, they give me something to compare to. The drivers are sitting on my desk but I haven't had an opportunity to swap them out yet. I should be able to do it this weekend though. I did read this the other day on the RepRap wiki and it could be what's causing our problems. https://reprap.org/wiki/StepStick#Repair_Attempts

If they ran these through an oven profiled for a 4 layer PWB assembly you can definitely run into problems when ICs have exposed pads under the chip with a lead free process...you need to let them soak a bit longer. (It looks like they moved the 5160 step sticks to a six layer board)

@alanbithell One thing I did notice when poking around in the tmc5160.py file is that the SET_TMC_CURRENT gcode doesn't appear to be changing the global scaler, so I'm not sure you are actually running at 2.0A? I think that is actually a bug in the klippy software, but it isn't related to this problem.

alanbithell commented 4 years ago

@eehusky I noticed the register wasn't changing too, but setting the current to 0.2A definitely reduced the current because my motors started skipping like crazy. So I'm not sure what's going on there; I'll have to dig around and see what it's actually doing.

eehusky commented 4 years ago

@alanbithell That's correct, you can decrease the current settings but you can't increase them. The 5160 has an extra layer of current control. Setting GS (Global Scaler) to some value, let's say 84, with IRUN at 31 and IHOLD at 15 will drive about 1000 mA when active and 500 mA when idle.

With the above values, if you call SET_TMC_CURRENT with a value larger than what was initially configured, there should be zero change in register settings or behavior. But if you call it with a value smaller than what's stored in the configuration, it will decrease the IRUN and IHOLD values while keeping the same GS value.
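
To make that concrete, here is a rough Python sketch of the arithmetic, assuming the 0.075 Ohm sense resistor and 0.325 V full-scale voltage discussed in this thread. It is not Klipper's actual code; `rms_current` and `cs_for_current` are hypothetical helpers that just illustrate why a request above the configured current cannot change anything once GS is fixed.

```python
import math

R_SENSE = 0.075  # ohms, sense resistor on these step sticks
V_FS = 0.325     # volts, full-scale sense voltage

def rms_current(gs, cs):
    """RMS motor current in amps for a GLOBALSCALER and CS (IRUN/IHOLD) value."""
    gs_eff = 256 if gs == 0 else gs  # a register value of 0 means full scale
    return (gs_eff / 256) * ((cs + 1) / 32) * (V_FS / R_SENSE) / math.sqrt(2)

def cs_for_current(gs, amps):
    """CS needed for a requested current when GS stays fixed (clamped to 0..31)."""
    full = rms_current(gs, 31)
    cs = int(32 * amps / full + 0.5) - 1
    return max(0, min(31, cs))

print(round(rms_current(84, 31), 3))  # about 1.0 A running
print(round(rms_current(84, 15), 3))  # about 0.5 A holding
print(cs_for_current(84, 2.0))        # 31: already at the ceiling, no change
print(cs_for_current(84, 0.2))        # a smaller CS, so the current drops
```

With GS stuck at 84, asking for 2.0A clamps CS at 31 (still roughly 1A), while asking for 0.2A lowers CS and really does reduce the current.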

The gist is that for any desired current setting you want to maximize the IRUN and GS values while staying within some other boundary conditions. Below is a sample of what I came up with when I was sorting the math out myself. (This still needs a loop to decrease the initial CS value until valid GS and CS values are reached.) Note that CS (Current Select) refers to either IRUN or IHOLD depending on which state the IC is currently in.

import math
RSense = 0.075
Vfs = 0.325
CS=31
# Absolute maximum current supported (determined by RSense Value)
# Formula from page 74 of TMC5160/TMC5160A DATASHEET (Rev. 1.13 / 2019-NOV-19)
Imax = (256/256) * ((CS+1)/32) *(Vfs/RSense)*(1/math.sqrt(2))
#Imax = (Vfs/RSense)*(1/math.sqrt(2))

# Requested Running Current Value
IRun = 1.0
# Requested Holding Current Value
IHold = 0.5

# Register Value for IRUN
IRUN = CS

# Register Value for IHOLD
# This is the ratio of IHold to IRun scaled to 0-31. If IHold == IRun then the values are the same.
# This is also referred to as CS throughout the documentation.
IHOLD = int((IHold/IRun)*(IRUN+1)) - 1

# Register Value for GLOBALSCALE
GLOBALSCALE = int((IRun / Imax) * 256 +0.5)
if GLOBALSCALE == 256:
    GLOBALSCALE = 0
elif GLOBALSCALE <= 31:
    # str() is needed here: concatenating a str and an int raises TypeError
    raise Exception("uhoh: "+str(GLOBALSCALE))

if GLOBALSCALE <= 128:
    print("Sub optimal, but valid, value selected for GlobalScaler...Greater than 128 recommended, although there are no remarks as to why its better")

if IRUN <= 16:
    print("Sub optimal, but valid, value selected for IRUN...Greater than 16 recommended")

print(GLOBALSCALE)
print(IRUN)
print(IHOLD)

Output:

Sub optimal, but valid, value selected for GlobalScaler...Greater than 128 recommended
84
31
15

Just trying to save you some time, I already banged my head against this particular wall :)
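
The loop mentioned above could look something like the following sketch. It assumes the same RSense and Vfs values as the script, treats a GLOBALSCALER below 32 as invalid (as the script does), and walks CS down from 31 until the scaler becomes valid. `pick_scaler` is a hypothetical name, and this is one possible approach rather than what Klipper does.

```python
import math

R_SENSE = 0.075  # ohms
V_FS = 0.325     # volts

def pick_scaler(run_current, min_gs=32):
    """Return (GLOBALSCALER, CS) for a requested RMS run current in amps.

    Starts at the largest CS and lowers it until GLOBALSCALER is valid.
    """
    for cs in range(31, -1, -1):
        # Largest current reachable with this CS at full global scale.
        i_max = ((cs + 1) / 32) * (V_FS / R_SENSE) / math.sqrt(2)
        gs = int(run_current / i_max * 256 + 0.5)
        if gs > 256:
            raise ValueError("requested current is too high for this sense resistor")
        if gs >= min_gs:
            # A register value of 0 selects full scale (256/256).
            return (0 if gs == 256 else gs), cs
    raise ValueError("requested current is too low to represent")

print(pick_scaler(1.0))  # (84, 31), same as the script above
print(pick_scaler(0.1))  # falls back to a smaller CS so the scaler stays valid
```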

jffmichi commented 4 years ago

@eehusky reminds me of the old baking graphics card in the oven thing. Definitely sounds like a possibility and the described symptoms also match.

The diagnosis is, they simply don't move the stepper, only at very low currents or only a second after turning them on (while they're cold).

However, if the drivers are indeed bad, fortunately I can still return them and buy from Watterott for a few Euros more. Please let us know as soon as you find the time to try the new ones. Thanks in advance :)

alanbithell commented 4 years ago

I found this and it seems to have fixed my issue. I can run flat out at 3A now; not that I need 3A, but it's good to know.

Nandox7 commented 4 years ago

Interesting, I may try those params. Overall my X and Y run quite cool, but the dual Z leaves the steppers super hot, and I'm using 0.5 and 0.3 as run and hold current; those params use 1 and 2.

I imagine it is down to the value of hold_current? In a cartesian printer that is the main difference: the Z axis holds position for much longer than any of the other axes.

SK-StYleZ commented 4 years ago

I had the same issues with the TMC5160 (Bigtreetech) and SKR v1.3 as @alanbithell. I couldn't push the 5160s beyond 35W (24V 1.5A and 35V 1A); the stepper just stopped after a few mm of motion.

Before the test runs I cleared the "reset" and "uv_cp" flags as @eehusky mentioned. Thanks for the hint!

SET_TMC_FIELD STEPPER=stepper_x FIELD=reset VALUE=1
SET_TMC_FIELD STEPPER=stepper_x FIELD=uv_cp VALUE=1
SET_TMC_FIELD STEPPER=stepper_y FIELD=reset VALUE=1
SET_TMC_FIELD STEPPER=stepper_y FIELD=uv_cp VALUE=1
SET_TMC_FIELD STEPPER=stepper_z FIELD=reset VALUE=1
SET_TMC_FIELD STEPPER=stepper_z FIELD=uv_cp VALUE=1
SET_TMC_FIELD STEPPER=extruder FIELD=reset VALUE=1
SET_TMC_FIELD STEPPER=extruder FIELD=uv_cp VALUE=1

Finally my machine is running on 35V@3A; these settings did the trick:

# from reddit
spi_speed: 1000000
driver_IHOLDDELAY: 6
driver_TPOWERDOWN: 10
driver_TBL: 2
driver_tpfd: 0
driver_pwm_autoscale: True
driver_pwm_autograd: True
driver_pwm_freq: 2
driver_PWM_GRAD: 0
driver_PWM_OFS: 0
driver_PWM_REG: 0
driver_PWM_LIM: 0

# 36V settings for TMC5160 taken from marlin
driver_TOFF: 5
driver_HEND: 5
driver_HSTRT: 3

Thanks a lot @alanbithell for the reddit link !!!

Nandox7 commented 4 years ago

What type of machine do you have? I had to drop run_current and hold_current down to 0.300 (running on 12v) or else the Z steppers would be so hot I couldn't touch them.

SK-StYleZ commented 4 years ago

It's a custom CoreXY with NEMA 23 motors. They do get warm at 3A, but I can still touch them. My smaller machine is running TMC2130s at 24V with 1.2A run_current and 0.75A hold_current (Z axis) without issues; the two motors are wired in series (I don't remember why I wired them in series).

Previously the 5160s were mounted on the same small machine without problems, running on 24V at max 1A (it is a small machine with no need for high torque).

Did you set RSense to 0.075 Ohm? => sense_resistor = 0.075

komandrik commented 4 years ago

Did you set RSense to 0.075 Ohm? => sense_resistor = 0.075

This value is set by default in the configuration. Why write it again?

Nandox7 commented 4 years ago

OK, thanks, with NEMA 23 that makes more sense. I believe sense_resistor = 0.075 is the default value; still, at some point I've set it as well.

This got me curious, so I did some testing using the settings below:
run_current: 0.500
hold_current: 0.400

Before: 12v PSU (+24v PSU for bed) > 60deg
After: 24v PSU for all > 31deg Celsius

Later I even raised the run_current to 0.600 without a problem. So feeding the drivers less voltage was making them run much hotter.

jffmichi commented 4 years ago

@SK-StYleZ thank you. As far as I can tell, setting tpfd=0 solved all problems with SpreadCycle. Of course some fine-tuning of the remaining parameters is still necessary but the driver doesn't shut down with uv_cp=1 any more. Writing to DRV_CONF still causes the described issues when operating in StealthChop mode but not when in SpreadCycle mode. Still weird... But when I don't write to DRV_CONF and set tpfd=0, it seems the driver is usable...

alanbithell commented 4 years ago

I spent hours last night going through every setting difference and came to the same conclusion. TPFD is the culprit.
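
For reference, here is a small Python sketch that pulls tpfd out of the raw CHOPCONF value seen in the dumps earlier in this thread. The bit offsets are my reading of the TMC5160 CHOPCONF layout (only the fields Klipper printed are listed), and they do reproduce the decoded values from the dumps; the helper names are made up for this example.

```python
# (start bit, width) for the CHOPCONF fields shown in the dumps.
# Offsets are an assumption based on the TMC5160 register map.
CHOPCONF_FIELDS = {
    "toff": (0, 4),
    "hstrt": (4, 3),
    "hend": (7, 4),
    "tbl": (15, 2),
    "tpfd": (20, 4),
    "mres": (24, 4),
    "intpol": (28, 1),
}

def decode_chopconf(value):
    """Extract the listed fields from a raw 32-bit CHOPCONF value."""
    return {name: (value >> start) & ((1 << width) - 1)
            for name, (start, width) in CHOPCONF_FIELDS.items()}

def set_field(value, name, new):
    """Return the register value with one field replaced."""
    start, width = CHOPCONF_FIELDS[name]
    mask = ((1 << width) - 1) << start
    return (value & ~mask) | ((new << start) & mask)

raw = 0x14410153                          # CHOPCONF from the failing setups
print(decode_chopconf(raw))               # tpfd decodes to 4
print(hex(set_field(raw, "tpfd", 0)))     # 0x14010153 with tpfd cleared
```

Both failing dumps carried tpfd=4 in CHOPCONF, which is the one field the working configuration explicitly sets to 0.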

SK-StYleZ commented 4 years ago

@jffmichi thanks for digging deeper! I've tested StealthChop too and I do get a driver error (not "uv_cp=1" or "ola/olb"), while SpreadCycle does a great job. Going to do some more testing.

Is there a guide on how to fine-tune?

robthide37 commented 4 years ago

I opened another issue about the default timing being backwards. I think there is a lot of misinterpretation. If you look up the SpreadCycle application note AN001 on the Trinamic website, it has a basic SpreadCycle tuning guideline, and I believe the Marlin defaults are not adding anything but simply letting us know what the driver does to the values internally. Also, hstrt cannot exceed 3, and hend can range from 1 to 15, but that means we only apply 0 to 14 because of the offset. Long story short, the defaults should be around hstrt 2 and hend 5; then you check low-speed movement and keep increasing hend until there is no more improvement. You increase toff until you hear the buzz from the PWM and then back off by one or two to push the switching frequency above 15 kHz, while keeping it below 40 kHz. Lastly, the total blank time needs to be based on the clock speed, so at 16 MHz it should be 1 or 2; they state that going too low will mess up the sine wave, so I don't recommend 1 unless you have a scope to check. Also, if things are still improving and you are at 15 total hysteresis, you can reduce toff by 1 to increase the frequency and start over with hend. It's best to read the guide, because I am sure I just made a confusing mess out of the instructions. I will post a link later; it's in the issue I posted about the default values being wrong for TMC steppers, if anyone needs it ASAP.