Closed VanessaE closed 3 years ago
This is the normal state. But, it also keeps TMC drivers from working at all.
This is not true.
I have several TMC-based SKR2 builds here that all work fine with DISABLE_DRIVER_SAFE_POWER_PROTECT commented out/disabled so the feature is enabled. I updated one on my bench to 682d6c9 and confirmed that it still works correctly with the latest bugfix-2.0.x code.
Have you tried replacing the board? Based on https://github.com/MarlinFirmware/Marlin/issues/22691, I suspect you may have damaged more than a single component.
I already talked to someone at BTT, they insist that the board is fine.
I already talked to someone at BTT, they insist that the board is fine.
I'd order a replacement. There is something not working correctly on your board.
But that doesn't make any sense - clearly Marlin's turning that MOSFET on and off. I can print, as long as that option is uncommented. Besides, this is a brand new board, only a couple of days old.
Besides, this is a brand new board, only a couple of days old.
Doesn't matter that a board is new out of the box. It's not working correctly. I suggest ordering a replacement.
I will do so, but let's leave this issue open.
I just updated my B1 SE & SE Plus (both running SKR2s) and this feature is working correctly.
I'm sure an exchange will get you sorted. 👍
There have been a bunch of TMC stepper drivers of late that were not cleaned properly. The flux residue causes all sorts of problems with them. Are your TMCs nice and clean and flux-free?
Please see also, https://github.com/bigtreetech/SKR-2/issues/63#issuecomment-915323480
@thisiskeithb as can be seen in that comment, exchanging for a new board did not fix my issue. The new hardware behaves identically to what it replaced.
Can you please confirm if you're using 2208's in UART mode on those SKR 2's?
I stand by my theory that the protection feature is causing drivers to ground themselves through some unintended internal path other than their primary ground rails.
@ellensp My drivers were and are clean, or certainly clean enough that I'd expect them to work normally.
Can you please confirm if you're using 2208's in UART mode on those SKR 2's?
I've used TMC2225s, TMC2208s, TMC2226s, and TMC2209s - all in UART mode and they work fine.
@thisiskeithb All I know is this thing just killed three more brand spanking new 2208's, on a brand new SKR 2 (i.e. the replacement), in exactly the same manner as on the first SKR board. That makes 8 dead driver modules. Fortunately they're cheap enough to replace.
Can you please try testing 01d1192a with my config (Marlin-configs-2021-09-08.zip) and at least 3 or 4 2208's?
@thisiskeithb All I know is this thing just killed three more brand spanking new 2208's, on a brand new SKR 2 (i.e. the replacement), in exactly the same manner as on the first SKR board. That makes 8 dead driver modules. Fortunately they're cheap enough to replace.
Can you please try testing 01d1192 with my config (Marlin-configs-2021-09-08.zip) and at least 3 or 4 2208's?
There must be something unique about your setup. I'm running SKR2 boards and have not had this issue. Remember that the high side and the low side are both switched so the only possible source would be the UART line and that would not likely have the power to cause any damage if the current tried to pass through the IC via an alternate path. It would likely just act as a high impedance with no power applied to the rest of the IC. I'm running SKR2s and have not had this issue at all.
@thisiskeithb All I know is this thing just killed three more brand spanking new 2208's, on a brand new SKR 2 (i.e. the replacement), in exactly the same manner as on the first SKR board. That makes 8 dead driver modules. Fortunately they're cheap enough to replace.
Can you please try testing 01d1192 with my config (Marlin-configs-2021-09-08.zip) and at least 3 or 4 2208's?
Maybe you could try something to remove as many variables as possible. Power the board outside of any machine with driver(s) inserted and a clean PSU. Check what happens on the bench and then move from there.
In any event, this isn't a Marlin bug.
This Issue Queue is for Marlin bug reports and development-related issues, and we prefer not to handle user-support questions here. (As noted on this page.) For best results getting help with configuration and troubleshooting, please use the following resources:
After seeking help from the community, if the consensus points to a bug in Marlin, then you should post a bug report.
Sucks to hear that!
Some possible causes I'm thinking of:
@looxonline
There must be something unique about your setup
That was my first thought before I started yelling in here, but with today's tests/failures, we're talking about using only a power supply (genuine Mean Well SE-600-24), a handful of run-of-the-mill TMC2208 driver modules, and a 12864 display module.
I even avoided plugging in the USB cable initially (I only connected it once it came time to look at M122).
I don't see how it can get any more generic than that.
I'm running SKR2 boards and have not had this issue.
But did you try the same combo of branch/commit, drivers, and motherboard version, with a build using my config files?
I'll share my copy of the source tree if you like, but I take no responsibility if my firmware.bin causes your board to fry a driver. :stuck_out_tongue:
Check what happens on the bench and then move from there.
That's effectively what I did.
Since the only 24v supply I have available to use for this is the one in the printer, the printer was the "bench" during these tests.
I'm not sure it matters, though, because at the end of that first round of testing, before I sent the first SKR v2 back, I reconfigured to replace the two definitely-dead 2208's with a couple of old A4988's, just to see what would happen. I figured if one of those dies, whoop-de-doo, I have several more.
That got the machine working enough to print a Benchy as a sanity check (but the remaining 2208's were on X/Y and were damaged as well, so the quality was mediocre).
Note that I was especially careful to avoid even the smallest change to the rest of the setup when I swapped those A4988's in. Fewest variables, and all that.
Remember that the high side and the low side are both switched so the only possible source would be the UART line and that would not likely have the power to cause any damage
One wouldn't think so, but here we are. Those three drivers crapped-out today without my even trying to move their respective axes. All I had to do was plug them in, turn on the power, and wait. And yes, I plugged them in properly.
@thisiskeithb
In any event, this isn't a Marlin bug
Either Marlin is misusing the SKR v2's hardware in such a way that even a Rev. B will burn-up TMC drivers under certain conditions, or the board has a major design flaw.
In either case, since the motherboard's design is basically set in stone, isn't it the job of the firmware to work-around the motherboard's flaws, or to at least throw errors at compile-time if there's a risky combo of settings?
Besides, since it could print once I swapped in those A4988's as mentioned above, that proves beyond any doubt that everything else about the machine works fine, and that I did not make any mistakes in the rest of the hardware setup (certainly nothing that should lead to a critical failure, otherwise I would think those A4988's would have burned-out too).
Also, didn't anyone look at my configs to double-check my work? If my configs are reasonable, and two brand new sets of hardware failed in the same ways, how can I point to anything BUT Marlin here?
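As a sketch of what the compile-time check asked about above might look like: the flags below are hypothetical stand-ins for values that would normally come from the configuration headers, and whether this combination is actually harmful is still unresolved in this thread, so treat it purely as an illustration of the mechanism.

```cpp
// Hypothetical build flags; in a real build these would be derived from the
// configuration rather than hard-coded here.
constexpr bool kHasSafePowerPin   = true;  // board switches driver power via a control pin
constexpr bool kSafePowerProtect  = true;  // protection feature left enabled
constexpr bool kHasTmcUartDrivers = true;  // at least one TMC driver in UART mode
constexpr bool kRiskAcknowledged  = true;  // user explicitly opted in

// True when the combination discussed in this thread is selected.
constexpr bool is_suspect_combo(bool safe_power_pin, bool protect, bool tmc_uart) {
  return safe_power_pin && protect && tmc_uart;
}

// Fails the build unless the user has acknowledged the suspect combination.
static_assert(
  !is_suspect_combo(kHasSafePowerPin, kSafePowerProtect, kHasTmcUartDrivers)
    || kRiskAcknowledged,
  "Safe power protect with TMC UART drivers is under investigation; "
  "acknowledge the risk to build this configuration.");
```

Setting `kRiskAcknowledged` to `false` with the other three flags left `true` would stop the build at the `static_assert`.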
@gzalo
stepper_driver_safety writes to some pins (as outputs) when the supply and ground to the steppers is effectively disconnected. Maybe some stray current is killing the digital part of the TMCs?
That's my theory as well.
the SAFE_POWER_PIN connects both the ground and vmotor using different methods. Maybe it has some us of slight delay, and thus Marlin could end up trying to send commands before everything has settled?
That's possible. I brought up a similar idea earlier.
are the drivers robust enough to be powered on without connecting them to the motors?
I would think so, but since I never tried to move any axes for this round of tests, the drivers shouldn't turn on yet anyway, and besides that, when I was working with the first board/drivers, I had the motors connected then.
I remember I burned a few TMC drivers when using a ramps and it supplied logic voltage without the motor power. Maybe something similar happens when connecting the board through USB?
Not possible with the SKR boards -- they have a jumper that feeds 5v/3.3v either from USB or 12/24v in, but never both at the same time. Either way, I never had that kind of thing happen on my old 2560/RAMPS, nor on SKR v1.1, v1.3, or v1.4. Sure, boards fail and whatnot, but I don't recall having ever burned out even one driver module since I started back in 2016, let alone this.
@thisiskeithb I'm sure you mean well, but please... stop with that "this is not a help forum" boilerplate. That kind of reply is NOT helpful at all.
I get it, this is a bug tracker (among other things), not a general forum, but that kind of reply is barely more than a "you're wrong, now go away" sort of response, and it assumes that you can't be wrong, that no one else "in the know" has a solution, and that I didn't do my due diligence.
In other words, it's written for n00bs, which I absolutely am not.
Plus, since when has it ever been proper to look for consensus outside of a bug tracker?
I'm wrong a lot, I'll admit, but to give that sort of reply after I wasted a bunch of my time and (less-so) money trying to make this work... to put it politely, I'm less than pleased.
Either Marlin is misusing the SKR v2's hardware in such a way that even a Rev. B will burn-up TMC drivers under certain conditions, or the board has a major design flaw.
I don’t know if you just had bad luck or have a very specific combination of firmware settings and physical hardware/wiring/etc., but if this feature did not work as intended, then there would be a lot more complaints in regards to TMC2225s/2208s on the SKR2 Rev B., since that combination is used in the Biqu B1 SE and SE Plus, not to mention all of the boards currently in use with TMC drivers.
Like I mentioned above, there are other places to seek help with your hardware issues.
very specific combination of firmware settings and physical hardware/wiring/etc
Again, it's just a power supply, drivers, and the display module. Two thick DC wires for the power and two ribbons for the display. That's it, unless you're counting the mains power cord and/or the USB cord.
And I shared my firmware settings. Don't want to look? Fine, at least let someone else have a crack at it, rather than brushing me off.
[...] there would be a lot more complaints in regards to TMC2225s/2208s on the SKR2 Rev B. [...]
Except there have been some complaints, for example:
https://github.com/bigtreetech/SKR-2/issues/63
https://www.reddit.com/r/MarlinFirmware/comments/ontmdo/tmc_connection_error_skr_2_tmc_2209/
https://www.reddit.com/r/BIGTREETECH/comments/o5hnr3/btt_skr_2_rev_b_tmc_2209_driver_backwards/
since that combination is used in [...]
And how many of them have the protection feature enabled, and are running in UART mode?
Like I mentioned above, there are other places to seek help with your hardware issues.
Again with the dismissive attitude.
What's to help? Marlin's doing something wrong here, and I have spelled it out in every possible way. I didn't invent the SKR v2 so it's not like I can fix the design.
It works with dusty old A4988's that I've had since the last ice age, but not with brand new TMC2208's that have barely dried out from the fab. How is this "my hardware issue", and not something Marlin is or isn't doing?
...and, I just re-checked:
if I uncomment DISABLE_DRIVER_SAFE_POWER_PROTECT and compile, I can put my last two 2208's (which weren't killed in the last rounds) on the board, and they seem to be willing to work - motion and UART seem right, spew from M122 is quick and the drivers both report good, and they only get a bit warm when driving their respective motors.
That is, their behavior seems to be completely normal with that line uncommented.
Marlin appears to be calling upon the TMCstepper code before it has done its anti-SNAFU checks and turned the power control circuit on.
I suggest enabling MARLIN_DEV_MODE for some additional log output during setup() so we can see the exact point in the startup procedure where certain things are done, and then perhaps you can try moving this block to different points within setup() to see if it makes any difference:
#if PIN_EXISTS(SAFE_POWER)
  #if HAS_DRIVER_SAFE_POWER_PROTECT
    SETUP_RUN(stepper_driver_backward_check());
  #else
    SETUP_LOG("SAFE_POWER");
    OUT_WRITE(SAFE_POWER_PIN, HIGH);
  #endif
#endif
I've examined your configs to see if anything stands out, and nothing seems problematic. I was curious about all three serial ports being in use, and wondering if any of those could be stepping on the TMC UART. You might try disabling the EEPROM_SETTINGS, MONITOR_DRIVER_STATUS, TMC_DEBUG, STEALTHCHOP_*, and HYBRID_THRESHOLD options as part of testing to see if any of those could be involved.
Of course, be careful with all these things in testing.
Have you checked the current from your PSU to make sure it is stable and correct? I'm sure the board can handle more than 24V but it can't hurt to double-check.
Also, try enabling PINS_DEBUGGING and running M43 to make sure we don't have any odd pin conflicts.
The important thing is to narrow this down and determine the most direct cause. I don't see anything that looks potentially harmful in stepper_driver_backward_check itself, although it does leave the *_ENABLE_PINs in INPUT state briefly, just until stepper.init() is called. I don't know if that could cause any trouble. The settings are loaded from EEPROM / defaults after the backward-check but before stepper.init(), so it would be good to make sure nothing in settings.load() could mess up TMC drivers either.
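To make the ordering described above easier to reason about, here is a toy model of it. This is only a sketch and not Marlin code: `BootModel`, `Snapshot`, and `simulate_boot` are made-up names, and the model assumes the backward check passes and leaves driver power switched on. It records the state of the enable pin after each step so the window where the pin is still an input is visible.

```cpp
#include <string>
#include <vector>

// Toy model only; none of these types exist in Marlin.
enum class PinMode { Input, Output };

struct Snapshot {
  std::string step;       // name of the startup step that just finished
  PinMode enable_pin;     // stands in for the *_ENABLE_PINs
  bool driver_power_on;   // stands in for the SAFE_POWER_PIN state
};

struct BootModel {
  PinMode enable_pin = PinMode::Input;
  bool driver_power_on = false;
  std::vector<Snapshot> history;

  void record(const std::string& step) {
    history.push_back({step, enable_pin, driver_power_on});
  }

  // Assumption: the check reads the enable pin as an input, passes,
  // and then switches driver power on.
  void backward_check() {
    enable_pin = PinMode::Input;
    driver_power_on = true;
    record("backward_check");
  }

  // Settings are loaded while the enable pin is still an input.
  void settings_load() { record("settings.load"); }

  // Only here does the enable pin become a driven output.
  void stepper_init() {
    enable_pin = PinMode::Output;
    record("stepper.init");
  }
};

// Runs the three steps in the order described in the comment above.
std::vector<Snapshot> simulate_boot() {
  BootModel m;
  m.backward_check();
  m.settings_load();
  m.stepper_init();
  return m.history;
}
```

In this model the second snapshot (`settings.load`) has driver power on while the enable pin is still an input, which is the window worth checking on real hardware.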
We can continue to troubleshoot over on Discord until we have a fix on the exact cause of your troubles.
perhaps you can try moving this block to different points within setup()
I'd try that and the other tests you mentioned, but I'm kinda low on TMC2208's, and this issue burns them out quickly (seconds to minutes), if it's allowed to happen.
I've examined your configs to see if anything stands out, and nothing seems problematic.
That's a relief.
I was curious about all three serial ports being in use, and wondering if any of those could be stepping on the TMC UART
I found that in BTT's default config, but I didn't put it into place until after the first drivers died. It was only for testing, but to be safe, I've disabled the third port.
I doubt there's connection there though, since TMC2208 UART is bit-banged over GPIO, one pin per driver slot, rather than using a µC-managed serial bus like 2130's do.
I suggest enabling MARLIN_DEV_MODE
Also, try enabling PINS_DEBUGGING and running M43
Both are now enabled. The result of the latter is:
Nothing jumps out at me.
Have you checked the current from your PSU to make sure it is stable and correct? I'm sure the board can handle more than 24V but it can't hurt to double-check.
While I'd need a clamp-on ammeter to check the current properly (my regular meter wouldn't handle the load), it does seem to be fine on the surface. Voltage is a steady 24v when under load -- last night I heated up the bed and hotend without trouble just for a test. Between those and the lighting, at the very least the PSU is solid at 400 or so watts (which is probably the most the machine can actually pull).
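For a rough sanity check without an ammeter, the current can be estimated from power and voltage (I = P / V). The sketch below uses the 24 V and roughly 400 W figures from this comment, and assumes the "600" in the SE-600-24 model name is a nominal 600 W rating; `current_amps` is just an illustrative helper.

```cpp
// Estimate DC current from power and voltage: I = P / V.
constexpr double current_amps(double watts, double volts) {
  return watts / volts;
}

// Figures taken from the comment above; the 600 W rating is an assumption
// read off the PSU model name, not a measured value.
constexpr double kSupplyVolts   = 24.0;
constexpr double kObservedWatts = 400.0;
constexpr double kNominalWatts  = 600.0;

constexpr double kObservedAmps = current_amps(kObservedWatts, kSupplyVolts);
constexpr double kNominalAmps  = current_amps(kNominalWatts, kSupplyVolts);
```

Under those assumptions the load works out to a little under 17 A against a nominal 25 A, so the supply would have headroom. That is only an estimate and says nothing about ripple or transients.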
although it does leave the *_ENABLE_PINs in INPUT state briefly, just until stepper.init() is called
I thought about this last night actually. While I struggled to understand the code (I don't know Marlin's codebase at all, and C/C++ is not exactly my forte), it is at the very least leaving all four used driver slots in that state before stepper.init() comes back around. I wonder if that isn't creating a current leak that gets exploited by the UART code when it goes to twiddle its respective lines?
I'd test, but again, low on 2208's.
it would be good to make sure nothing in settings.load() could mess up TMC drivers
If it helps, before putting drivers on the new board, I made a point to do M502, M500 just to make sure the EEPROM was clean (just in case manufacturer testing left something behind).
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Did you test the latest bugfix-2.0.x code?
Yes, and the problem still exists.
Bug Description
In Configuration_adv.h, one finds this section:
This is the normal state. But, it also keeps TMC drivers from working at all.
At first I couldn't figure out why, then I looked at the SKR 2 schematic today (my board is a Rev. B and matches it):
MOT_POWER is the control line from the µC, and PGND is the power supply ground rail.
Clearly, the driver power control MOSFET Q1 cuts the ground connection to the motor drivers when it's turned off. I mean, ALL of the driver modules' ground circuits, the lot of them, and not just whatever's needed for motor power (though I'm not sure if a TMC chip "splits" its grounding in this fashion, not that it matters).
No ground to the driver means unreliable UART comms, which means TMC connection errors (UART generally requires both +V and ground) or gibberish data being fed to the drivers.
It also means trying to turn a motor on can start to BURN UP the chip! I'm guessing here that the motor drive circuitry is trying to pull a ground reference through the rest of the driver chip from somewhere else inside it other than its primary ground rail. I am 100% certain that this is what killed three of my TMC2208 drivers.
In my not so humble opinion, this is a major hardware design flaw. However, that's out of Marlin's hands.
If I uncomment that line, the drivers work, UART works, etc., presumably because the ground connection is turned on by default in this case. However, my TMC2208's then can't be disabled - idle timeout and M84 do nothing on them (my two A4988's do turn off, however). I'm not sure why that is, but Marlin's doing it, as they start out turned off, and only turn on when Marlin finishes its boot logo animation.
My proposal:
Marlin appears to be calling upon the TMCstepper code before it has done its anti-SNAFU checks and turned the power control circuit on.
So, don't try to query the drivers, and definitely do not allow any STEP/DIR/EN output to a driver that doesn't report all good, until after the anti-SNAFU code is happy and the Q1 power control circuit has been turned on, and thus the chip has a valid ground reference.
If that means that some user can't print anymore because some other issue makes their drivers throw UART errors (that they've perhaps been ignoring), then so be it. They need to fix their hardware. If they're in a hurry, they could of course put those axes into A4988 or standalone mode.
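The proposal above can be sketched as a small gate that the startup code would consult before touching a driver. This is an illustration only, not how Marlin is structured: `DriverGate` and its members are hypothetical names, and the settle time after power-on is an assumed placeholder borrowed from the delay idea raised elsewhere in this thread.

```cpp
#include <cstdint>

// Hypothetical gate; nothing here is an existing Marlin or TMCStepper API.
struct DriverGate {
  bool checks_passed = false;        // anti-SNAFU (backward-plug) checks are happy
  bool power_on = false;             // Q1 power control circuit has been turned on
  std::uint32_t power_on_at_us = 0;  // timestamp of power-on, in microseconds
  std::uint32_t settle_us = 1000;    // assumed settle time; not a measured value

  void pass_checks() { checks_passed = true; }

  // Power may only be switched on once the checks have passed.
  void switch_power_on(std::uint32_t now_us) {
    if (!checks_passed) return;
    power_on = true;
    power_on_at_us = now_us;
  }

  // UART queries are allowed only with power on and the settle time elapsed.
  // Unsigned subtraction keeps this correct across timer wraparound.
  bool may_query(std::uint32_t now_us) const {
    return power_on && (now_us - power_on_at_us) >= settle_us;
  }

  // STEP/DIR/EN output additionally requires the driver to report all good.
  bool may_drive(std::uint32_t now_us, bool driver_reports_ok) const {
    return may_query(now_us) && driver_reports_ok;
  }
};
```

With a gate like this, a driver that throws UART errors would simply never be enabled, which matches the "then so be it" trade-off described above.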
Bug Timeline
Unknown timeline, but new to me since I'm working with new hardware.
Expected behavior
As described above.
Actual behavior
Bad UART comms causing TMC connection errors, high risk of driver burn-out.
Steps to Reproduce
No response
Version of Marlin Firmware
bugfix-2.0.x branch, commit 01d1192a
Printer model
Custom-built Hypercube
Electronics
brand new SKR v2, TMC2208's; no other electronics of significance
Add-ons
No add-ons of significance.
Bed Leveling
ABL Bilinear mesh
Your Slicer
Slic3r
Host Software
Pronterface
Additional information & file uploads
No response