Closed thespooniest closed 2 years ago
Using the same firmware build, does recompiling with Input Shaping disabled fix the issue?
Yes, it does.
On Tue, Oct 25, 2022, 1:45 AM Keith Bennett @.***> wrote:
Using the same firmware build, does recompiling with Input Shaping disabled fix the issue?
— Reply to this email directly, view it on GitHub https://github.com/MarlinFirmware/Marlin/issues/24927#issuecomment-1290017733, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANCHFOL2A5F4AOHP3TAWZDWE5XYRANCNFSM6AAAAAARNTF3XY . You are receiving this because you authored the thread.Message ID: @.***>
Hi, I had this problem and it was related to the lack of RAM. The end of compilation showed me 97% usage and it was not enough. Now it's up to 85%. Maybe a way forward and look carefully at the conditions regarding the input shaping #24797.
Already checked that from provided configs
Checking size .pio/build/STM32G0B1RE_btt/firmware.elf
Advanced Memory Usage is available via "PlatformIO Home > Project Inspect"
RAM: [== ] 15.2% (used 22420 bytes from 147456 bytes)
Flash: [===== ] 52.6% (used 275544 bytes from 524288 bytes)
I have same board but also have about 10 hours of prints with input shaping enabled and no resets seen. I do see the homing grinding into the edge when it starts close to edge or if I abort a print using menu but I see that with input shaping not compiled in as well and its not 100% reproducible.
The main difference I saw in configs is I do not have UBL enabled so it might help to isolate if you test with bilinear enabled.
Also, just an FYI. I noticed you had a typo in BLTOUCH_HS_MODE define so not sure how that is being treated during compile.
I do see the homing grinding into the edge when it starts close to edge or if I abort a print using menu but I see that with input shaping not compiled in as well and its not 100% reproducible.
@cbagwell and @thespooniest are you both using sensorless homing?
@thespooniest Is homing speed the same with and without IS enabled? And when the firmware resets, do you get a message saying "homing failed"?
@thespooniest What default_env do you have in platform.ini? And once we have the answer, @cbagwell do you have the same?
I am not using sensorless homing. As part of this process I did discover that I'd forgotten to remove the DIAG jumper on my mainboard (as you're supposed to do with the SKR Mini e3 v3), but removing it does not seem to have fixed the problem: the motor still grinds if homing starts too close to zero, and the board still resets after a second or two either way.
Homing speed seems to be the same with and without IS enabled, at least on X and Y. I don't know about Z, because the firmware resets before reaching that part. I don't get any message, either onscreen or through OctoPrint: the firmware just resets.
My default_env is default_envs = STM32G0B1RE_btt
I'm not using sensorless homing as well and also default_envs = STM32G0B1RE_btt.
My wild guess is if I disabled QUICK_HOME then it might resolve the homing issue (but I haven't taken time to test that idea yet). That's because It moves diagonal until it hits the X end stop then Y movement stops while X starts grinding. I haven't reviewed the code yet but I assume that means it has entered some state where its should be backing back up but is confused and maybe its that concurrent Y movement that is causing the confusion.
I see the reference BTT Mini E3 V3 config has QUICK_HOME commented out but I carried it over from my CrealityV422 config file. Maybe its commented out for a reason?
In my case, I eventually get a homing failed message but I do not see a reset and have to power cycle manually to recover.
But mine of course is just an inconvenience since an average home does work and I can use IS.
I tried disabling QUICK_HOME. It didn't solve the reset issue.
The motor grinding changes, but isn't fixed. If I start far from the zero point, the board resets a second or so after the homing starts, even though it hasn't hit the zero point, just like before. If I start a medium distance from the zero point, the head eventually hits zero and begins to grind, but the board resets a second or two after homing starts, again like before.
But if the head starts very close to zero, it will grind until it generates a "Homing Failed - Please Reset" error. This halts the printer, which prevents the automatic reset that the other situations have.
Possible clue. As I attempted to change the nozzle, I told the printer to move the extruder -10mm. This also caused a spontaneous reset. Perhaps it has something to do with input shaping while moving an axis into the negatives?
(And I also broke my thermistor, so I may not be able to do tests for myself for a couple days. New one arrives Thursday ).
I have a SKR Mini v3 coming my way, so will try to replicate.
But for now @cbagwell can you build with @thespooniest's exact config and see if it replicates for you?
I was able to reproduce the reboot on first try with attached config files. Still the biggest difference between my config is UBL is enabled so that would be my next guess to toggle off.
I'll try to do more testing in a few hours or at least upload a diff of my working config.
My printer ended up being used for prints so I wasn't able to debug... so I'll attach diff between my no-reset config and the reset config. In addition to UBL, here are some items that jumped out at me that are X/Y related:
Okay, here's a hypothesis: ADAPTIVE_STEP_SMOOTHING causes the stepper ISR to be called so often that when you add extra ISR calls for input shaping, the CPU spends all its time in ISRs. Consequently endstop logic is not processed and the watchdog timer never gets reset, resulting in a reboot.
What happens if you disable USE_WATCHDOG
in Configuration_adv.h? (Just temporarily for this test - be sure to switch it back on later!) If I am right you'll still get the grinding but not the reboot.
It will probably be tomorrow before I have access to my printer to test but I wanted to add some support to the hypothesis.
ADAPTIVE_STEP_SMOOTHING was recently disabled in config files for this specific board because that appeared to solve some cases with BLTOUCH failing. Others have found another solution to allow ADAPTIVE_STEP_SMOOTHING with BLTOUCH by increase interrupt priority of PWM timer (#24922 ) while others needed to disable both ADAPTIVE_STEP_SMOOTHING and ENDSTOP_INTERRUPTS_FEATURE (mentioned in #24351 ) for BLTOUCH stability.
The common theme in those solutions are to reduce or rearrange interrupts to solve bltouch issues so the board is likely already struggling with the interrupts it has and adding INPUT_SHAPING on top of ADAPTIVE_STEP_SMOOTH may be pushing it into watchdog territory.
I bought a MAX7219 at same time as this board specifically to use with your profiling code with it but it's currently collecting dust in a corner. That would likely be useful info in this case so I'll try to make that my next project.
Oh a MAX7219 would really help. I am glad you have one.
I was able to run some more tests and you where right.
So now its clear that enabling both ADAPTIVE_STEP_SMOOTH and INPUT_SHAPING causes the board to spend to much time in interrupts.
I think I may have also gotten some clues that my home grinding even when I have both ADAPTIVE_STEP_SMOOTH and INPUT_SHAPING disabled could be from lost interrupts. Perhaps there is a burst of interrupts right when you initiate a home sequence and could explain why it often has issues when you start the home really close to zero.
Excellent. Now with both INPUT_SHAPING and ADAPTIVE_STEP_SMOOTING, is the problem resolved by adding
TERN_(INPUT_SHAPING, max_rate <<= 1);
That did indeed allow me to home X/Y without rebooting! Sample size of 1 though.
It halves the ASS ISR rate when IS is enabled. I think we may have solved this one.
No sooner did I replace the thermistor than I broke the cable for my BLTouch. But I was able to confirm that the motors don't grind and the firmware doesn't reset, so I think this is fixed.
I'm chasing down some other things at present but in due course I'll get this or a similar fix into my is_improvements branch.
Thanks for reporting it and helping find the bug.
I think there is probably another bug related to ADAPTIVE_STEP_SMOOTHING: the replay buffer is too small if ASS actually engages (and that depends on your mainbard speed) when printing or moving at high speed. This will result in some crude speed limiting kicking in in the stepper module which will slow your print and if you also have LINEAR_ADVANCE enabled, it will not know about the speed change and will overextrude.
I was browsing around stepper.cpp looking for other things that might be contributing to ISR running to much.
I wonder how accurate ISR_S_CURVE_CYCLES needs to be?
The STM32G0 is a cortex-m0 and missing the umull instruction so it's falling back to the unoptimized logic. ISR_S_CURVE_CYCLES is hard coded to the 40-ish cycles of the optimized version though and I think this board is running closer to 200 cycles.
I looked at the assembly generated and the 3 64-bit multiplications are causing _aeabi_lmul() to be called and that function is 40 instructions. I've seen versions of that function done in 25 instructions so probably room for long term improvement.
Short term I'm wondering if its worth updating the define to use a higher value for this board?
I think that makes sense (as a separate PR!). Make it depend on the same condition, i.e. the opposite of #if (defined(__arm__) || defined(__thumb__)) && __ARM_ARCH >= 6 && !defined(STM32G0B1xx)
I retested with is_improvements branch and INPUT_SHAPING+ADAPTIVE_STEP_SMOOTHING+S_CURVE_ACCELERATION and unfortunately homing still was not working. It was still grinding into end stop and appears to reboot from watchdog after a couple of seconds.
I even tested with and without my update to scurves cycles from 40 to 500; hoping that would back off the CPU usage enough; but no difference.
I did finally hook up my MAX7219 and used it during this time. It appears that the instant the homing started that the display also stopped being updated (7 out of 8 LED's light on profiling and no LED's changed and the alive led stopped blinking).
Once I reinstall without ADAPTIVE_STEP_SMOOTHING then things go back to normal. It stays very idle during homing (7 out of 8) but I get hints of some kind of bursty operations going on (at very end of Z home, I get a blip down to only 1 led for example).
In case it helps, here is value of ISR_EXECUTION_CYCLES for these boards (with my +500 cycles for SCURVE instead of +40). I'm still trying to understand the usage of the CPU frequency divided by some fixed # since it will result in a smaller cycle count the slower your CPU. I'd think the value would be used to multiple against some fixed cycle count and not simply added in.
Marlin/src/feature/../module/stepper.h:242:1: note: #pragma message: ISR_EXECUTION_CYCLES: (((792UL + 500UL + (((4UL + _MAX(_MAX(uint32_t((64000000L) / (5000000)), ((64000000L) / 500000UL) * (1UL)), (0UL + 16UL + 16UL + 16UL + 16UL))) * (1 + 1 + 1))) * (1) + 64UL + _MAX(_MAX(uint32_t((64000000L) / (5000000)), ((64000000L) / 500000UL) * (1UL)), 16UL))) / (1))
I now have an SKR Mini V3 so I can play around with this myself. How did you connect your MAX7219, i.e. which connector and what config?
There is a set of pins labelled SPI1 you can get 5V and GND from but the remaining pins are also used by SD card so you can't use them freely.
There is also a row of 5 pins labeled I/O that are free to use and I used the 3 closest to SPI1 power pins.
#define MAX7219_CLK_PIN PD3
#define MAX7219_DIN_PIN PD5
#define MAX7219_LOAD_PIN PD4
Okay, I have Marlin on my BTT SKR Mini E3 V3.0 with INPUT_SHAPING, ADAPTIVE_STEP_SMOOTHING and S_CURVE_ACCELERATION and it runs fine with G28. My MAX7219 is connected and shows that the processor is fairly quiet during homing.
What processor does your board have? Mine has the STM32G0B1RET6.
I do not have a UI. I wonder whether that might make a difference.
Could you post your config?
Interesting. Original reporting and I both have Ender 3 Pro's so using that style UI.
The original poster posted a config zip at top and here is link to repo I keep my config in as well. My default config has ADAPTIVE_STEP_SMOOTHING disabled so please enable if you start from that; but it does also have MAX7219 enabled with above pins.
One other thing I didn't have compiled was a probe. So I plugged in my new CRTouch and it burnt! Damned Z probe connector on the BTT was backwards. So dispiriting.
Frustrating! I would be happy to donate a new one as thanks for all the new features.
So it turns out the probe was not related to the problem anyway. But the display is, sort of. When I disable CR10_STOCKDISPLAY homing no longer triggers the watchdog timer. The stepper ISR is taking pretty much all the processor time and just updating the display pushes it over the edge.
I have also established that both your changes and mine together change MIN_STEP_ISR_FREQUENCY from 27681 to 17021 which is less than the factor of 2 that is needed to give the CPU spare cycles. Even when I force ASS to run at half the frequency, by moving CodeProfiler from idle()
MarlinCore.cpp to Stepper::isr()
I can see that too much CPU is being used by the ISR.
So the ASS calculation is far too inaccurate for the STM32G0B1RE. I wonder how much Stepper::calc_timer_interval() contributes to this. It calls __aeabi_uidiv()
, I think, from looking at the assembly whereas with a STM32F103RE, there's just a udiv
.
I would be happy to donate a new one as thanks for all the new features.
That's very kind, Chris, but it was already a spare which was donated to me and I was happy to have it on that basis for testing purposes. My actual printer uses a different sensor so I don't really need it for myself. It's just frustrating to have a perfectly good piece of hardware destroyed by one incompetent component placement in another piece of hardware.
@cbagwell I have measured __aeabi_uidiv()
at about 136 cpu cycles vs udiv
which takes up to 12 (I think from a brief google). So can you add about 120 for this to #24955?
Also, I measure the call to the SCURVE logic at 544 cycles. So that's another 44.
I am still measuring but there is still quite a lot more time taken in the ISR than is being counted.
Will do.
__aeabi_uidiv()
is showing up as __udivsi3
in objdump. Let me stare at the code calling that for a bit and find a good place to account for those extra cycles.
I also see some calls to __aeabi_ldivmod
in IS paths and those look like they could be at least as expensive as __aeabi_uidiv()
.
I also see some calls to
__aeabi_ldivmod
in IS paths and those look like they could be at least as expensive as__aeabi_uidiv()
.
I suspect that will be
TERN_(HAS_SHAPING_Y, advance_dividend.y = ((steps << 29) + shaping_y.remainder) / step_event_count);
and the same for X. I have a change in the pipeline that will eliminate this code. Just need to resolve the SKR Mini and TMC2208 issues first.
PS these divisions only happens once per segment so I am not overly concerned about them.
Hmm, pulse_phase_isr()
takes quite a lot more time than is planned for. ISR_STEPPER_CYCLES
is 16 but I measure about 90 cycles per axis.
I suspect DelayQueue::peek_x()
and DelayQueue::peek_y()
are also quite expensive.
SCURVE, on the other hand, has a much lesser effect because it only occurs during accel and decl ramps.
And it turns out using a float as a boolean is really expensive.
I'll fix some of these and see where we end up.
Hopefully, this patch will one day land in gcc to be better optimized for cortex-m0: https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg294320.html
We are just checking off each of those functions one-by-one in this ISR. It's still a long, long way off until that would be visible in platformio's gcc though.
It, for example, optimized __aeabi_lmul
from 40-ish cycles down to 30 cycles. __aeabi_uidiv
its still between 72 and 385 cycles so may not be any better than today.
@cbagwell have a look at https://github.com/tombrazier/Marlin/tree/scurve_stm32g which prompts the compiler to use MULS
where possible in the SCURVE fallback function rather than calling __aeabi_lmul
. I think it's right but it needs testing.
Clever. I was concentrating on how to optimize the original assembly and how small I could make a 32x32=64. I didn't consider optimizing the C path and taking advantage of the 32bit shift. Probably by using assembly, we can also use STM32G0's 32x32=32 instruction and feed it the 16bit values to reduce a bit more.
I'll give it a test.
I did not have as much free time as I hoped today so I tested it as-is. I rebased my config branch to latest bugfix-2.1.x then added is_improvements+scruve_stm32g and my scurve-cycles-cortexm0 branch. I also enabled ADAPTIVE_STEP_SMOOTHING+S_CURVE_ACCELERATION+INPUT_SHAPING.
The good news is that I could successfully home with no reboots so I think we are on the right track with increased cycle counts and improving scurve speeds. I notice the MAX7219 has what looks like a 1.5 second freeze right when I initiate homing and also again after X/Y is done/moves to center of bed and is getting ready to lower the probe. Might be something to investigate deeper later on as the freeze might explain some probe failures.
I also successfully printed a hollow calibration cube I use for flow tests. It prints slooow because of Cura defaulting to nothing faster than 10 seconds a layer.
I then printed the ringing print from input shaping pull request (I did all the stuff to get full speed). This is the bad news. That print had some issues. The print would freeze every once in a while for like 60 seconds and then just resume on its own for unknown reasons. I also had severe under extrusion.
I need to do some troubleshooting tomorrow to narrow down the cause since this wasn't a small chance since my last successful print.
I reverted the scurve_stm32g change and the occasional 60 second pauses went away.
I still have under extrusion on my ringing tower at 90mm/s but I'm now thinking that might be related to my extruder stepper motor going bad earlier this month and I replaced it with a backup stepper+extruder that I've never been happy with and likely needs more tuning.
I'll switch back to testing at normal speeds until I get that worked out.
Shouldn't this:
#define MUL_F_T do { \
f = uint16_t(f >> 16) * uint16_t(t >> 16) + \
((uint16_t(f >> 16) * uint16_t(t)) >> 16) + \
((uint16_t(f) * uint16_t(t >> 16)) >> 16); \
}while(0)
be this (high part shifted back up to high part)?
#define MUL_F_T do { \
f = ((uint16_t(f >> 16) * uint16_t(t >> 16) << 16) + \
(uint16_t(f >> 16) * uint16_t(t) + \
(uint16_t(f) * uint16_t(t >> 16); \
}while(0)
I don't think so. You need 2 16 bits shifts to make up the 32 bit shift. I have not tested this logic to see that it gives the same answers. It should be almost exactly the same with an additional tiny rounding error.
I see where I went wrong. I was thinking about doing a 32bitx32bit=64 followed by >> 32 like this:
(((uint16_t(f >> 16) * (uint16_t(t >> 16) << 32) + \
(((uint16_t(f >> 16) * uint16_t(t) + \
(uint16_t(f) * uint16_t(t >> 16)) << 16) + \
(uint16_t(f) * uint16(t))) >> 32;
and since we know we are going to do that shift why not skip the full 32-bit shift in first place. I somehow translated that to still doing a 16 bit shift. For what I was thinking to work, it would reduce to this:
(uint16_t(f >> 16) * (uint16_t(t >> 16)
In my quick gcc test on Intel, all 3 versions do not match original (unit64_t(f) * t) >> 32 as close as I'd assumed they would. So I'll play around a bit more to understand why.
My first version throws away the product uint16_t(f) * uint16_t(t)
and saves one MULS
. This introduces a small rounding error which is of the order of 2^-31 when the 32 bit number is converted back to the range 0 to 1.
I got a new stepper motor and now my under extrusions are gone. I'm also sure I did something wrong that caused the long pauses so I started fresh with latest is_improvements+scurve_stm32g branches and my scurve-cycles-cortexm0 branch. I also enabled ADAPTIVE_STEP_SMOOTHING+S_CURVE_ACCELERATION+INPUT_SHAPING.
I do not normally have ASS enabled but this was to stress the board while doing IS ringing tower tests. This test run was the best looking and most stable print of a ringing tower to date on this board.
This board has some homing issues that pre-date IS being merged in (it would sometimes grind into an end stop and that was with ASS disabled). With current is_improvements, I am now unable to reproduce those grinding issues and it used to be pretty easy to reproduce.
And finally, I reviewed the assembly from scurve_stm32g and it dropped from 500 cycles down to 257 cycles. Nice reduction. There are 3 64-bit multiplications left and they are accounting for 170 of those remaining cycles.
This was fixed by #24951 .
Did you test the latest
bugfix-2.1.x
code?Yes, and the problem still exists.
Bug Description
The firmware resets early in the homing process. If the axes started near their zero point, the firmware actually acts as though the switches weren't there, continuing to grind into the zero point until the firmware resets. But if the axes start far from their zero points, the firmware will reset anyway, a few seconds after they begin to move, even if they haven't hit their endstops yet.
Bug Timeline
This started on 10/24/2022, not long after Input Shaping merged in.
Expected behavior
I expected the homing process to complete normally.
Actual behavior
If the axes start close to the zero point, they will grind against the endstops for a second or two before the firmware resets. If the axes start far from the zero point, the firmware still resets a second or two after movement begins, even if they have not reached the zero point.
Steps to Reproduce
Version of Marlin Firmware
bugfix-2.1.x, 2022-10-24
Printer model
Creality Ender 3 Pro
Electronics
BigTreeTech SKR Mini e3 v3
Add-ons
BLTouch 3.1
Bed Leveling
UBL Bilinear mesh
Your Slicer
Cura
Host Software
OctoPrint
Don't forget to include
Configuration.h
andConfiguration_adv.h
.Additional information & file uploads
Marlin.zip