repetier / Repetier-Firmware

Firmware for Arduino based RepRap 3D printer.

0.92-work: Homing too slow and other homing errors on high rez delta #313

Open kyrreaa opened 10 years ago

kyrreaa commented 10 years ago

Seems like something is off on the high rez delta now. 400 steps/turn + 16 microsteps on 415mm long travel gives homing that gives up halfway. It also switches to the slow homing pass too soon. When it does not reach any endstop it ends the move and sets the home position regardless. The homing speed is also not what I set, either in eeprom or in firmware.

Multiple issues here... Probably all related. Eeprom homing feedrate is set to 100mm/s but actual rate is much much lower. Re-home speed is set to 1/3.

kyrreaa commented 10 years ago

Found the problem for the homing issue. In the config file the speed for motion is in mm/s but homing is in mm/min. In the Repetier-Host eeprom editor the unit is listed as mm/s, but the value is still interpreted as mm/min in the firmware. Either mm/s should be used consistently, which means updating the firmware, or Repetier-Host needs a tweak.
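To put a number on the mismatch: a value typed in as 100 mm/s but read as mm/min ends up 60 times slower. A minimal sketch, assuming nothing about the firmware beyond the unit mix-up described above (the helper and constant names are made up for illustration):

```cpp
// Hypothetical helper for illustration only; not a function from the firmware.
// Converts a feedrate given in mm/min to mm/s.
constexpr double mmPerMinToMmPerSec(double mmPerMin) {
    return mmPerMin / 60.0;
}

// The value 100 was entered with mm/s in mind. If the same number is
// read as mm/min, the carriage actually moves at 100 / 60 mm/s.
constexpr double kEnteredFeedrate = 100.0;
constexpr double kActualMmPerSec = mmPerMinToMmPerSec(kEnteredFeedrate);
```

Here kActualMmPerSec comes out at roughly 1.67 mm/s, which fits the "much much lower" homing rate seen on the printer.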

kyrreaa commented 10 years ago

Whoops! Homing on the high rez printer still doesn't do a long enough travel if homed from close to the glass! It will move up and then switch to a slower move before stopping. It then registers homing without ever finding an endstop. The result is a printer that "thinks" it's homed but is far from it. Head-crash follows on start of job...

kkronyak commented 10 years ago

I have been doing some testing with this and similar issues -- it appears to be due to a potential overflow on certain operations using 16-bit integers. I haven't narrowed the exact code down yet, but I've found that if I try to perform a move which results in more than 65535 steps I end up getting some odd behaviors. My machine is running 0.9° motors at 1/32 microstepping, so I am at 400 steps / mm. A short-term work-around would be to reduce the microstep setting. If converting the int16 to int32 is undesirable as a standard code feature (I realize memory on AVR is very limited), maybe a config option to determine integer size would be useful.
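For a feel of how little travel fits into a 16-bit step counter, here is a small sketch. It only assumes that the step count of one move is held in an unsigned 16-bit value, as suspected above; the function name is made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: longest move, in mm, whose step count still fits
// into an unsigned 16-bit integer (maximum 65535 steps).
constexpr double maxTravelMm16Bit(double stepsPerMm) {
    return static_cast<double>(UINT16_MAX) / stepsPerMm;
}

// At the 400 steps/mm mentioned above, this is about 163.8 mm.
constexpr double kMaxTravelAt400 = maxTravelMm16Bit(400.0);
```

Any single move longer than that would wrap around in a 16-bit counter, which is consistent with the odd behavior on long moves.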

kyrreaa commented 10 years ago

I found this a few days ago.

I posted it in the thread I referenced.

Copy/paste:

Found it! I tried printing out virtual_axis_move and it was wrong too!

The line a bit up:

int32_t virtual_axis_move = maxDeltaStep * segmentsPerLine;

It calculates with int16 and int16, but has no cast. To fix, change it to:

int32_t virtual_axis_move = (uint32_t)maxDeltaStep * segmentsPerLine;

This also fixes p->stepsRemaining.

This refers to repetier's post:

I think it is line 1665 in motion.cpp:

p->stepsRemaining = p->numPrimaryStepPerSegment * segmentsPerLine;

It results in 20206 virtual steps, but should be 85741 in your case. So it was handled as a 16 bit multiplication, even though numPrimaryStepPerSegment is 32 bit, which normally should lead to a 32 bit result, especially since stepsRemaining is also 32 bit. So maybe it is even a compiler bug that it did not convert to 32 bit.

Can you try:

p->stepsRemaining = p->numPrimaryStepPerSegment * (int32_t)segmentsPerLine;

which hopefully returns 85741 as virtual axis steps. The 20206 steps actually used / 140 = 144mm, which matches your description very well. So now we have to learn how to return a 32 bit result.

As you can see...

It is an integer overflow causing both the wrong speed to be used and the wrong distance to be moved. It does however not fix the fact that some parameters are in mm/s and others in mm/min, and the Repetier-Host interface is not always correct in its eeprom descriptions. It may get its descriptions from the firmware for all I know, in which case the firmware is wrong too. Anyway, my fix is confirmed to solve both the short homing travel and the wrong homing speed.
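The overflow can be imitated on a desktop compiler. There int is 32 bits wide, so the bug does not show up by itself; in this sketch the cast to uint16_t stands in for the 16-bit int of the AVR. The two operand values are hypothetical, picked only because their product is the 85741 quoted above, and the function names are made up for illustration.

```cpp
#include <cstdint>

// Imitates a multiplication carried out in 16 bits (as on AVR, where
// int is 16 bits wide): only the low 16 bits of the product survive.
constexpr uint16_t mul16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>(static_cast<uint32_t>(a) * b);
}

// Widening one operand before multiplying makes the whole
// multiplication 32-bit, which is what the added cast achieves.
constexpr int32_t mul32(int16_t a, int16_t b) {
    return static_cast<int32_t>(a) * b;
}

// Hypothetical operands, chosen only so that the product is 85741.
constexpr int16_t kStepsPerSegment = 479;
constexpr int16_t kSegmentsPerLine = 179;

constexpr uint16_t kTruncated = mul16(kStepsPerSegment, kSegmentsPerLine);
constexpr int32_t kCorrect = mul32(kStepsPerSegment, kSegmentsPerLine);
```

kCorrect is 85741, while kTruncated is 20205 (85741 minus 65536), which is within one step of the 20206 reported in the thread.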

kkronyak commented 10 years ago

I wanted to provide a verification that the above fix did in fact resolve the issue. I had some other issues with my configuration but I believe it was due to going from 0.91 to 0.92 without clearing EEPROM, as the same issues reoccurred on 0.91 when reverting back, and clearing EEPROM resolved them. Other users experiencing issues going from 0.91 or earlier to 0.92 may want to ensure the eeprom is cleared before performing any testing.

sigma957 commented 9 years ago

Hi,

This is my first attempt to contribute here. First, let me share how excited I am about this project and its community. Great to have you around, guys :)

I have a high rez delta with 0.92. Hardware setup is Due+RADDS and 400 steps/turn with 128 microsteps on all axes. What I am observing is that homing along 300mm height manages to trigger the watchdog. This behavior began to manifest itself when I started using the new 0.92 profile for Arduino IDE, which enables the watchdog properly. Meddling with stepper high delay definitely affects the behavior (0 saves time and triggers occasionally here, while a value of 2 makes it hit the problem all the time). I have tried a number of remedies, went all over the motion calculations as suggested above but did not track the issue there. Tried to re-implement everything with floats, enabled/disabled 64-bit support, did not help. IMO there is no visible issue with overflows in the motion calculations anymore.

My current solution is to invoke HAL::pingWatchdog() in Printer::endXYZSteps(). This seems to work fine and avoids excessive pinging during movements. I tried to find a cleaner place, e.g. delayMicroseconds, but it generated a bunch of other problems like LCD timing, timing not exact (due to C code, etc.), so I decided to follow repetier's pattern and not call pingWatchdog in the microsecond-timing routine.

Two questions:

  1. Is this behavior normal for my setup or do I need to re-configure my double/quad step parameters in order to avoid the watchdog on homing (step doubler frequency is set to 90,000)? As I see it currently, there simply isn't enough time to avoid the 4 sec watchdog interval in this scenario. Making the watchdog interval larger (1024u=8sec) also works but is not a safe-enough solution for me - it did trigger once or twice.
  2. Is this fix valid or have I gone totally the wrong way?

Any comments will be appreciated. Many thanks in advance.

P.S. repetier, thank you for this excellent firmware. Kudos.

kyrreaa commented 9 years ago

I think the problem is simply that the mcu is using all its available time in step interrupts and not executing many main loop commands at all. This causes the watchdog to be triggered on these high steprate long moves. I’ve seen the issue on 400 steps/turn motors with only 1/32 microstep using a velocity of around 200mm/s, so I can imagine only a very low velocity is needed to trigger it at 1/128.


sigma957 commented 9 years ago

Yes, I did reproduce it with 32 microsteps. I came to a somewhat similar conclusion, but instead of looking for a way to execute some main loop code that will ping the watchdog I decided to do it directly, in order to insert minimal delay in the movement handlers.

I am not that familiar with the firmware that's why I asked if my fix is the best approach to take. This is bothering me big time and I would love to see it fixed, be it my way or some other way.

repetier commented 9 years ago

First, endXYZSteps is one of the most frequently called functions. It is called after every step, so it is total overkill. Normally the watchdog is pinged from the temperature manager, which should be called 10 times per second. Given that the watchdog needs a ping every 4 seconds, that should be no problem.

Homing has a wait for the end of move, which calls

void Commands::checkForPeriodicalActions(bool allowNewMoves)

while it waits and that calls Extruder::manageTemperatures();

If I take my 80 steps/mm at 1/16 microstepping and multiply by 16, I get 1280 steps/mm. I assume this is what you have, give or take the motor pulley diameter. For me that is total overkill, but I guess it will move very smoothly. That allows you 70mm/s. With bad luck, that can give you 4.3 seconds during which no ping occurs.
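The 70mm/s and 4.3 seconds are not derived in the comment. One reading that reproduces them, and this is an assumption, is a step rate of about 90 kHz (the doubler frequency mentioned earlier) together with the 300mm homing height. A small sketch of that arithmetic, with made-up names:

```cpp
// Hypothetical helpers for illustration; not firmware functions.
constexpr double maxFeedrateMmPerSec(double stepRateHz, double stepsPerMm) {
    return stepRateHz / stepsPerMm;
}

constexpr double moveDurationSec(double distanceMm, double feedrateMmPerSec) {
    return distanceMm / feedrateMmPerSec;
}

// 80 steps/mm at 1/16 microstepping, multiplied by 16.
constexpr double kAssumedStepsPerMm = 80.0 * 16.0;
// Assumed 90 kHz step rate: about 70 mm/s.
constexpr double kAssumedFeedrate = maxFeedrateMmPerSec(90000.0, kAssumedStepsPerMm);
// Assumed 300 mm homing travel at that feedrate: about 4.3 s.
constexpr double kAssumedHomingTime = moveDurationSec(300.0, kAssumedFeedrate);
```

Under these assumptions a full-height homing move takes slightly longer than the 4 second watchdog interval, so a move during which no ping gets through would be enough to trip it.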

The wait will call checkForPeriodicalActions, and only if executePeriodical is 0 will it not ping the watchdog. That gets set in HAL.cpp inside PWM_TIMER_VECTOR every 390 calls = 0.1s.

I do not see why it is not called in time if you use safe speeds. 90-95kHz should work if you have no big stepper high delay. Otherwise you might indeed block all CPU power, allowing no execution of relevant timers/main thread. So the first try is to reduce homing speed to 50. The second try is to add a ping into void Commands::waitUntilEndOfAllMoves(), which is way better than endXYZSteps.

kyrreaa commented 9 years ago

The clue here is safe speeds. I have my machine limited to safe speed for my 140 steps/mm setup, but I also use 20 MHz clock instead of 16 (overclocked) to give me more headroom. I limit motion to 250mm/s which gives a maximum of 35000 steps/sec. Using quad-stepping this is barely doable on the Mega2560 based platform.

At 400 steps/turn and 128 microsteps, assuming normal pulleys of 9 teeth and 5.08 pitch, you get 1119.86 steps/mm. The comparable limit to my printer, assuming overclock, would then be 31mm/s. Anything higher will be jittery and possibly cause timeouts, even with overclock.
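The numbers above can be re-derived in a few lines. This is only a sketch of the arithmetic; the function and constant names are made up for illustration.

```cpp
// Hypothetical helper: steps per mm of belt travel for a given motor,
// microstepping setting and pulley.
constexpr double stepsPerMm(double fullStepsPerTurn, double microsteps,
                            double pulleyTeeth, double beltPitchMm) {
    return fullStepsPerTurn * microsteps / (pulleyTeeth * beltPitchMm);
}

// 400 steps/turn, 1/128 microstepping, 9 tooth pulley, 5.08 mm pitch.
constexpr double kHighResStepsPerMm = stepsPerMm(400.0, 128.0, 9.0, 5.08);

// Step budget of the reference printer: 250 mm/s at 140 steps/mm.
constexpr double kStepBudgetHz = 250.0 * 140.0;

// Feedrate at which the high-res setup needs the same step rate.
constexpr double kComparableFeedrate = kStepBudgetHz / kHighResStepsPerMm;
```

kHighResStepsPerMm evaluates to about 1119.86 and kComparableFeedrate to about 31.25 mm/s, matching the 31mm/s limit given above.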

Optionally a Due platform may help.

I agree that calling some of these checks too often is bad, but code to check whether such a check should be performed is also costly. Re-visiting home after homing (at slower rate) makes little sense if the checks are done equally often though. A fast switch between every N'th step vs always could work for normal travel vs homing?

Kyrre


sigma957 commented 9 years ago

Thank you, kyrreaa. I am indeed using Due, that's what this is all about. If it was a Mega2560, I wouldn't bother at all with such high speeds. Seems like I've been too greedy :)

repetier, my homing speed is indeed too high (350). You are right, I see now that this scenario is an overkill. Seems like I've pushed too far in my attempt to reach machine limits. Thank you for the advice, I should've done my math properly in the first place. I will start all over from 50mm/s and let you know about the results. Hopefully it won't come to pinging the watchdog again.

Thanks once again.

Cheers.

sigma957 commented 9 years ago

Hi again. I experimented a bit with feedrates and doubler frequency and managed to configure the firmware so that the watchdog is not triggered at 300mm/s max. Doubler is set to 10KHz, stepper high delay to 0 (my Silencioso drivers seem to work ok without it). I guess this means that my Due is in the same trouble as an AVR processor with 32 microsteps, right?

repetier, for the record, the fix you suggested did not help at unsafe velocities. Therefore I decided to take the wisest course and stop meddling with the watchdog. I consider my current configuration as pretty stable except the max feedrate, which I will limit to 150mm/s, maybe even less.

Can someone please explain how to hit the sweet spot between doubler frequency and max feedrate? I took a look at the code and it seems to me that double/quad steps are not associated with any drawbacks, am I right? Are there cases where I should prefer lower vs higher doubler frequency? Can double/quad steps produce positioning inaccuracies vs ordinary steps?

And last a theoretical question - I wonder what was the reason (in historical terms) to introduce double and quad steps at all? If it is what I suspect - will introducing octa-steps help to keep doubler frequency at 90khz for the Due as you suggested in another issue without compromising top feedrates?

Last, one strange quirk worth reporting. There were a few cases where the watchdog was triggered only when executing G28, but G1 worked ok (both moving end-to-end along the Z axis). I don't remember the exact config, so it may be my fault somehow, but I still need to ask - is there such a big difference in the execution of G28 as opposed to G1? I have always assumed that G28 is executed internally the same way as G1 is?

Thanks in advance for your answers.

sigma957 commented 9 years ago

Please read 200mm/s instead of 300mm/s. My typo, sorry.

repetier commented 9 years ago

Double and quad stepping was introduced because the timer interrupt overhead is so high that speeds higher than 12kHz were not possible with AVR. So what we do is, if speed is high, we combine multiple steps in one interrupt, so the interrupt count is still < 12kHz but the stepper frequency is 40kHz and more.
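The idea can be sketched as a simple threshold rule. This is not the firmware's actual code: the function names are made up, and the real thresholds depend on the step doubler frequency setting. It only shows how grouping steps keeps the interrupt rate under a limit.

```cpp
// Hypothetical sketch: choose 1, 2 or 4 steps per interrupt so that the
// interrupt rate stays at or below maxInterruptHz where possible.
constexpr int stepsPerInterrupt(double stepRateHz, double maxInterruptHz) {
    return stepRateHz <= maxInterruptHz ? 1
         : stepRateHz <= 2.0 * maxInterruptHz ? 2
         : 4;
}

// Resulting interrupt rate for a given step rate.
constexpr double interruptRateHz(double stepRateHz, double maxInterruptHz) {
    return stepRateHz / stepsPerInterrupt(stepRateHz, maxInterruptHz);
}
```

With a 12 kHz limit, a 40 kHz step rate selects quad stepping and the interrupt fires at 10 kHz, which is the situation described above.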

Combining steps is nearly the same as reducing stepper resolution. So it only makes sense if you want faster travel moves and print at lower speeds which do not trigger double stepping. If you print with double stepping anyway, reduce stepper resolution and take some load off the cpu, same result.

Now the Due is much faster. On my logic analyser I could see 95KHz with single stepping. But then most time is also used in interrupt. So now you can really go twice as fast as the avr with quadstepping and get proper positionings. Adding doublestepping at that rate is hard with delays. Assume 1us for distance between double steps and 1us for the 2 highs and 2us for interrupt routine then you can only get 200KHz. I think truth might be even worse.

G28 is different to G1 as it waits for the end of the move while G1 does not.

sigma957 commented 9 years ago

Thanks. Could you please elaborate a bit more regarding why combining steps is similar to reducing stepper resolution? I thought this is done only to lower interrupt frequency and cpu load, do you mean that combining steps on the time axis introduces aliasing?

repetier commented 9 years ago

That is quite easy. If you have 1 step every second, quadstepping makes it 4 fast repeated steps followed by a 4 second wait. So you get longer wait intervals and short intervals with more steps. The goal of double/quadstepping is to make these gaps long enough for the interrupt's computations to finish in time, so that at least the overall timing is correct.
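The example can be written down as a tiny timing model. It is a simplification that treats all steps of a burst as happening at the same instant, and the function name is made up for illustration.

```cpp
// Hypothetical model: time at which step number n (0-based) is issued
// when steps are grouped into bursts of stepsPerBurst. periodSec is the
// ideal spacing between single steps.
constexpr double stepTimeSec(int n, double periodSec, int stepsPerBurst) {
    return (n / stepsPerBurst) * stepsPerBurst * periodSec;
}
```

With a 1 second period and bursts of 4, steps 0 to 3 all go out at t = 0 and step 4 at t = 4, whereas with single stepping step 3 would go out at t = 3. The motor can therefore be up to 3 steps ahead of the ideal position, which is why the effect resembles a coarser stepper resolution.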

sigma957 commented 9 years ago

Thank you :)

lkarlslund commented 9 years ago

I have a new large delta (radius 550mm x height 620mm work area) running 0.92 on a RUMBA board (Atmel 2560). I have 160steps/mm (DRV8825 32-stepping), and going much faster than 100mm/s gets the watchdog triggered.

When the watchdog is triggered, the RUMBA board bootloops with the indication that the onboard LED on PIN13 flashes rapidly. Pressing the hardware RESET button does not clear the watchdog, and the only way out of this mess is by powering the board completely off and back on.

Also for complete disclosure I have a full graphic controller attached.

Disabling the watchdog solves the problem, but with a 1500W heated bed, I'd rather not have it disabled.

So two questions:

  1. Any idea as to how to solve the bootloop - burn another bootloader that clears watchdog?
  2. Is the 32-microstepping braindead (will the Repetier firmware combine them?). If so, how do I find the optimum microstepping setting?

Big fan of Repetier - fantastic work!

jamesarm97 commented 9 years ago

Sounded familiar, don’t know if it is related (from Atmel forum):

when a watchdog reset occurs, the watchdog timer stays enabled (as described on p.52 of the atmega 168 datasheet) - and this leads to the watchdog resetting again and again in the bootloader, requiring a hard powerdown by the user.

the current ATmegaBOOT_168.c now has the WATCHDOG_MODS part from the lilypad bootloader - but i think whether or not that is being used (skipping the bootloader code early on after a watchdog reset), the watchdog registers should be cleared anyway.

the attached patch is against 0013, but it should still apply with some offset to the latest (0017) source

jamesarm97 commented 9 years ago

https://code.google.com/p/arduino/issues/detail?id=181


repetier commented 9 years ago

@lkarlslund Normally the bootloader should clear the watchdog flag, which newer Arduino bootloaders do. Not sure what the Rumba bootloader does. A solution might be setting the watchdog to 4 seconds in HAL.h:

inline static void startWatchdog()
{
    wdt_enable(WDTO_1S); // change WDTO_1S to WDTO_4S for a 4 second timeout
};

I think 4S is the maximum here.

Repetier will use the 32 microstepping. The only thing is that with your size you are most probably breaking the 65000 steps diagonal size, so that it can not use the fast integer math and switches to float, which eventually reduces the real resolution. At least we had a discussion about whether you lose precision or not, but the outcome was not clear. To be sure you have to use 64 bit integer math, which is only supported on the Arduino Due + RADDS or FD RAMPS board. That would also allow higher stepper speeds without using double/quad stepping, which makes positioning more like 1/16 or 1/8 microsteps if you hit those speeds.