AnHardt / Marlin

Reprap FW with look ahead. SDcard and LCD support. It works on Gen6, Ultimaker, RAMPS and Sanguinololu
GNU General Public License v3.0

A journey into the depths of `MULTY_STEPPING` and `ADAPTIVE_STEP_SMOOTHING` #94

Closed AnHardt closed 4 months ago

AnHardt commented 3 years ago

and why I think both are currently broken.

Under construction !!!

What do these features do? Both try to keep the rate of stepper interrupts between a lower and an upper limit.

MULTY_STEPPING cares about the upper limit. A stepper interrupt takes some time to run, and while it runs no other code can, not even another stepper interrupt. So there is an upper limit to how often the stepper interrupt can run per unit of time. If we do only one step per stepper interrupt, this also limits the possible step rate. The idea of MULTY_STEPPING is to do more than one step per stepper interrupt, enabling the system to reach higher step rates. This works because batching steps saves overhead: with, for example, 'DOUBLE_STEPPING', each stepper interrupt (now doing two steps) takes longer than before, but less than two 'SINGLE_STEPPING' interrupts would. The more steps per interrupt, the more of the fixed per-interrupt overhead is saved.
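As a back-of-the-envelope illustration (the cycle numbers here are invented, not Marlin's real per-CPU estimates), the saving can be modeled as a fixed per-interrupt overhead plus a per-step cost:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cycle counts, NOT Marlin's actual values:
// each stepper ISR pays a fixed entry/exit overhead plus a cost per step.
constexpr uint32_t ISR_OVERHEAD_CYCLES = 100; // save/restore registers, timer reload, ...
constexpr uint32_t CYCLES_PER_STEP     = 60;  // Bresenham update + step pulse

// Total cycles needed to emit 'total_steps' steps when each ISR does 'R' steps.
constexpr uint32_t cycles_for(uint32_t total_steps, uint32_t R) {
  const uint32_t interrupts = total_steps / R;
  return interrupts * (ISR_OVERHEAD_CYCLES + R * CYCLES_PER_STEP);
}
```

With these numbers, emitting 1000 steps two at a time costs 110000 cycles instead of 160000, and the saving grows with the batch size, because each doubling of R halves the number of times the fixed overhead is paid.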

ADAPTIVE_STEP_SMOOTHING cares about the lower limit. When there are only a few stepper interrupts per unit of time, the CPU is idling most of the time, doing nothing useful. Instead, this feature increases the number of stepper interrupts and does less than one step per interrupt, thereby increasing the resolution. In a Bresenham line-drawing algorithm there is always one leading axis: the one with the most steps to go in that line/move. This axis is normally stepped in every stepper interrupt, and the time between interrupts is calculated for it. If the relation of the steps to do per axis is 'uneven', say 16 to 11 (0.69), the step pattern looks roughly like this:

```
X: 1111111111111111 -> 16
Y: 1101011010110101 -> 10
```

Because the algorithm works in integers, no half steps can be done. We have to alternate between doing two and one steps at a time. That can be noticed: the stepper with the shorter way runs a bit rough, and the step relation can't be matched exactly (here 10/16 = 0.62). When the resolution is doubled, the pattern for the same line looks like:

```
X: 10101010101010101010101010101010 -> 16
Y: 10010010010010010010010010010010 -> 11
```

The relation (0.69) can now be represented better, and the rhythm of the shorter axis is much more uniform. The higher the resolution, the better the representation of the relation and the more uniform the rhythm.

So if the rate of stepper interrupts is below the lower limit, we can double the number of stepper interrupts to double the resolution, that is, to 'smooth the steps' of the non-leading axes.

What do we pay with?

For MULTY_STEPPING we pay with the opposite of ADAPTIVE_STEP_SMOOTHING's effect: the rhythm of the steps becomes less uniform. For the leading axis, where the time between steps was perfectly uniform with 'SINGLE_STEPPING', we now get blocks of multiple steps following each other as fast as the stepper drivers allow. To the stepper motors this appears as if the field were jumping several micro-steps at once; the stepper runs as if a lower micro-stepping had been selected. Stepper drivers like the TMCs, which interpolate up to 256 micro-steps, have to predict the time to the next step pulse they will see. They do that by measuring the time between the last steps, and this fails when the rhythm is not uniform. Doing more micro-steps at once than one (or two) full steps represent results in step losses.

For ADAPTIVE_STEP_SMOOTHING we pay with higher reaction times for other processes and/or interrupts. While the stepper interrupt is running, nothing else happens. On the STs even lower-priority interrupts are interrupted by the stepper interrupt (not so on the AVRs). Everything is suspended unless it runs in hardware, like hardware PWM or DMA. We have to take care to leave at least some time for doing other things; otherwise the system becomes increasingly unresponsive until either a segment with a lower interrupt rate is worked on, or the hardware watchdog bites because its last refresh was too long ago.

How is this implemented? Let's walk through the relevant parts of stepper.cpp and stepper.h of the current Marlin-2.0.x. (In the hope that this will not change as fast as bugfix-2.0.x, at the risk that the laser changes altered relevant parts of the code here. I don't think so.)

Let's begin with the definition of the cycle counts for the different parts of the stepper interrupt, per CPU type. There is hopefully not much to say about these. They are estimations, hopefully on the high side, but not very differentiated; for example, saving the F4's registers takes longer when the FPU is used, because there are more of them. Then the number of processor cycles per step for a stepper interrupt taking R steps at once is defined, followed by defines for the maximum step frequency (for one step) when stepping in a stepper interrupt doing 1 to 128 steps at once. The most interesting one for ADAPTIVE_STEP_SMOOTHING is MAX_STEP_ISR_FREQUENCY_1X: one step per stepper interrupt.

The next lines define MIN_STEP_ISR_FREQUENCY. A comment says this should be 10% of the full processor load, but what is defined here is 100% of the full load. So with MIN_STEP_ISR_FREQUENCY = MAX_STEP_ISR_FREQUENCY_1X the processor can't do anything else but stepping. If the cycle counts for the parts of the interrupt were estimated too low, stepping at MIN_STEP_ISR_FREQUENCY will load the processor with more than 100%. Let's see later whether that can happen.
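A sketch of how such a maximum-frequency define follows from the cycle counts (all numbers here are invented placeholders, not the real per-CPU estimates from stepper.h): an ISR doing R steps costs a base part plus R times a per-step part, so the highest sustainable step rate is F_CPU * R divided by that total.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder values, NOT Marlin's real estimates.
constexpr uint32_t F_CPU_HZ        = 16000000; // e.g. an AVR at 16 MHz
constexpr uint32_t ISR_BASE_CYCLES = 1000;     // ISR entry/exit plus fixed work
constexpr uint32_t ISR_STEP_CYCLES = 300;      // work done per step

// Maximum step frequency when each stepper ISR emits R steps:
// more steps per ISR amortize the base cost, so the limit rises with R.
constexpr uint32_t max_step_isr_frequency(uint32_t R) {
  return (F_CPU_HZ * R) / (ISR_BASE_CYCLES + R * ISR_STEP_CYCLES);
}
```

The issue's point then reads: MIN_STEP_ISR_FREQUENCY is currently set to max_step_isr_frequency(1) itself, i.e. 100% load, rather than a tenth of it as the comment promises.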

Let's have a look at the startup phase of the stepper interrupt, where a new block was just grabbed and we prepare some local variables to work with later: the calculation of oversampling_factor in stepper.cpp, which is later used to shift left (increase) the step rate when calculating the counter values for the timer that schedules the next interrupt. Locally, oversampling is used to shift some counters left (increase them). These values are used for the whole block, including the acceleration and deceleration phases.

So what load do we end up with? We have a while-loop running as long as (max_rate < MIN_STEP_ISR_FREQUENCY), with an additional check whether the newly doubled max_rate would exceed MAX_STEP_ISR_FREQUENCY_1X. Only while both conditions hold do we increase oversampling. Because MIN_STEP_ISR_FREQUENCY = MAX_STEP_ISR_FREQUENCY_1X = the maximum possible system load, we end up somewhere between 50% and 100% of the processor's power, depending on the exact value of current_block->nominal_rate. So for the fixed "slow probing speed" it is always the same amount, as long as we don't change it deliberately. The processor load during probing is "randomly" determined by the processor, its frequency and the probing speed, and can be anywhere from 50% to 100%. (If we messed up the cycle estimations, maybe more than 100%: 50+x to 100+2x %.)
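A simplified sketch of that loop (the frequency limits are placeholders, simplified from stepper.cpp): with MIN equal to MAX_..._1X, as in current Marlin, the final rate always lands between 50% and 100% of the one-step-per-ISR maximum.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder limits, not Marlin's real values. Note that in current Marlin
// the lower limit equals the one-step-per-ISR maximum, which is the bug
// discussed above.
constexpr uint32_t MAX_STEP_ISR_FREQUENCY_1X = 10000;
constexpr uint32_t MIN_STEP_ISR_FREQUENCY    = MAX_STEP_ISR_FREQUENCY_1X;

uint8_t compute_oversampling(uint32_t max_rate) {
  uint8_t oversampling = 0;
  // Double the interrupt rate while we are below the lower limit and the
  // doubled rate would still be sustainable at one step per ISR.
  while (max_rate < MIN_STEP_ISR_FREQUENCY &&
         (max_rate << 1) <= MAX_STEP_ISR_FREQUENCY_1X) {
    max_rate <<= 1;
    ++oversampling;
  }
  return oversampling;
}
```

For example, a nominal rate of 5001 cannot be doubled without exceeding the limit (final load just over 50%), while 600 is doubled four times to 9600 (96% load): the resulting load is essentially arbitrary within that band.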

How to fix ADAPTIVE_STEP_SMOOTHING? In the light of how well it currently works, limiting MIN_STEP_ISR_FREQUENCY to only 10% of MAX_STEP_ISR_FREQUENCY_1X, as the comment suggests, seems extremely conservative.
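For reference, taking the existing comment literally would look like this (the numeric value is a placeholder, not Marlin's; a sketch for discussion, not a proposed patch):

```cpp
// Placeholder for the per-CPU value derived in stepper.h.
#define MAX_STEP_ISR_FREQUENCY_1X 10000u

// What the comment promises: a lower limit at 10% of the one-step-per-ISR
// maximum, instead of the current 100%.
#define MIN_STEP_ISR_FREQUENCY (MAX_STEP_ISR_FREQUENCY_1X / 10)
```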

Dynamic approach

Relation to SLOWDOWN

Fix for MULTY_STEPPING