new dda_queue issues - Githubissues

Wurstnase commented 7 years ago

Hi all,

looks like we need to investigate a little bit more time into the new code. http://forums.reprap.org/read.php?147,33082,732789#msg-732789

addon

board file: https://gist.github.com/Wurstnase/2a034ed8680162f345eb9de88ff2458b printer file: https://gist.github.com/Wurstnase/cb4abcb22ca4d6f43a0094fc2ab5579d gcode: https://gist.github.com/Wurstnase/f24c7bb3a58ddc553ae83f918f607231

Wurstnase commented 7 years ago

While searching this issue I've started to use the simulator. For any step done in dda_step I have a variable which counts this directly with += get_direction(dda, axis).

While this works on experimental, I've got a divisionzero error on master (floating point exception core dumped), which disappeared with "DDA: don't count individual axis steps."

phord commented 7 years ago

I found some div-zero exceptions in simulator recently, too. Not sure where they got fixed. Can you find the "fixing" commit with git bisect?

phord commented 7 years ago

I am seeing some trouble with movements running too far with one too many steps at the end. It didn't happen on master. I bisected it to this commit: 20a0808887074b08c4cff82b1ab2cc4bd8ced6e1. It improves with this commit: 1326db002f582c0ad67b58d8a8744a9387e2dbca, but it is still not perfect. Maybe this is the source of your problem, too.

You can see the problem with the simulator and config.default.h:

#include "config/board.gen7-v1.4.h"
#include "config/printer.mendel.h"

Here's how I tested it:

USER_CONFIG=config.default.h make sim && ./sim -o -t0 testcases/triangle.gcode && tail datalog.out

The final line should have zeroes in fields 2, 3 and 4. These columns show the end position of each axis in steps. In the "problem" commits, the Y axis steps to -1 after the five moves.

Here's the bad result. Notice the -1 for the Y position.

14386215 0 -1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1

Wurstnase commented 7 years ago

Maybe there is also an issue with the simulator itself. I just remember, that my testcode, the swan.gcode, have this issue also after ~15% of that code. Independent of this two commits.

phord commented 7 years ago

The div-zero problem happens because the simulator will catch it while the embedded processor will not. It usually represents a real bug since the error is usually not specific to the simulator. You can catch it in gdb with this:

gdb --args ./sim <simulator args>
(gdb)  run

Wurstnase commented 7 years ago

You can catch it in gdb

:+1:

rogram received signal SIGFPE, Arithmetic exception.
int_inv_sqrt (a=a@entry=0) at dda_maths.c:218
218   uint32_t q = ((uint32_t)(0xFFFFU / a)) << 8;
(gdb)

Wurstnase commented 7 years ago

Hmmm... looks like dda->n becomes 2**32-1.

phord commented 7 years ago

That SIGFPE is because something called int_inv_sqrt(0), and that function doesn't handle zero. That's because 1/sqrt(0) is infinity. But the code that calls this already checks for zero and handles it specially. Is the trouble that dda->n has counted more than 4 billion steps in a single movement? That seems unlikely, because other variables would have some trouble with such a movement, too.

phord commented 7 years ago

If you use my accel-integral branch, the int_inv_sqrt function is not called at all on the ACCELERATION_RAMPING builds. But then it does not handle lookahead yet, so, maybe it's not an option yet.

Wurstnase commented 7 years ago

The function is uint16_t int_inv_sqrt(uint16_t a). So 65536 becomes 0. But this shouldn't happen in that first part of code. I've run only the first 10 moves of the swan (80steps/mm). dda->n should be below 16bit at any time.

Wurstnase commented 7 years ago

Ok. I think I found a main-issues while testing. In one muldiv the divisor can become zero, when distance == 1 and delta_um[E] == 0.

Program received signal SIGFPE, Arithmetic exception.
0x0000555555556aec in muldiv (divisor=0, multiplier=0, multiplicand=1)
    at dda_maths.h:18
18    return muldivQR(multiplicand, multiplier / divisor,
(gdb) bt
#0  0x0000555555556aec in muldiv (divisor=0, multiplier=0, multiplicand=1)
    at dda_maths.h:18
#1  dda_create (dda=0x5555557614d8 <movebuffer+696>, 
    target=0x555555761714 <next_target+20>) at dda.c:433
#2  0x0000555555555706 in enqueue_home (t=0x555555761714 <next_target+20>, 
    endstop_check=0 '\000', endstop_stop_cond=0 '\000') at dda_queue.c:124
#3  0x0000555555559eba in enqueue (t=0x555555761714 <next_target+20>)
    at dda_queue.h:36
#4  process_gcode_command () at gcode_process.c:124
#5  0x0000555555558f79 in gcode_parse_char (c=10 '\n') at gcode_parse.c:387
#6  0x00005555555577ae in main (argc=4, argv=0x7fffffffdf88) at mendel.c:170
(gdb) frame 1
#1  dda_create (dda=0x5555557614d8 <movebuffer+696>, 
    target=0x555555761714 <next_target+20>) at dda.c:433
433         acc_ramp_len(muldiv(dda->fast_um, dda->endpoint.F, distance),
(gdb) p distance
$1 = 0

The other issue with the int_inv_sqrt currently not tested again. I changed the config from wolfstrap to mendel90 and I guess it's the huge steps/m-ratings.

Wurstnase commented 7 years ago

First picture is the current experimental. Any step is counted at xyze_step() and this picture is the difference between M114 and the counted steps. swan-diff-t4 The second picture is the current experimental with a manual revert of https://github.com/Traumflug/Teacup_Firmware/commit/20a0808887074 swan-diff-t5

Wurstnase commented 7 years ago

This is my testing condition: ~~https://github.com/Traumflug/Teacup_Firmware/commit/51090b126e580247~~ Hopefully it is complete. Put some things on my local gitignore.

Wurstnase commented 7 years ago

mendel90 and the swan are not compatible. (-100/100 on x-axis).

This are my current results. Code will follow soon. First is prefix. Second one with just a revert of how to count the steps in dda_step(). Be aware of the different axis-scales. swan-diff-prefix swan-diff-h3

Wurstnase commented 7 years ago

Funny things you can do with meshlab: prefix fix both

phord commented 7 years ago

Nice work. I was always hopeful the simulator could be used this way. I think gnuplot will show a similar rendering.

We should get some testcases in place to detect future regressions.

Wurstnase commented 7 years ago

Uploaded a much cleaner work of this. ~~DDA:testing steps~~ needs some rework for regression tests. ~~Update: https://github.com/Traumflug/Teacup_Firmware/commit/cf2ed5b642026~~ Update2: dda_queue_fix

I've ploted this first with gnuplot. But this has no renderings or shaded dots and looks not that nice. But you could use the same file also for this.

zungmann commented 7 years ago

I test this fix and now give a perfect result. So, good job and thanks.

Wurstnase commented 7 years ago

Thanks for testing.

Wurstnase commented 7 years ago

The test move_state.step_no >= dda->total_steps was a good idea. I've just brought this back. Save again around 35 clock cycles.

Wurstnase commented 7 years ago

Thanks @zungmann for a lot of testings. We found some issues in the current experimental. I made a clean rebase with more comment of the dda_queue_fix/2-branch and pushed this to experimental.

This issue should be solved now.

Traumflug / Teacup_Firmware

new dda_queue issues #260