Teacup and SimulAVR - Githubissues

Traumflug commented 11 years ago

I have Teacup running in SimulAVR. Took me quite some hacking on "foreign" code, 40598, 40638, 40594 and a few others, but I'm there, steppers run (invisibly):

src/simulavr -v -f ../Teacup_Firmware/build/teacup.elf
MESSAGE File to load: ../Teacup_Firmware/build/teacup.elf
MESSAGE Device name is atmega644
MESSAGE Connecting pin D1 as serial out to file - at 19200 baud.
MESSAGE Connecting file - as serial in to pin D0 at 19200 baud.
MESSAGE Running with CPU frequency: 20.000 MHz (20000000 Hz)
start
ok
M114
ok X:0.000,Y:0.000,Z:0.000,E:0.000,F:0
G1 X20 F200
ok 
M114
ok X:3.208,Y:0.000,Z:0.000,E:0.000,F:200
M114
ok X:13.069,Y:0.000,Z:0.000,E:0.000,F:200
M114
ok X:20.000,Y:0.000,Z:0.000,E:0.000,F:200
^C
SystemClock::Endless stopped
number of cpu cycles simulated: 372129894

Now it's almost trivial to attach with a debugger: http://reprap.org/wiki/SimulAVR#Attaching_a_debugger

Next step on the plan is to make pin signals visible, of course. Support is there, just very well hidden (they expect to write a Python wrapper application just to access the modules).

Traumflug commented 11 years ago

I've got it. Doing a G1 X20 Y10 F400 and watching Xdir, Xstep, Ydir and Ystep gives a trace like this:

g1 x20 y10 f400

Yikes! You can nicely see how the Bresenham algorithm works: Y does a step on every second X step.
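
The "Y steps on every second X step" pattern falls out of the Bresenham error term. Below is a minimal sketch of that idea, not Teacup's actual DDA code: the function name `bresenham_steps` and the small step counts (20 and 10 instead of real steps/mm values) are assumptions chosen for illustration.

```python
def bresenham_steps(dx, dy):
    """Yield (x_step, y_step) flags for each tick of the fast axis.

    Illustrative only: assumes dx >= dy >= 0 and that X is the fast axis.
    """
    error = dx // 2
    for _ in range(dx):
        error -= dy
        y_step = error < 0
        if y_step:
            error += dx
        # X steps on every tick, Y only when the error term underflows.
        yield (True, y_step)


pattern = list(bresenham_steps(20, 10))
# 1-based X step numbers on which Y also steps.
y_positions = [i for i, (_, y) in enumerate(pattern, start=1) if y]
print(y_positions)
```

With a 2:1 ratio between the axes, the error term underflows on every second tick, which is what the trace shows.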

Now there's a chance to find out why there are sometimes dropouts when running lookahead-code. Either by signal tracing or by running in the debugger.

P.S.: Instructions on how to do this are on the same wiki page.

Traumflug commented 11 years ago

@phord, is there a chance to merge these recorded VCD files with your plotting scripts? The format is pretty simple, a time stamp in one line, the value change in the next line:

Here b0 means "change to binary 0" and b1 means "change to binary 1". The number behind it is the number of the signal (can be a character, too). The numbers behind the # are the time stamps, in nanoseconds. This way even speed and acceleration could be calculated. Here's a brief description and a link to a more detailed description: https://en.wikipedia.org/wiki/Value_Change_Dump
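
To make the format concrete, here is a rough Python sketch that reads the value-change part of such a file into `(time, signal, value)` tuples. It assumes the header has already been stripped and that only 0/1 values occur. The function name `parse_vcd_changes` is made up, and the sample is reconstructed from the time stamps quoted further down in this thread, so treat it as an assumption about what the file looked like.

```python
SAMPLE = """\
#851288850
b1 1
#851297500
b0 1
#856879000
b1 1
#856887400
b0 1
"""


def parse_vcd_changes(lines):
    """Return a list of (time_ns, signal_id, value) tuples.

    Sketch only: expects the value-change section, with 0/1 values.
    """
    events = []
    now = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("$"):
            continue  # skip blank lines and keywords such as $dumpvars/$end
        if line.startswith("#"):
            now = int(line[1:])  # time stamp line, e.g. "#851288850"
        elif line[0] in "bB":
            value, signal_id = line[1:].split()  # vector change, e.g. "b1 1"
            events.append((now, signal_id, int(value, 2)))
        elif line[0] in "01":
            events.append((now, line[1:], int(line[0])))  # scalar change, e.g. "1!"
    return events


events = parse_vcd_changes(SAMPLE.splitlines())
for time_ns, signal_id, value in events:
    print(time_ns, signal_id, value)
```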

phord commented 11 years ago

@traumflug Nice job! I feel my re-imported simulator becoming less relevant already. :-)

I don't have an easy way to convert VCD to my gnuplot "all-values, multi-columns" format. But a simpler option would be to convert the VCD file to several single-pin files. So, let's say you have foo.vcd like your example:

Reformat this slightly and split it into two files named foo.b0 and foo.b1.

foo.b0:

851297500    b0 1
856887400    b0 1

foo.b1:

851288850    b1 1
856879000    b1 1

My gnuplot script could work with this. Just change the "plot" command to match the separate filenames and the data is in column $3 for every file:

plot 'foo.b0' u ($1/10000000):( $3*0.8 + 9)  with steps t 'b0-name' , \
     'foo.b1' u ($1/10000000):( $3*0.8 + 8)  with steps t 'b1-name'

Also, change the names to match whatever b0, b1, etc. really are.
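
The split into single-pin files could be scripted along these lines. This is a sketch under assumptions, not an existing tool: `split_vcd` is a hypothetical helper, it writes one file per signal id with two columns (time in ns, value), and it only handles the `b<value> <id>` lines shown in this thread. With that layout the value is in column 2, so the plot command would use `$2` instead of `$3`.

```python
from collections import defaultdict


def split_vcd(lines, out_prefix):
    """Write one '<out_prefix>.<signal_id>' file per signal; return the paths."""
    per_signal = defaultdict(list)
    now = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith("#"):
            now = int(line[1:])  # current time stamp in ns
        elif line[:1] in ("b", "B"):
            value, signal_id = line[1:].split()
            per_signal[signal_id].append((now, int(value, 2)))

    paths = []
    for signal_id, changes in per_signal.items():
        path = f"{out_prefix}.{signal_id}"
        with open(path, "w") as handle:
            for time_ns, value in changes:
                handle.write(f"{time_ns} {value}\n")
        paths.append(path)
    return paths
```

Signal ids in a VCD file may be punctuation characters, so the resulting file names can look odd; renaming them to the real pin names afterwards is probably the friendlier option.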

phord commented 11 years ago

Also, gtkwave can read and show VCD files natively. I don't grok its user interface, but you might give it a shot.

Cyberwizzard commented 11 years ago

@Traumflug are those rendering glitches in that VCD trace where there is a thicker vertical bar or multiple pulses?

Traumflug commented 11 years ago

are those rendering glitches in that VCD trace where there is a thicker vertical bar or multiple pulses?

Yes, these are just glitches. If you zoom further in, they resolve nicely. This picture zooms into such a pulse extremely:

single step

You can see clearly how the Y step is done after the X step due to the calculations in between. Even the un-steps come 2 CPU cycles apart.

Traumflug commented 11 years ago

Also, gtkwave can read and show VCD files natively.

Yes, it can. The pictures are done with it. But AFAIK there's no facility to calculate derivatives like speed or to sum up the steps into a position.

phord commented 11 years ago

On Nov 21, 2013 7:01 AM, "Traumflug" notifications@github.com wrote:

Also, gtkwave can read and show VCD files natively.

Yes, it can. The pictures are done with it. But AFAIK there's no facility to calculate derivatives like speed or to sum up the steps into a position.

I wasn't able to make gnuplot calculate derivatives either. I tried to calculate the acceleration in the code during simulation, but these instantaneous calculations were inconsistent. Python might be your friend.
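
A rough idea of what the Python route could look like: take the time stamps of the rising edges of a step signal and turn the interval between neighbours into a step rate. The helper names `step_rates` and `to_mm_per_min` are invented for this sketch, and the steps-per-mm value is a parameter the caller has to supply from the firmware configuration.

```python
def step_rates(step_times_ns):
    """Return (time_ns, steps_per_second) pairs from rising-edge time stamps.

    Each rate is derived from the interval to the previous step, so the
    result has one entry fewer than the input.
    """
    rates = []
    for earlier, later in zip(step_times_ns, step_times_ns[1:]):
        rates.append((later, 1e9 / (later - earlier)))
    return rates


def to_mm_per_min(steps_per_second, steps_per_mm):
    """Convert a step rate into a feedrate in mm/min."""
    return steps_per_second / steps_per_mm * 60.0


# Rising edges 1 ms and then 0.5 ms apart (made-up numbers).
for time_ns, rate in step_rates([0, 1_000_000, 1_500_000]):
    print(time_ns, rate, to_mm_per_min(rate, steps_per_mm=1280))
```

Because every value comes from a single interval, the output is as jittery as the step timing itself; any smoothing would have to be added on top.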

Traumflug commented 10 years ago

I've just completed (hopefully) the instructions on how to work with the simulator: http://reprap.org/wiki/SimulAVR . It's easy. Prepared Teacup sources are on the simavr branch, I'll pick them over soon.

Now, I guess, I have to mill a few PCBs, else my customers will get nervous ;-)

Traumflug commented 10 years ago

Just picked the simavr branch over onto the experimental branch. Branch simavr is obsolete and gone.

Traumflug commented 10 years ago

I've added a rough script for running testcases in SimulAVR onto the experimental (and cross) branch. It's in testcases/. So far it extracts (using, uhm, good old awk) position and single axis speed at each time. Unless I messed up or SimulAVR doesn't work as expected, the results are, well, interesting (red is position, green is X feedrate, blue is Y feedrate):

triangle

Of course, X and Y axis should be scaled evenly and feedrates should have their own scaling. Moves along X shouldn't be overdrawn by the axis system. Also feedrates are dependent on the X position, not on time. Undoubtedly there are better ways to display such data, but it's a start.

Very well visible is how stepping gets pretty jaggy above 10'000 steps/s.

Oh, and prepare for some patience. Running a 60 second simulation with thousands of pin changes takes something like 10 minutes on a commodity PC.

Running the script is simple:

cd testcases
./run-in-simulavr.sh

... and all the files, including the PNGs, will appear. If you're just after improving the postprocessing, I'd comment out the simulavr step in the script after the first run.

phord commented 10 years ago

I wonder if it really gets that jaggy, though. I tried to calculate feedrate and acceleration at run-time in the simulator and I got similar results. But I chalked them up to signal noise. On the other hand, I was unable to get rid of them with 5-sample averaging. Maybe it's really jaggy.
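
For reference, a 5-sample average of the kind mentioned here can be sketched as below. This is a generic moving average, not the code from the simulator, and `moving_average` is a name chosen for this example.

```python
def moving_average(values, window=5):
    """Return the running mean over the last `window` samples.

    The first result appears once `window` samples are available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averaged = []
    running = 0.0
    for index, value in enumerate(values):
        running += value
        if index >= window:
            running -= values[index - window]  # drop the sample leaving the window
        if index >= window - 1:
            averaged.append(running / window)
    return averaged
```

A disturbance that repeats regularly with a period longer than the window will survive such a filter, which would fit the observation that the averaging did not remove it.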

I have a script that plots x/y coordinates for the datalog, too. It looks like this:

gnuplot --persist -e "plot 'datalog.out' u 2:3 with points"

I added it to the comments at the top of the datalog trace files recently (not pushed). Another version will show a 3D version with time as the Z-axis. That can be pretty informative.

I've been testing like this:

./sim testcases/smooth-curves.gcode -pg -oramping.trace

For some reason, though, RAMPING takes 15 times longer to simulate than TEMPORAL. The simulated time is not much different (90 seconds vs 62 seconds), but the actual run-time is hugely different (1000 seconds vs. 63 seconds). It pegs my CPU to 100%, too. :-\

Traumflug commented 10 years ago

But I chalked them up to signal noise.

"Noise"? Uhm, delays are a result of some digital calculation, not of an analog measuring probe.

After sleeping over it, I think the speed calculation converter should also create a secondary VCD file with speeds in it. GTKWave can view such stuff as analog signal. Then one can zoom in. The tricky part here is to get events sorted.

Please don't forget, all this displaying stuff isn't for enjoying graphics, but to find out why lookahead still doesn't work. No advance on this front despite countless hours of work.

phord commented 10 years ago

But I chalked them up to signal noise. "Noise"? Uhm, delays are a result of some digital calculation, not of an analog measuring probe.

Not electrical noise, but the interference caused by running the simulator on a multi-threading linux computer, inaccuracies in reading timers, etc. The noise in your plot looks very regular like something specific messing with it. If it's in the Teacup code, it looks squashable. But it might just be rounding "errors" from the integer division.

I'm not sure what all the feedrate lines are in your plot, though.

After sleeping over it, I think the speed calculation converter should also create a secondary VCD file with speeds in it. GTKWave can view such stuff as analog signal. Then one can zoom in. The tricky part here is to get events sorted.

Yeah, I do this in data_logger.c in the simulator, but I don't have it in VCD format. I don't think you need a secondary VCD file, though. You should be able to track that analog info in the same VCD file where you track the binary signals. I'm not sure of the format, but I believe it's the same as if you separate it into a different file.
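
A sketch of how binary and analog data might share one VCD file: the format has a `real` variable type whose changes are written as `r<value> <id>`. The function `build_vcd`, the signal names `X_step` and `X_speed`, the scope name and the id characters `!` and `%` are all assumptions made for this example, and the output has not been checked against GTKWave here. Sorting the combined event list takes care of the event ordering.

```python
def build_vcd(step_changes, speed_samples):
    """Return VCD text with one 1-bit wire and one real-valued signal.

    step_changes:  list of (time_ns, 0 or 1)
    speed_samples: list of (time_ns, float)
    """
    # The middle tuple element keeps the wire ahead of the real value
    # when both change at the same time stamp.
    events = [(t, 0, f"{value}!") for t, value in step_changes]
    events += [(t, 1, f"r{speed:.6g} %") for t, speed in speed_samples]
    events.sort()

    lines = [
        "$timescale 1ns $end",
        "$scope module teacup $end",
        "$var wire 1 ! X_step $end",
        "$var real 64 % X_speed $end",
        "$upscope $end",
        "$enddefinitions $end",
    ]
    last_time = None
    for time_ns, _, text in events:
        if time_ns != last_time:
            lines.append(f"#{time_ns}")  # emit each time stamp only once
            last_time = time_ns
        lines.append(text)
    return "\n".join(lines) + "\n"


print(build_vcd([(0, 0), (1000, 1), (2000, 0)], [(1000, 500.0)]))
```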

Please don't forget, all this displaying stuff isn't for enjoying graphics, but to find out why lookahead still doesn't work. No advance on this front despite countless hours of work.

I'm starting to think that single-axis lookahead is not entirely feasible. I think you know this already and have alluded to it, but I'm slow to catch on. I think I see some novel solutions to the problem in the exponential curve solution I'm investigating, though it's not directly related to the velocity curve.

Traumflug commented 10 years ago

The noise in your plot looks very regular like something specific messing with it.

Yes. For example the clock and analog input (temperature) interrupts. Also quite a number of atomic operations, like the one in setTimer().

I'm not sure what all the feedrate lines are in your plot, though.

It's speed based on position. If an axis moves back and forth, you get two feedrate lines on top of each other. The exercised G-code is triangle.gcode in testcases/. All the stuff required to reproduce this is in the repo. Steps/mm is 1280 (M8-driven axes). Example: G1 X20 means a move to the position of 25600 steps, exactly where one of the trapezoids comes down.

Only feedrate scaling is quantitatively wrong. For some unknown reason time scaling in the VCD file is closer to an ATmega running at 4 MHz than to one running at 20 MHz.

I'm starting to think that single-axis lookahead is not entirely feasible.

It is. Marlin demonstrates this. Teacup's theory is fine. Teacup's code has a bug. Running buggy software can't prove a theory to be wrong, so making a headache about additional theories can't solve the current state of misbehaviour. :-)

One of my suspects is a sequence of straight moves. Like G1 X5; G1 X10; G1 X15; G1 X20. While they can undoubtedly be moved at full speed, I could swear I hear short dropouts at the joints.

But one can't reproduce or communicate listening experiences, hence all this simulator stuff.
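
One way such dropouts could be made visible from a recorded step trace is to look for step intervals that are much longer than the typical one. The sketch below is only an idea: `find_dropouts` is a hypothetical helper, the factor of 3 is an arbitrary threshold, and comparing against the median only makes sense for a stretch that is supposed to run at constant speed (acceleration ramps would trigger it, too).

```python
from statistics import median


def find_dropouts(step_times_ns, factor=3.0):
    """Return (time_ns, interval_ns) for steps that arrive unusually late."""
    intervals = [b - a for a, b in zip(step_times_ns, step_times_ns[1:])]
    if not intervals:
        return []
    typical = median(intervals)
    return [
        (step_times_ns[index + 1], interval)
        for index, interval in enumerate(intervals)
        if interval > factor * typical
    ]


# Made-up time stamps: evenly spaced steps with one long gap.
print(find_dropouts([0, 100, 200, 300, 900, 1000, 1100]))
```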

phord commented 10 years ago

I'm starting to think that single-axis lookahead is not entirely feasible.

It is. Marlin demonstrates this. Teacup's theory is fine. Teacup's code has a bug. Running buggy software can't prove a theory to be wrong

It may seem to you that I'm coming to this belief by looking at your lookahead code as you are doing, but this is not the case. And you may have misinterpreted my statement as something stronger than what I meant. That's my fault. Let me revise it:

I'm starting to realize that my understanding of single-axis lookahead resolution was wrong. It generates exceedingly bad positions in most cases. Minding the direction and velocity delta of the other axis/axes can be used to limit the wrongness of the move merging on each axis.

so making a headache about additional theories can't solve the current state of misbehaviour. :-)

I stumbled onto this "additional theory" somewhat accidentally. I wasn't trying to solve the current misbehavior, though I was trying to understand the math of the end-goal.

It's also likely that I'm very wrong on this. Either way, I'm not trying to discourage you. Carry forward, brother. I'm looking at improving the PC-based simulator to try to help the analysis. But I'm short on time this week, so I'll be a little slow.

Cyberwizzard commented 10 years ago

The PC based simulation is a good thing to have in general; when I was working on the initial version of the lookahead I surmised that it was possible to do multiple runs over the movement queue in order to obtain higher speeds where possible. The problem I faced was that, since I had no previous experience with ATmegas at this level, I could not estimate whether the CPU has the time for such extra work.

If we can use a cycle accurate simulation, it helps with debugging these issues but can also point to performance bottlenecks. A great example is the code reduction with loops which reduced the binary size but might lower performance as well (as was observed in the issue thread on Github).

I am currently pressed for time but I hope I can help out in the near future with this as well.

On a side note: the new G codes that should control lookahead and path interpolation look interesting. However I would not use path interpolation on my printer as it will change the object that is being printed (minor deformations originating from our current lookahead aside). But if this is configurable behaviour then I'm just quite curious if our 8-bit microcontroller can actually do this in real time.

Traumflug commented 10 years ago

I'm starting to realize that my understanding of single-axis lookahead resolution was wrong. It generates exceedingly bad positions in most cases.

Positions are precisely the same, with or without the current version of lookahead. Jerk based lookahead is all about speeds, the algorithms doing the positional maths aren't touched.

However I would not use path interpolation on my printer as it will change the object that is being printed

The same would be true for jerk (and the approach to split curves into straight segments, to start with). Jerk means: let the printer bend instead of doing a controlled curvature.

But if this is configurable behaviour then I'm just quite curious if our 8-bit microcontroller can actually do this in real time.

It can. The question is: up to which feedrates?

That said, Teacup is ported to ARM already, so you can have a faster CPU, too.

phord commented 10 years ago

Positions are precisely the same, with or without the current version of lookahead. Jerk based lookahead is all about speeds, the algorithms doing the positional maths aren't touched.

If you keep the same positional data and only vary the speed, then you're back to needing a dead stop any time the other axis changes direction in order to match that axis, right?

Traumflug commented 10 years ago

If you keep the same positional data and only vary the speed, then you're back to needing a dead stop any time the other axis changes direction in order to match that axis, right?

In a physically sane manner: yes. In terms of jerk-based lookahead: no. Teacup features jerk-based lookahead.

Looks like I'm getting used to scripting VCD files. Nice acceleration ramps, now speed over time:

velocities

velocities close

As you can see, even a feedrate in mm/min is given if you just zoom in close enough. As you can also see: lookahead has zero effect, despite G-code sent as fast as possible and everything happening on the same axis. I guess I'm getting closer to the problem, now I can communicate it. All code on the experimental branch, the G-code run in the above picture is "straight-speeds.gcode".

Regarding viewing analog data in GTKWave: To get such an analog view, drag the signal over into the Signal column, then

(important!) left click to select it
right click -> Data Format -> Analog -> Step
right click -> Data Format -> Analog -> Resizing -> All Data
right click -> Insert Analog Height Extension.

phord commented 10 years ago

The top waveform is acceleration, right?

phord commented 10 years ago

@Traumflug Try the simulator here (now on experimental): fe8854aa91d4faf477a921189c9ac6c4121fdf4e

make -f Makefile-SIM && ./sim testcases/smooth-curves.gcode -p -otime0.trace -g -t0

"-t0" at the end turns off the real-time timers and runs in warp-speed mode.
-p shows positions on the console as it updates
-g shows the gcode on the console
-o generates the trace file, in this case named 'time0.trace'

Now try this command:

gnuplot --persist -e "set pointsize 0.01; plot 'time0.trace' u 2:3:1 with points"

It's easy to overlay two different plots to eyeball the differences. But it's not that easy to view when they are close but different.

screenshot from 2013-11-26 22 21 57

The timer seems pretty accurate, but the whole thing is idealized. Look at the end of the trace file and you'll see it finishes around 63 seconds (TEMPORAL) or 108 seconds (RAMPING w/ 1000). But real-world effects such as interrupts and code delays are completely erased. This is just an idealized plot of DDA. But it runs in under 10 seconds, and it's nice for comparing the effect of DDA changes.

It helped me find two bugs in my looping code.

Traumflug commented 10 years ago

The top waveform is acceleration, right?

X_steps are the single steps. X_steps/s are velocity in steps/s, X_mm/min is also velocity, but in mm/min (it extracts STEPS_PER_M_X from config.h).

Traumflug commented 10 years ago

The timer seems pretty accurate, but the whole thing is idealized.

Thanks for the instructions on how to use the host side simulator. 63 seconds matches real world pretty exactly; TEMPORAL is just a draft, completely unverified code.

Now, is it possible to show speed/acceleration somehow? What do they show with straight-speeds.gcode?

phord commented 10 years ago

I think I'm doing something wrong, but I don't know what yet.

straight-speeds takes 12 virtual seconds, regardless of ACCELERATION (10 or 1000). That seems odd.

The plot looks ok.

screenshot from 2013-11-27 18 43 33

But when I try to plot the velocity, it looks like I get acceleration instead. Here's dx/dt:

screenshot from 2013-11-27 18 41 26

Here's dy/dt:

screenshot from 2013-11-27 18 40 50

I'm contorting gnuplot into doing this. Here's what I've done:

 ./sim testcases/straight-speeds.gcode --tracefile=straight.trace --time-scale=0 --gcode

 # Isolate traces showing changing 'X' values and the time it changed
 cut "-d " -f1,2 straight.trace| uniq -f1 > straight.x

 # Isolate traces showing changing 'Y' values and the time it changed
 cut "-d " -f1,3 straight.trace| uniq -f1 > straight.y

gnuplot
d2(x,y) = ($0 == 0) ? (x1 = x, y1 = y, 1/0) : (x2 = x1, x1 = x, y2 = y1, y1 = y, (y1-y2)/(x1-x2))
gnuplot> plot  'straight.trace' u 2:3 with lines t 'position'
gnuplot> plot 'straight.x' u ($1/1000000000.0):(d2($1/1000000000.0,1.0*$2)) with lines t 'velocity', '' u ($1/1000000000.0):($2)  with lines t 'position'
gnuplot> plot 'straight.y' u ($1/1000000000.0):(d2($1/1000000000.0,1.0*$2)) with lines t 'velocity', '' u ($1/1000000000.0):($2)  with lines t 'position'

phord commented 10 years ago

Sorry, I said that wrong. What I meant was I am expecting to see ramping acceleration and curvy position, but I am seeing linear position changes instead. Acceleration is constant, as graphed.

But I don't know why.

Traumflug commented 10 years ago

As you can see in #66, I tried to get a data file to experiment myself, ...

... without this: apparently acceleration got lost for you. Is it possible you use ACCELERATION_TEMPORAL? Despite its name, this variant has no acceleration, yet.

phord commented 10 years ago

It was a combination of things. I switched to the default config for testing, but the MAX_FEEDRATEs are too low to see much acceleration. Once I bumped those way up, I had to drop acceleration way down also to see each movement affected.

gnuplot> set terminal png
gnuplot> set title 'Plot'
gnuplot> set output '/tmp/plot.png'
gnuplot> plot  [-1000:35000] 'straight.trace' u 2:3 t 'position' with lines

plot

gnuplot> set title 'X-axis'
gnuplot> set output '/tmp/x-axis.png'
gnuplot> plot 'straight.x' u ($1/1000000.0):(d2($1/1000000.0,$2)) with lines t 'velocity', '' u ($1/1000000.0):($2)  with lines t 'position'

x-axis

gnuplot> set output '/tmp/y-axis.png'
gnuplot> set title 'Y-axis'
gnuplot> plot 'straight.y' u ($1/1000000.0):(d2($1/1000000.0,$2)) with lines t 'velocity', '' u ($1/1000000.0):($2)  with lines t 'position'

y-axis

Traumflug commented 10 years ago

I've just rebased my SimulAVR fork here on Github to the latest SimulAVR master. Appears to work fine and should work now on OS X, too.

Traumflug commented 10 years ago

Just added instructions on how to precisely measure code performance: http://reprap.org/wiki/Teacup_Firmware#Doing_precision_timing_measurements

bildschirmfoto vom 2014-06-15 23 06 44

This should help a lot to decide whether a code change is worth its CPU time or not. I felt I needed this because these looping changes also change the code in dda_step(), which is very time critical.

Preliminary result: we're pretty good with 306 to 328 clock cycles/interrupt, which would theoretically allow 20'000'000 Hz / 328 = 60.975 kHz step rate, but when forwarding from one movement to another, dda_start() apparently bogs this down to over 700 clock ticks. So our maximum corner speed would be about 27.8 kHz. Not bad either.

$ ./run-in-simulavr.sh short-moves.gcode
Assuming pin configuration for a Gen7-v1.4 + debug LED on DIO21.
[...]
Statistics (assuming a 20 MHz clock): 
LED on occurences: 838.
Sum of all LED on time: 262055 clock cycles.
LED on time minimum: 306 clock cycles.
LED on time maximum: 717 clock cycles.
LED on time average: 312.715 clock cycles.
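
The step rate estimate is just the CPU clock divided by the cycles spent per step interrupt. A small sketch to redo that arithmetic for the numbers in the statistics above; the helper name `max_step_rate_khz` is made up, and the 20 MHz clock is the one the script output assumes.

```python
F_CPU_HZ = 20_000_000  # clock assumed by the statistics output


def max_step_rate_khz(cycles_per_interrupt, f_cpu_hz=F_CPU_HZ):
    """Upper bound on the step rate if the interrupt used the whole CPU."""
    return f_cpu_hz / cycles_per_interrupt / 1000.0


for label, cycles in (("minimum", 306), ("typical upper", 328), ("maximum", 717)):
    print(f"{label}: {cycles} cycles -> {max_step_rate_khz(cycles):.2f} kHz")

# Cross-check of the reported average: total LED-on cycles / occurrences.
average_cycles = 262055 / 838
print(f"average: {average_cycles:.3f} cycles")
```

These are theoretical ceilings: they leave no CPU time for anything besides the step interrupt.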

P.S.: code is on my local experimental branch already, but some parts untested, so it'll take a day or two until it appears in the repo.

Traumflug / Teacup_Firmware

Teacup and SimulAVR #62