Delta Printer Support - Githubissues

NickE37 commented 9 years ago

I have been familiarizing myself with the source code so that I can add Delta Printer support to Teacup.

The biggest issue I see is with micrometer scales and 32 bit integers. The conversion of cartesian coordinates to tower movement is (for each tower):

delta_tow_a_pos=sqrt(arm_len^2 - (tow_a_loc_x - cart_x)^2 -(tow_a_loc_y - cart_y)^2) + cart_z delta_tow_b_pos=sqrt(arm_len^2 - (tow_a_loc_b - cart_x)^2 -(tow_a_loc_b - cart_y)^2) + cart_z delta_tow_c_pos=sqrt(arm_len^2 - (tow_a_loc_c - cart_x)^2 -(tow_a_loc_c - cart_y)^2) + cart_z

Typical values for delta printers overflow 32 bit integers when squared.

For example, a typical arm_len is between 170mm to 300mm (170000um to 300000um). Squared that is 28,900,000,000 to 90,000,000,000, where 2^32 is 4,294,967,296. [I apologize for being pedantic, I am trying to document the problem here, not insult you.]

For Teacup, what is the best way to handle this problem?

64 bit integers for the calculations then back to 32 bit?
Floats for the calculations then back to 32 bit?
Scaling the numbers down for the calculation (a factor of 10?), then back to um to return the results?
Something else?

This problem is also being massively compounded by having the typical cartestian move chopped into small segments.

Your guidance would be appreciated.

NickE

Traumflug commented 9 years ago

I apologize for being pedantic

Being pedantic is the best way to write good code and very welcome! Digital CPUs are pedantic, too, after all.

A few bits:

64 bit integers are certainly better than floats, because floats are 23 bit only.
If scaling is possible, that's usually a good way. Not neccessarily by 10; often 8 or 16 is better, because this can be done by bit-shifting (~100 times faster than an arbitrary division).
Current assumption is largest X, Y, Z to be +-1000mm (1'000'000 um) and largest steps/mm to be 40'960. Ideally it fits into this window, but limitations are possible, of course.
Math with constants doesn't matter, because this is done in the preprocessor. See preprocessor_math.c for an extremely large formula which melts down to just one 32-bit constant in the compiled binary.
Accordingly it's a good idea to try to separate constants from variables. Arm length and tower locations are constants.
Math can help tremendously. See for example the acceleration formula: https://github.com/Traumflug/Teacup_Firmware/blob/master/dda.c#L893 It's inprecise for n < 3, which doesn't matter at all.

NickE37 commented 9 years ago

I have made a pass at adding delta support to the Mega 2560/Ramps 1.4. I also did a preliminary homing routine. The delta kinematics almost move correctly, but there are precision problems on the order of tenths of millimeters in repeatability. I think (but I am not sure) that the precision problems are related to the scaling that is being used to keep inside the 32 bit limits. It is also possible that the repeatability is related to the homing routine. I really wanted to do a test print, but repeated calibration measurements at the same spot would vary +/- 0.3mm or so. I am continuing to investigate. Other approaches to delta motion and opportunities for optimization of the delta routines abound.

The segmentation of the moves into pieces makes it incompatible with the Ramping Acceleration regardless of the MOVEBUFFER_SIZE or ACCELERATION. The motion was stuttering between each delta segment the same as with no acceleration. Maybe I do not understand the Ramping Acceleration algorithm enough. The Reprap Acceleration does acceptably smooth the motion. Further experiment is warranted.

I am a complete novice at github, so I do not know the procedure for making this work visible to you, if you are interested. Should I fork your repository to my own, or do you make me a collaborator, email you a zip file, or something else?

Traumflug commented 9 years ago

Sounds great!

Should I fork your repository to my own, or do you make me a collaborator, email you a zip file, or something else?

I just made you a collaborator, welcome to Teacup development. So you can adjust your local repo now to be read/write. Simple rule is: create a new branch for your work. Ideally starting from the current experimental branch. And create as many branches as you feel neccessary, there's no shortage.

If you're not that experienced in Git, do the following in the old repo:

git checkout <your branch>
git format-patch --keep-subject <your branch>~20

This will create a patch series, 20 patches in this case. Then you can make another clone of the Github repo, which will be read/write by default now. Create your branch there as well, then apply the neccessary patches with "git am".

The segmentation of the moves into pieces makes it incompatible with the Ramping Acceleration regardless of the MOVEBUFFER_SIZE or ACCELERATION.

Default Jerk is just 20 mm/min, not very much. On belt driven axes, 200 or even 1000 might work well. ACCELERATION_REPRAP is an entire different strategy, basically F doesn't give the speed during movement, but at the end of the movement. With this strategy, G-code generators have to send three lines of G-code for each movement, one for acceleration, one for constant speed and one for deceleration. The idea behind that is to calculate lookahead on the host; a good idea, which nevertheless never got a foothold.

If your segments are very short you also might run into the fact that lookahead currently looks only one movement back. dda_join_moves() should be called repeatedly (here https://github.com/Traumflug/Teacup_Firmware/blob/master/dda.c#L419) stepping backwards through the queue until there's no more to optimise. dda_join_move() has already protection against not changing movements already being executed and also returns a flag wether stepping backwards should continue, so it's just a matter of doing these calls.

Regarding repeatability: scaling can expose inaccuracies, but usually these inaccuracies are always exactly the same for the same movements. Nice thing is, there's int_sqrt() in dda_math.c which should be expandable to 64 bits and @dc42's firmware also has a 64 bit square root here: https://github.com/dc42/RepRapFirmware/blob/dev/DriveMovement.cpp#L335

Currently I'm occupied with cleaning reprap.org's mailing lists (incredibly inefficient tools), I hope I can return to contributing here soon.

Traumflug commented 9 years ago

P.S.: if lookahead still doesn't work (BTW., did you turn on LOOKAHEAD in config.h?), first step would be to look at the calculated crossing speeds. See dda_find_crossing_speed() in dda_lookahead.c. Debug messages are already there, you just have to turn them on by commenting out the if() in front of them.

NickE37 commented 9 years ago

I had originally based my branch on the master branch, I so cloned the experimental branch and added my delta kinematics changes to the experimental branch before pushing it to the repository.

I located one (and hopefully only) source of repeatability issues. I was scaling delta geometry parameters before doing trig on them rather than after. The z-axis was losing 2 steps every 10mm of travel in the um_to_steps calculation. [Forehead slap!] I fixed that issue. So far, it does not look like 64 bit will be necessary.

I tried ACCELERATION_RAMPING (yes, I did have LOOKAHEAD enabled before and now) again. With MAX_JERK at 200, the motion was not stuttering, but not totally smooth. I will try higher values for jerk soon. Above you mentioned that "dda_join_moves() should be called repeatedly". Did you mean literally to put calls to dda_join_moves() into a loop <= to the number of moves in the queue?

I hope to calibrate and do a test print tonight.

Traumflug commented 9 years ago

Nice chunk of work!

Did you mean literally to put calls to dda_join_moves() into a loop equal to the number of moves in the queue?

Yes. Reasons to leave the loop include:

Crossing speed was zero.
Previous move wasn't actually a move (but a wait for temp or F only ("nullmove")).
Previous move ends at crossing speed.
Previous move is executed already (so it can't be changed any longer).

I just see this flag wether dda_join_moves() could do anything useful got lost at some point, so it needs to be added again. Most of these detections are present, but return without flag.

NickE37 commented 9 years ago

I have increased the x,y,z jerk to 400 and it seem much smoother.

I am still having a problem where it is loosing mechanical z-height with every move. For example: I home the printer (cartesian coordinates 0,0,200), then back and forth from cartesian (0,0,5) and (0,0,100) allowing the printer to finish each move before issuing the next. The effector will crash into the bed on the 5th or 6th cycle (sometimes the 4th). A similar experiment cycling moves between any two points will have the effector crashing after a number of cycles.

steps[] has the correct values at the end of each move even after hundreds of cycles, so I conclude that the math calculations are correct.

This is definitely not a mechanical issue, since Marlin does not exhibit this behavior.

What else can be causing this issue?

Could it be skipping/dropping steps?
Something in the control of the motors - enabling/disabling, poweron/poweroff? I noticed that the z-motor is enabled differently from x,y, and e. Could the carriages just be slipping down the towers between moves?
Something else?

I am baffled and any suggestions would be welcome.

Traumflug commented 9 years ago

Could it be skipping/dropping steps?

That's always worth a consideration, of course. There's a z_disable() in dda.c approx. line 725. I think you can safely remove this. Other than that steppers are only turned off at power down, some 30 seconds after the last move.

Also a likely cause for step losses is when steppers receive steps too fast in some situations. Like messed up acceleration calculation or something like this. Here is the simulator very handy. You have to reduce the testcase to some 15 lines of G-code and can't home, but you'd see the speed spikes corresponding to too fast stepping immediately. Instructions in the wiki, I hope they're sufficient: http://reprap.org/wiki/SimulAVR#Installation and http://reprap.org/wiki/Teacup_Firmware#Teacup_in_SimulAVR

NickE37 commented 9 years ago

I am in the process of setting up the simulator to try to see what is going on.

I do believe that the system might be choking on the huge number of moves generated by the delta segmentation method being used (essentially the same as Marlin's - cartesian space). That approach seems counter-intuitive since that would generate more segments at lower speeds.

I am exploring different strategies that might help. Example strategies include:

segmenting based on the longest axis move in delta space
segmenting based on maximum allowable cartesian z-error in xy cartesian moves
throttling velocity based on maximum step rates
multi-stepping

Quick question: In the discussion on the RepRap forum (http://forums.reprap.org/read.php?147,33082,469440#msg-469440), you state Teacup can achieve 48,000 steps/sec on 8-bit. That implies a step timing of about 21us/step maximum. Can that be achieved each for all four axes of movement simultaneously, or is that the combined axis total?

Traumflug commented 9 years ago

Not to turn down your great work, more to give you a bit an idea in which direction to change things when searching for new algorithms: ideally delta movements would get away without any segmenting. dc42 firmware does this, it re-adjusts speed of each axis on every step (and apparently brings even 32-bit CPUs to their limits this way). Instead of re-adjusting after every step it'd be better to do so in dda_clock(), which is executed every 2 milliseconds.

Total number of steps stay the same during a move even on a Delta. Accordingly you'd have to re-adjust these Bresenham deltas to give the current movement direction and after doing so speed of the fastest axis. One could even get away without Bresenham, like dc42. ACCELERATION_TEMPORAL is a try in this direction.

However, there's currently no prior art on this, so implementing a Marlin-like approach is certainly easier. And undoubtly having something working at all is better than super future code only half done.

throttling velocity based on maximum step rates

Maximum feedrate limitation works reliably in Teacup. It takes each axis individually into account.

multi-stepping

Likely pointless in Teacup. The additional code required likely costs more than what double-stepping could achieve. Quad-stepping speeds up perhaps a bit, but you also open a whole can of worms with such stuff regarding step accuracy. Like: what to do when you're in quadstep mode and there are only 3 steps to move left? Such situations need to be handled, likely resulting in even more and slower code.

you state Teacup can achieve 48,000 steps/sec on 8-bit. That implies a step timing of about 21us/step maximum.

That was measured when moving a single axis on physical 20 MHz hardware. So yes, it hints to 416 CPU clock cycles neccessary per step. Running in the simulator gives about 310 CPU clocks in this situation. Not counting the overhead of the interrupt its self (It's the distance between DEBUG_LED_ON() and DEBUG_LED_OFF()). Running more than one axis in the simulator makes the step interrupt going up to about 400 clocks. Summing all this up you end somewhere at 30'000 steps/s on 16 MHz hardware moving all axes.

A step interrupt has 3 parts using noticeable time: Bresenham (early in dda_step()), check wether the move is done (comparing four 32-bit integers to zero costs a surprising amount of cycles) and setTimer(), which sets a 32-bit timer. ATmegas have no 32-bit timers? Yes, that's right. That's why quite some code was neccessary to act as a 32-bit timer while actually being a 16-bit timer. See timer.c.

Traumflug commented 9 years ago

Today on IRC, #reprap-admin:

[00:06] <thejollygrimreap> #ifdef lcd [00:06] <thejollygrimreap> in mendel.c [00:08] <thejollygrimreap> LCD isn't defined anywhere as far as i can tell [00:08] <thejollygrimreap> yay!! [00:08] <thejollygrimreap> it works

:-)

Thanks, @NickE37.

NickE37 commented 9 years ago

The #define LCD is in config.ramps-v1.3.h (line 77) only since it is for the Discount Smart Controller. I haven't propagated through the various config files. Further down, the pins are defined.

jgrjgr commented 9 years ago

ah, that would be why i couldn't find it in the rumba one, :)

jgrjgr commented 9 years ago

@NickE37 have you got any plans for the menu system for the lcd ?

NickE37 commented 9 years ago

Eventually, a full menu system will need to be added. Getting Delta support to the point it is actually printing reliably is my first priority. I just threw the basics onto the LCD because it is an aid in manual calibration to not have to constantly do M114. I am really surprised noone has done the LCD and menu already.

Traumflug commented 9 years ago

Put months into getting an algorithm step-accurate and people will barely notice. Put an evening into whipping up some display code and people will get crazy. :-)

Back to topic, @ambrop7 talks about how he needs only a 26-bit square root, I assume for the delta algorithms: http://forums.reprap.org/read.php?147,33082,470934#msg-470934 He also shows C++11 code, which is hard to decipher for me.

NickE37 commented 9 years ago

I am interested in the approach that dc42 and ambrop7 have taken, but I do not think it will ever be feasible in Teacup without 32 bit hardware. It might not be necessary...

I have been doing a deep review of the mathematics of cartesian to delta and delta to cartesian coordinate space. Marlin and Repetier do segmentation and dc42 and ambrop7 calculate step-accurate timing because if you simply transform a cartesian move to delta coordinates and then move linearly (constant timing between steps from delta start to delta end) in delta space, you will have positional errors in cartesian space along the path. In other words, the transformation of cartesian to delta space is not linear.

It turns out the error function looks remarkably similar to a ballastics with drag trajectory. If I can match the parameters of the error function to known quantities, I should be able to write a "correction" function that is very similar to an acceleration function, completely removing the need for segmentation or per step calculations. Anyway, slogging through the math now.

Traumflug commented 9 years ago

I am interested in the approach that dc42 and ambrop7 have taken, but I do not think it will ever be feasible in Teacup without 32 bit hardware.

There's not much stopping you from running Teacup on 32-bit hardware. It's a matter of getting it built at all, because ARM lacks quite a number of the nice things AVR has, e.g. avr-libc and the -mmcu flag for gcc. As with many things, there's already some work available, notably the teensy3.1 and gen7-arm branches. teensy3.1 is considered to work, gen7-arm is more a proof of concept. I consider the approach taken on the gen7-arm branch to be the best to support a wide variety of ARM CPUs.

In other words, the transformation of cartesian to delta space is not linear.

That's true. Instead of doing these calculations for every step, they could be done in dda_clock(), saving some 99% of the neccessary work at the risk of missing the target by a step or two (which shouldn't happen). The code wrapped in ACCELERATION_TEMPORAL means to calculate step timings independently for each stepper pretty much the same way dc42 does. Not using this, one has to deal with the Bresenham counters.

NickE37 commented 9 years ago

The latest commit is a working version.

I posted here: http://forums.reprap.org/read.php?147,33082,471483#msg-471483 to announce the news.

The current segmentation strategy is to segment based on Cartesian distance rather than time. I am still studying dc42's and ambrop7's approaches to see if it is feasible or even really beneficial except in an academic way.

triffid commented 9 years ago

If you have fixed-distance segments, then max speed is limited by the length of the lookahead queue.

The queue must ramp to zero at the end, and with a limited number of short segments there will be some speed at which it takes the entire queue to decelerate. It is not possible to exceed this speed.

If instead you segment based on time, then segments get longer at higher speeds.

This will cause the nozzle to dip into the Z-plane during each segment, however Z-lift steps around that issue nicely and avoids print issues

NickE37 commented 9 years ago

I agree that the segmentation based on time approach is limited. It was put in place to make debugging easier for the time being. I am investigating other strategies that do not rely on segmentation at all.

NickE37 commented 9 years ago

I pushed out a commit that changes the segmentation strategy back to time based segmentation per triffid's remarks above.

Traumflug commented 9 years ago

Not that I exactly agree with triffid, but let's decide that later.

NickE37 commented 9 years ago

@Traumflug, I think that triffid is correct if one is actually stepping through the lookahead queue. I believe you said that Teacup is currently only looking at the previous move, so his comment may not apply.

I have been examining the lookahead functions. Your earlier comment (and comments in the code) said that one should step back through the queue calling dda_join_moves. How would one set up such a loop? What is the best index to use?

Would the following pseudocode work:

for (i=mb_tail; i < mb_head; i--){ dda_curr = &movebuffer[i]; dda_prev = &movebuffer[i-1]; dda_join_moves(dda_prev,dda_curr); if (exit_flags) break; }

I know the mb_head and mb_tail can wrap, but I am simplifying.

Thank you!

Traumflug commented 9 years ago

I think that triffid is correct if one is actually stepping through the lookahead queue.

Assuming an acceleration of just 1000 mm/s2 and 64 segments of 1 mm in the queue, one can reach sqrt(2 * 1000 * 32) = 252 mm/s. With a more often seen acceleration of 5000 mm/s2, this is 565 mm/s. Accordingly I consider this not to be a real limitation.

Another point is, if one can deal with the geometrical inaccuarcies of larger segments, what's the point of making them shorter at lower speeds?

Third point is, I value geometrical accuracy over maximum speed.

But as said, this can be decided later (after running tests) and avoiding segmentation at all is even much better, of course.

Would the following pseudocode work: [...]

Yes, I think so. In other firmwares you'll find statements that one needs to hop in both directions through the queue to allow maximum feedrates. I think one can get away with only a reverse pass if this pass is repeated for every single segment added to the queue.

NickE37 commented 9 years ago

I have added the ability to more easily switch between time based and distance based segmentation to facilitate testing.

Two thoughts:

One is always going to be segmenting with Cartesian to Delta transformation - even if the segmentation is so small it is at the step level. dc42 is segmenting to the 1us level, even if the semantics are different. Or, do I misunderstand this?

Acceleration and the joining of moves could be done as part of the segmentation process. The process currently:

Cartesian move arrives and is diverted from enqueue
Cartesian move is chopped into segments, each of which is enqueued
An enqueued move passes through the normal create move (Cartesian) process that "unjoins" the segments before acceleration (and lookahead joining and ramping) is added.

Do you have any thoughts about the best way to make this more efficient?

Pass enough previous dda info to the segmentation routine so that each segment can be pre-populated with acceleration/lookahead values?

Traumflug commented 9 years ago

I have added the ability to more easily switch between time based and distance based segmentation to facilitate testing.

Excellent idea!

dc42 is segmenting to the 1us level, even if the semantics are different.

I think he doesn't segment at all, the carthesian -> delta transformation is, just like acceleration calculation, part of his step delta time calculation so it's done for every single step of each axis. It's in DriveMovement::CalcNextStepTimeDelta() in DriveMovement.cpp. This "1 us" is about the accuracy he achieves with this calculation.

Pass enough previous dda info to the segmentation routine so that each segment can be pre-populated with acceleration/lookahead values?

The worst thing about this segmenting is, the knowledge that two adjectant segments can be moved full speed and need no acceleration/deceleration ramps in between is thrown away, just to re-figure that fact with pretty time consuming maths a moment later.

So yes, passing a flag to dda_create() that a segment is an intermediate segment would make calling dda_find_crossing_speed() unneccessary, because crossing speed == feedrate in all these cases. Regarding ac-/deceleration ramp calulation it's more complex, because ramps stretch across several segments.

I still believe recalculating movement direction continuously in dda_clock() instead of segmenting is the most efficient way to go. Right before or after acceleration calculation. One has the momentary position there and it should be possible to calculate the momentary movement direction from that.

Which of both is actually, not only believed to be, more efficient, can only be seen by implementing and measuring them, I fear.

Traumflug commented 9 years ago

P.S.: it might even be possible to do the segmenting after running dda_create(). Then you already have all the ramping numbers and can pass them to all the segments as appropriate.

dc42 commented 9 years ago

dc42 is segmenting to the 1us level, even if the semantics are

different.

I think he doesn't segment at all, the carthesian -> delta transformation is, just like acceleration calculation, part of his step delta time calculation so it's done for every single step of each axis. It's in DriveMovement::CalcNextStepTimeDelta() in DriveMovement.cpp. This "1 us" is about the accuracy he achieves with this calculation. <<

That's correct, I don't segment at all. The lookahead algorithm works on moves as received from the gcode stream. The step generation algorithm works like this. For each motor, from the distance per step and the delta kinematics, I calculate and store when the next step is due for that motor, taking account of whether the current linear move is accelerating, decelerating, or in a steady speed phase. Then an interrupt is scheduled for the time when the earliest step is due for any motor. Whenever a motor is stepped, the time when its next step is due is calculated and stored - and so the cycle repeats.

Doing these calculations for each step is of course computationally expensive, but not too expensive on a 32-bit processor. The time quantum I use is 0.38us because I run the interrupt timer at the processor clock rate divided by 256; but there are some rounding errors that accumulate during the steady state phase, due mainly to the use of integer maths. So the actual calculation accuracy is more like 1us, or sometimes a little worse towards the end of a long move when transitioning between steady state and deceleration.

David Crocker, Escher Technologies Ltd. http://www.eschertech.com Tel. +44 (0)20 8144 3265 or +44 (0)7977 211486

On 19/02/2015 21:45, Traumflug wrote:

I have added the ability to more easily switch between time based
and distance based segmentation to facilitate testing.
Excellent idea!
dc42 is segmenting to the 1us level, even if the semantics are
different.
I think he doesn't segment at all, the carthesian -> delta transformation is, just like acceleration calculation, part of his step delta time calculation so it's done for every single step of each axis. It's in DriveMovement::CalcNextStepTimeDelta() in DriveMovement.cpp. This "1 us" is about the accuracy he achieves with this calculation.
Pass enough previous dda info to the segmentation routine so that
each segment can be pre-populated with acceleration/lookahead values?
The worst thing about this segmenting is, the knowledge that two adjectant segments can be moved full speed and need no acceleration/deceleration ramps in between is thrown away, just to re-figure that fact with pretty time consuming maths a moment later.

So yes, passing a flag to dda_create() that a segment is an intermediate segment would make calling dda_find_crossing_speed() unneccessary, because crossing speed == feedrate in all these cases. Regarding ac-/deceleration ramp calulation it's more complex, because ramps stretch across several segments.

I still believe recalculating movement direction continuously in dda_clock() instead of segmenting is the most efficient way to go. Right before or after acceleration calculation. One has the momentary position there and it should be possible to calculate the momentary movement direction from that.

Which of both is actually, not only believed to be, more efficient, can only be seen by implementing and measuring them, I fear.

— Reply to this email directly or view it on GitHub https://github.com/Traumflug/Teacup_Firmware/issues/117#issuecomment-75143575.

NickE37 commented 9 years ago

I still believe recalculating movement direction continuously in dda_clock() instead of segmenting is the most efficient way to go. Right before or after acceleration calculation. One has the momentary position there and it should be possible to calculate the momentary movement direction from that.

I agree completely. It should be mathematically equivalent to 500 segments/sec (2ms) without the enqueue-ing and move creation overhead.

I am still trying to really understand the math for continuous calculation. Maybe I am way off base here, but the way I understand it: (Ignoring acceleration for now) Given a cartesian startpoint and endpoint and feedrate, there is a 3D vector that connect those points in cartesian space. That 3D cartesian vector can be partitioned into a number of steps and constant velocity for each of the three axes. Transformation to delta space results in a 2nd order polynomial path in cartesian space between startpoint and endpoint. When this is partitioned, it is a fixed number of steps, but non-constant velocity along the delta axes to obtain a straight path in cartesian space. The difference between the constant velocity in cartesian space and the non-constant velocity in delta space creates the error in position. Segmenting seeks to minimize the visibility of the error by resetting the step velocity of the delta axes.

There are two ways I can see to recalculate (if not doing it for every step, like dc42): (1) very,very computationally expensive: transform current delta position back to cartesian, apply a correction to actual, transform back to delta, adjust number of steps and velocity [totally unfeasible]

(2) very, very mathematically intensive, but computationally do-able: recast the cartesian to delta transformation equation to be based on initial conditions and fraction of move elapsed, then calculate the adjustment to delta axis step velocity. I am still refining this equation (fills several handwritten pages), but I can see it should refine to a calculation involving a single square root per dda_clock() (an dmany additions and multiplication operations that need to be done only once at dda_create() kinematic conversion. Acceleration/Lookahead done in dda_create() and dda_clock() should overlay onto this approach. I am working in this direction.

NickE37 commented 9 years ago

David, Thank you for helping me to refine my understanding. I appreciate it very much!

Traumflug commented 9 years ago

Nice graphs you have there. Just 7% error on such long moves is actually less than what I expected.

(2) sounds good.

Looking at dc42's code I currently see no reason to not just use his algorithm as-is. It calculates the time between two steps and that's pretty exactly what we need when recalculating not every step, but every 0.02 seconds. We'd simply keep this result until the next one is ready.

The only problem arising here is, this tends to accumulate geometrical errors (accumulating speed errors doesn't matter much, one would barely notice feedrate to be 10% off). Countermeasure would be to not calculate from expected position to target, but from actual position to target.

I'll have to check how the status of ACCELERATION_TEMPORAL currently is. It moves steppers independently just like dc42's. Written long time ago when there were no simulators and I could only inspect by listening to actual steppers wether an algorithm worked as expected.

dc42 commented 9 years ago

Looking at dc42's code I currently see no reason to not just use his algorithm as-is. It calculates the time between two steps and that's pretty exactly what we need when recalculating not every step, but every 0.02 seconds. We'd simply keep this result until the next one is ready. <<

Some linear moves require at least one of the carriages to move up for the first part of the move and then down again. If you don't calculate on every step, you may need to consider what happens when a reversal falls in the middle of your 20ms sample interval. Or you could segment moves to avoid reversals, which I considered doing but rejected.

David Crocker, Escher Technologies Ltd. http://www.eschertech.com Tel. +44 (0)20 8144 3265 or +44 (0)7977 211486

On 20/02/2015 00:33, Traumflug wrote:

Nice graphs you have there. Just 7% error on such long moves is actually less than what I expected.

(2) sounds good.

Looking at dc42's code I currently see no reason to not just use his algorithm as-is. It calculates the time between two steps and that's pretty exactly what we need when recalculating not every step, but every 0.02 seconds. We'd simply keep this result until the next one is ready.

The only problem arising here is, this tends to accumulate geometrical errors (accumulating speed errors doesn't matter much, one would barely notice feedrate to be 10% off). Countermeasure would be to not calculate from expected position to target, but from actual position to target.

I'll have to check how the status of ACCELERATION_TEMPORAL currently is. It moves steppers independently just like dc42's. Written long time ago when there were no simulators and I could only inspect by listening to actual steppers wether an algorithm worked as expected.

— Reply to this email directly or view it on GitHub https://github.com/Traumflug/Teacup_Firmware/issues/117#issuecomment-75168673.

NickE37 commented 9 years ago

I am getting confused here. Isn't the interval between dda_clock() 2ms rather than 20ms?

From timer.h

/// How often we overflow and update our clock.
/// With F_CPU = 16MHz, max is < 4.096ms (TICK_TIME = 65535).
#define TICK_TIME 2 MS

From timer.c

/// comparator B is the system clock, happens every TICK_TIME
ISR(TIMER1_COMPB_vect) {
    // set output compare register to the next clock tick
    OCR1B = (OCR1B + TICK_TIME) & 0xFFFF;

  clock_tick();
  dda_clock();
}

Traumflug commented 9 years ago

You're right, it's 2 milliseconds. Can be changed to 1 or 4 ms without further code modifications.

Regarding axes reversing: good point! Direction is given in dda_start() and isn't rewritten until the movement ends, so far.

Regarding this happening within the 2 ms interval: well, might happen unless a protection is written. Good thing is, speed of each axis changes only slowly, so the speed curve will be very flat and close to zero around the reversal point. Which means, chances are good this interval sends no or only very few step pulses.

I just tried ACCELERATION_TEMPORAL and lo and behold, it (still) works! Steps for each axis evenly distributed, even when they move at different speeds. You see the pulses for X and Y, the numbers are the velocities derived from these pulses. Slight (unavoidable) speed variation on the Y axis when two steps should happen at the same time, other than that close to perfect. Just no acceleration, no lookahead, no delta transformation.

acceleration_temporal

Getting this working requires two small patches, will push them in a few minutes to experimental.

NickE37 commented 9 years ago

Awesome on Temporal code!

I am refocusing my efforts towards no explicit segmentation, but towards adjusting step timing in dda_clock to keep the delta transformation on the cartesian path.

Given 100 steps/mm and 120mm/sec (cartesian)

equates to 24 steps per 2ms interval Delta:
maximum of sqrt(2)*24=33.9 steps per 2ms interval
a maximum velocity of 170mm/sec for that axis
a minimum step interval of 59.2us
maximum 17,000 steps per second (single axis)

This should be well within Teacup's capabilities on a Mega2560.

The equation that calculates the step time correction has been derived. I will begin testing it next.

Traumflug commented 9 years ago

I am in the process of setting up the simulator to try to see what is going on.

I just noticed SimulAVR chokes on a firmware compiled for an ATmega644P or 1284P, but works fine for an 644 without P. Just in case you didn't succeed because of errors. :-)

NickE37 commented 9 years ago

I have spent the last week attempting to implement delta movement using ACCELERATION_TEMPORAL and adjusting the axis step interval in dda_clock().
The process I am using in dda_clock works like this:

Assume that dda_clock will be called every 2ms
The fraction of the total cartesian move time is calculated from the move_state.last_time and the total cartesian move time calculated in dda_create
The optimum delta steps per axis are calculated based on the fraction of the cartesian move elapsed
The optimum delta steps per axis are compared to the actual delta steps per axis
A new step_interval for each axis is calculated that is the optimum step_interval plus a correction to bring the axis to the correct number of elapsed steps by the next call to dda_clock [For now, I am ignoring possible axis movement reversal]

The above works - mathematically, but fails in operation.

I am having a problem that I just cannot figure out how to solve:

The COMPB interrupt is executed every 2ms and calls dda_clock, except when the interrupt is turned off by the COMPA interrupt controlling the stepping. That makes the call to dda_clock somewhat irregular. To make matters worse, the COMPB interrupt gets turned back on after stepping is complete, but before the next step time and axis is calculated.
If the define STEP_INTERRUPT_INTERRUPTABLE is 0, then dda_clock is never called (or never completes a call) for faster feed rates, or is rarely called for modest feed rates (60 mm/sec), and irregularly called for slow feed rates (30 mm/sec). If the define STEP_INTERRUPT_INTERRUPTABLE is 1, then the move_state.last_time can be anything from 0 time elapsed since last dda_clock call to many multiples of 2ms worth of step time.
If the call to dda_clock is irregular, then it is impossible to calculate the correction to the step_interval to bring the actual to the optimum number of steps, since the correction factor has to be based on how long it will be applied:

[pseudocode] move_interval = move_state.last_time - prev_last_time opt_interval = move_state.last_time / opt_steps corrected_step_interval = move_interval / ((move_interval / opt_interval) + num_steps_off)

Maybe I have tunnel vision.

Some strategies I have tried:

Do not apply the correction, just the optimum interval [up to 0.5mm of error accumulate on a 50mm move]
Calculate to anticipate future optimum [suffers from same problem of irregular timing]

Any suggestions would be welcome. Thank you!

Traumflug commented 9 years ago

If the define STEP_INTERRUPT_INTERRUPTABLE is 0, then dda_clock is never called (or never completes a call) for faster feed rates, or is rarely called for modest feed rates (60 mm/sec), and irregularly called for slow feed rates (30 mm/sec). If the define STEP_INTERRUPT_INTERRUPTABLE is 1, then the move_state.last_time can be anything from 0

You certainly win a price as Teacup hero here, because so far nobody could clearly pin down what happens with STEP_INTERRUPT_INTERRUPTIBLE enabled. We used to guess, based on listening to stepper movements. Because of all this guessing I removed it last week, fixing it to 0.

Now you say dda_step() is never completed with it disabled. The only thing I can conclude here is, processing capabilities are already exhausted, because with ACCELERATION_RAMPING it works. Or did you simply confuse 0 with 1?

I didn't tell so far, but about a year back I tried to calculate acceleration purely on time. This failed miserably, deceleration ramp end and movement end often mismatched, with no clear pattern, often by some 20 steps. This code is in attic/dda_clock/ in case you want to have a look. Your observations pretty much explain on why this failed.

I think the solution should be to make the 2ms interrupt regular and reliable. Can't shake such a fix out of my sleeve, I have to investigate the data sheet for this.

Other than that, feel free to simply push code on a branch, or on several branches. Branches can be used like a dropbox. Perhaps me or others can see things you don't.

NickE37 commented 9 years ago

It is dda_clock() that would not complete with STEP_INTERRUPT_INTERRUPTIBLE enabled and high feed rates. dda_step() does complete. That would explain why acceleration calculations based on time would be irregular.

NickE37 commented 9 years ago

Additionally, the code from dda_step() is: #if STEP_INTERRUPT_INTERRUPTIBLE && ! defined ACCELERATION_RAMPING // Since we have sent steps to all the motors that will be stepping // and the rest of this function isn't so time critical, this interrupt // can now be interruptible by other interrupts. // The step interrupt is disabled before entering dda_step() to ensure // that we don't step again while computing the below. sei(); #endif

So, dda_clock() can still interrupt the last part of dda_step when ACCELERATION_RAMPING.

Traumflug commented 9 years ago

#if ... ! defined ACCELERATION_RAMPING
  sei();
#endif

So, with ACCELERATION_RAMPING the sei() doesn't happen and the global interrupt flag stays cleared, disabling all interrupts.

That said, your snippet is exactly the code which was removed, including the sei().

The step interrupt obviuosly needs high priority. What I'm currently investigating is, why this apparently stops the clock interrupt from firing. My hope is, the clock interrupt is remembered until interrupts are globally enabled again.

Traumflug commented 9 years ago

Perhaps you've seen it, there's a re-entry protection some 20 lines of code into dda_clock(). It's certainly a good idea to update the clock counter before this.

NickE37 commented 9 years ago

It's certainly a good idea to update the clock counter before this.

I am unclear what you mean by this. Which clock counter? Update how?

Traumflug commented 9 years ago

Which clock counter?

If you want to calculate on movement time you ineviteably have to keep track of how much time has passed since the movement started. Currently there is no such counter, but I assume you added one.

Update how?

current_movement_time += 2 MS;

NickE37 commented 9 years ago

For ACCELERATION_TEMPORAL, there are clock counts for each axis (move_state.time), plus one that tracks the clock count of the last axis to step (move_state.last_time - total clocks for the move so far). Does move_state.last_time not give how much time has elapsed since movement started?

Traumflug commented 9 years ago

Yes, you're right. I actually missed this when doing clock based acceleration (for ACCELERATION_RAMPING), thanks for showing it. Looking at it, I see a few caveats:

move_state.last_time isn't reset to zero at movement start. That might be intentional, to avoid a gap in step frequency across movements. It can be changed, of course.
It isn't very accurate, because it's added up by the step periods. This inaccuracy might be actually an advantage over keeping a clock solely in dda_clock().
It's updated when a step happens only, which is rarely at early acceleration time.

NickE37 commented 9 years ago

I pushed out the branch delta-temporal that has some prototype code at the end of dda_clock() to adjust delta movement. There is a minor change in the ACCELERATION_TEMPORAL section of dda_step() to allow for negative time possibilities for choosing the next axis to step. This can happen when step intervals get altered in dda_clock().

Traumflug commented 9 years ago

@NickE37: Perhaps you've seen it in issue #121, I rewrote step scheduling for ACCELERATION_TEMPORAL. It's on the interruptreliability branch, especially the last 3 commits.

To my tests dda_clock() works reliable now. I wonder what your testing method says about this.

NickE37 commented 9 years ago

Thank you!

I am in the process of incorporating your changes now. I will keep you posted.

Traumflug / Teacup_Firmware

Delta Printer Support #117