repetier / Repetier-Firmware

Firmware for Arduino based RepRap 3D printer.

Feature hint: PID autotune with Tyreus-Lyben for highpower heated beds #712

Open Nibbels opened 7 years ago

Nibbels commented 7 years ago

Hello there :)

We don't want to keep our findings a secret, which is why I am posting this here. Tweaking the PID control improved a lot in my firmware, but the heated beds were still not perfect. So we looked for tuning methods other than Classic PID / Ziegler-Nichols, No Overshoot, Some Overshoot and the Pessen Integral Rule.

NEW: -> Tyreus-Luyben

https://www.slideshare.net/AhmadTaan/pid-controller-tuning-49463158 (University of Jordan, Department of Mechatronics Engineering, 2014; slide dated June 16, 2015)

Tyreus-Luyben:

- An improvement for Ziegler-Nichols closed-loop to make the response less oscillatory
- More robust to an imprecise model
- Gives better disturbance response
- Procedure: same procedure as Ziegler-Nichols closed-loop

http://www.simcae.com/DRUPAL/files/TuningBrochureClosedLoopProcedure.pdf

Reason? We always had some leftover ripple in the temperature when controlling the heated bed. Using Tyreus-Luyben with an SSR-controlled 230 V heated bed, we got rid of that ripple. :) Pictures and discussion are here: http://www.rf1000.de/viewtopic.php?f=23&t=2054&start=50#p20832

-> https://github.com/Nibbels/Repetier-Firmware/blob/community_development/Repetier/Extruder.cpp#L1435

Well, different sources give different constants for the tuning. They seem to mix up the Kp from PID tuning with the Kp from PI tuning. But these values seem to work perfectly for our heated beds:

                        case 4: //Tyreus-Lyben
                            Kp = 0.4545f*Ku;      //1/2.2 KRkrit
                            Ki = Kp/Tu/2.2f;        //2.2 Tkrit
                            Kd = Kp*Tu/6.3f;      //1/6.3 Tkrit
                            Com::printFLN(Com::tAPIDTyreusLyben);

https://github.com/repetier/Repetier-Firmware/blob/development/src/ArduinoAVR/Repetier/Extruder.cpp#L2446

Greetings from Stuttgart


PS: I am more and more convinced that PID-I-Drive-Min should always be treated as a negative limit value ;) Compare https://github.com/Nibbels/Repetier-Firmware/blob/community_development/Repetier/Extruder.cpp#L372 (PID_CONTROL_DRIVE_MIN_LIMIT_FACTOR = -1, PID_CONTROL_DRIVE_MAX_LIMIT_FACTOR = 10) with https://github.com/repetier/Repetier-Firmware/blob/master/src/ArduinoAVR/Repetier/Extruder.cpp#L605

boelle commented 6 years ago

@Nibbels so you have a new PID tune algorithm for high power beds?

I think @repetier would like a PR of your work, and I'm sure that if it works and makes sense it will be included in V2 of the firmware.

repetier commented 6 years ago

Thanks. Have added it in dev to be included in 1.0.1.

PID_CONTROL_DRIVE_MIN_LIMIT_FACTOR = -1 PID_CONTROL_DRIVE_MAX_LIMIT_FACTOR = 10

The factor 10 is there because the I part is updated 10 times per second. pidDriveMin can be negative if you set it that way in the config, but I'm not sure why this would give stability. The idea is to use some knowledge about the required final I value to speed up convergence. Once everything has converged, P and D are more or less zero, so the output is the I part, and that is always positive. -1 just allows starting at 0. Why do you think it is a bad idea to start in the target range? All I can think of is that it might influence the first swing; after that it should be in the target range in any case and far from negative values.

BTW: Your code is missing a break in switch case, so you will actually use the no overshoot values and not Tyreus-Lyben PID!

Nibbels commented 6 years ago

Oh dear... what a dumb bug with that missing "break;". I will have to retest all of it.

For the logs: screenshot_11 screenshot_10

Without the new autotune set, using "Classic PID" (but with the patched negative I-border), I had this: screenshot_12

Now this: screenshot_14 screenshot_13

About the I-limit: whatever was meant by this:

    tempIStateLimitMax = (float)pidDriveMax * PID_CONTROL_DRIVE_MAX_LIMIT_FACTOR / pidIGain;
    tempIStateLimitMin = (float)pidDriveMin * PID_CONTROL_DRIVE_MIN_LIMIT_FACTOR / pidIGain; 

For me it was just a byte in the EEPROM that could be set from 0 to 255. It got scaled by 10 and divided by the active I-gain to produce some sort of border. The value capped by this border is meant to be the sum of all temperature errors over time that P and D alone do not eliminate. In my opinion, what is needed is that one of the borders lies somewhere in the negative range. How big it actually is should not really matter, as long as it is large enough in the negative direction.

This is a YouTube screenshot from the old firmware I had (the official firmware from the company that makes my printer): screenshot_15 The bed, and normally the extruder too, always sat at +1°C or +2°C. You see 62/60°C. (M116 sometimes made the printer wait forever because the temperature was outside +-1°C.)

Question: Did you ever see this constant overtemperature too? (You have probably seen more printers than I have.)

The moment I set PID_CONTROL_DRIVE_MIN_LIMIT_FACTOR to -10, my temperatures stuck to the real target temperature. (New autotune)

My next experiment was to shrink the limit to see when things start to go wrong. That happened when I set pidDriveMin to 0. When it was 1 (= 1 * -10 = -10) it worked. So I made the scale of this border ten times finer by changing PID_CONTROL_DRIVE_MIN_LIMIT_FACTOR to -1 instead of -10. On my printer the control became unstable when I brought the border to around 5 * -1 = -5. On my friend's printer this bad behaviour started at around -8. The closer I get to the level where it goes bad, the less overswing I have. So I left PID_CONTROL_DRIVE_MIN_LIMIT_FACTOR at -1 to get the most granular EEPROM control over the negative part of the border. -> I don't scale the time, I scale the byte in the EEPROM to -0 .. -255 instead of -0 .. -2550.

Btw: I needed PID_CONTROL_DRIVE_MAX_LIMIT_FACTOR to be 10, so that pidDriveMax*PID_CONTROL_DRIVE_MAX_LIMIT_FACTOR came out at around 800 to 1000 for it to work. Why? Heating something up is more dynamic than cooling it down.

I will check back here with pictures when I know how Tyreus Lyben really performs. shame on me

repetier commented 6 years ago

If I put the parts together:

        tempIStateLimitMax = (float)pidDriveMax * 10.0f / pidIGain;
        tempIStateLimitMin = (float)pidDriveMin * 10.0f / pidIGain; 

and later

                act->tempIState = constrain(act->tempIState + error, act->tempIStateLimitMin, act->tempIStateLimitMax);
                pidTerm += act->pidIGain * act->tempIState * 0.1; // 0.1 = 10Hz

You see that the 10 and the 0.1 cancel each other, and the same goes for pidIGain. So all the integrated I term does is add a PWM value to the total, and I hope you agree that when you are near the target temperature P and D go towards 0, so I must be the non-zero part. I assume you are talking about stable/unstable behaviour close to the target temperature here. So act->pidIGain * act->tempIState * 0.1 would be e.g. around 200. For a negative limit to have any effect you would need to subtract big values very often. So I do not really see why it should have any effect here. But maybe if you log the values over time it will make some sense or reveal an error.

Not hitting a target temperature is often a problem if cooling and heating have very different speeds, so you would essentially need 2 solutions depending on whether you are above or below the target temperature. Sometimes the temperature does not reach the target, or it will not drop below it. I have seen both happen, and that is what I think might be the problem. More often reaching it is the problem, especially if you are close to the heater's limit. With PID it normally works fairly well.

Nibbels commented 6 years ago

> For a negative limit to have some effect you need to subtract very often big values. So I do not really see why it should have any effect here.

If we have something like a balance between P and D, this balance can be shifted by some value, upwards or downwards. Because I sums up all the shift over time, it forces P and D to converge as a pair.

When I am near the targetTemperature, P (proportional) gets small and finally 0. When I am near the targetTemperature, D (differential) is as big as the change since the last measurement. (It can be small, or some other number depending on the measurement noise.) When I am near the targetTemperature, I (integral) gets small whenever the error over time is wiped out by a sum of plus and minus errors. Noise sums up to around 0, but keeps D working +- all the time.

The currentTemperature can be exactly the targetTemperature even if the integral sum of the errors is not zero. That is because the swing of the integral can be the opposite of the swing of P plus D.

A mathematical PID control should only need this code as Integral part:

    float error = act->targetTemperatureC - act->currentTemperatureC;
    ...
    act->tempIState = act->tempIState + error; //act->tempIState += error
    pidTerm += act->pidIGain * act->tempIState * 0.1; // 0.1 = 10Hz

It has no boundaries at all, but we are discussing the size of these boundaries! That is the whole point of this fix and of these posts. All three terms, P + I + D, should converge towards zero error.

If they do not converge and the currentTemperature diagram looks sinusoidal, the PID is not really stable. That can happen whenever the autotune did its job wrong or the hardware, masses or dead time change. The autotune should weight _PID_P, _PID_I and _PID_D (see: https://github.com/repetier/Repetier-Firmware/blob/01c142dc871a6953eba2714a15e235532d7d6e40/src/ArduinoAVR/Repetier/Configuration.h#L342) so that the PID is stable and robust against small changes.

The different autotune presets (Classic PID, Pessen Integral Rule, No Overshoot, Some Overshoot and Tyreus-Lyben) each weight _PID_P, _PID_I and _PID_D a bit differently, with two goals: 1) the smallest possible overswing when reaching the target temperature for the first time, and 2) reaching the target temperature very fast. Those two goals conflict. No overswing means slower and more stable. More overswing means faster and more instability/swing in general. I guess that more dead time, for example, "needs a bit more stability" to work well. The right autotune preset has its optimum somewhere between those two goals.

That is why this line is a hack that re-influences the PID after we have chosen good values according to our best autotune strategy: act->tempIState = constrain(act->tempIState + error, act->tempIStateLimitMin, act->tempIStateLimitMax);

Let's fill in numbers: act->tempIState = constrain(act->tempIState + error, 40, 255); -> The sum of errors is always between 40 and 255. The integral part will only work for temperatures below the target temperature.

act->tempIState = constrain(act->tempIState + error, 0, 255); -> The sum of errors is always between 0 and 255. The integral part will only work for temperatures below the target temperature, but it might vanish.

act->tempIState = constrain(act->tempIState + error, -255, 255); -> The sum of errors is free but has bounds. The PID can be steered upwards and downwards by the integral part.

Why might this hack be a good thing nevertheless? When I raise -255 towards zero (-20, -10, ...), I can see that when the targetTemperature drops, the temperature does not undershoot act->targetTemperatureC as much during the first big swing. When I lower 255 to some smaller number (120, 110, ...), I can see that when the targetTemperature rises, the temperature does not overshoot act->targetTemperatureC as much during the first swing.

When might this hack become a very bad thing? When I lower 255 too much, the temperature stays below the targetTemperature if P and D cannot compensate for the missing I. When I raise -255 too much (-3, -2, -1 .. 0), the temperature stays above the targetTemperature if P and D cannot compensate for the missing I. The leftover temperature offset seems to depend on the _PID_P and _PID_D parameters.
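The three limit pairs discussed above can be compared with a tiny standalone sketch. It only models the clamped error sum, not a heater; `runIntegrator` and `constrainf` are hypothetical helper names.

```cpp
#include <vector>

// Same clamping behaviour as Arduino's constrain() macro.
static float constrainf(float value, float low, float high) {
    return value < low ? low : (value > high ? high : value);
}

// Feeds a series of temperature errors (target minus current) through the
// clamped accumulator and returns the final integral state.
float runIntegrator(const std::vector<float>& errors, float low, float high,
                    float start) {
    float iState = start;
    for (float e : errors) {
        iState = constrainf(iState + e, low, high);
    }
    return iState;
}
```

Fed with 50 samples of -10 (the heater sits 10°C above target) and a start value of 100, the limits 40..255 leave the sum at 40, the limits 0..255 leave it at 0, and only -255..255 lets it go negative (to -255). With a non-negative lower limit the I term can never push the output down.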

Why is the pid-control working regardless of my criticism?

Because if the control is faulty, the autotune produces values that are not perfect but still work. But the control possibly cannot become as stable as it was supposed to be. So the generated P, I and D values for the configuration.h are not what they should be, because P and D have to compensate for the disabled I.

See here: Totally bogus values.

    /** \brief P-gain. */
    #define HT3_PID_P                           95
    /** \brief I-gain. */
    #define HT3_PID_I                           120
    /** \brief D-gain. */
    #define HT3_PID_D                           130

https://github.com/RF1000/Repetier-Firmware/blob/0e52693eaed1f606af411e4898030b033304b94f/RF2000/Repetier/RF2000.h#L180

And those are perfectly working values for the same hotend without I-boundaries:

    /** \brief P-gain. */
    #define HT3_PID_P                           12.5
    /** \brief I-gain. */
    #define HT3_PID_I                           3.2
    /** \brief D-gain. */
    #define HT3_PID_D                           18

https://github.com/RF1000community/Repetier-Firmware/blob/4b176392023c1b1ce510fe1d7877c5a95911dbc9/Repetier/RF2000.h#L168

When I do the scan with the corrected boundaries (+- something) on an E3D hotend instead of the Conrad Renkforce V2 hotend, I get these: P = 8.3972 I = 0.7352 D = 35.9660

Greetings

(And please correct me if I am wrong, which might be the case in some detail, but I am quite sure that what I wrote here is right.)

Nibbels commented 6 years ago

> Not hitting a target temperature is often a problem if cooling and heating have very different speeds, so you would essentially need 2 solutions depending on if you are above or below target temperature.

That is the reason why setting the upper boundary to something like 1200 and the lower boundary to something like -5 or -8 works best for my hotend. But this is something that needs an extra boundary autotune or a real geek.

Greetings

repetier commented 6 years ago

Ok, that is a good point that I did not imagine would happen. Looking back at your images, I see you are so overpowered that you have a high sensitivity to changes, and with big D values you can indeed easily sum I up into the negative range, and then the limit breaks the original PID. So it might even be a good thing to omit these extra boundary tests altogether, which would make it easier to adjust. I know you mentioned the undershoot getting less severe, but that would be how PID originally works, and it would quickly start to stabilize.

In the V2 firmware a heat manager will be a separate class, making it a bit easier to do such experiments once I have them modularized. Then I will do some tests in that regard.

Nibbels commented 6 years ago

For the simple reason that you have other things to do than wait for PID autotunes, I did some tests today.

The most valuable finding is that we really need those boundaries. That does not seem to depend on the type of autotune strategy.

I flashed a mod of my firmware fork with the boundaries removed: act->tempIState += error. After a successful flash I let autotune run on both the heated bed and one E3Dv6 and got these results:

Bed: Classic Ziegler-Nichols PID P = 69.039 I = 7.913 D = 150.582

T0: Classic Ziegler-Nichols PID P = 6.6415 I = 0.4521 D = 24.3908

Bed: Pessen Integral Rule PID P = 80.545 I = 11.043 D = 220.311

T0: Pessen Integral Rule PID P = 8.0130 I = 0.6857 D = 35.1127

Bed: Some-Overshoot PID P = 32.547 I = 3.658 D = 193.035

T0: Some-Overshoot PID P = 3.7776 I = 0.2581 D = 36.8565

Bed: No-Overshoot PID P = 27.615 I = 3.338 D = 152.326

T0: No-Overshoot PID P = 2.2486 I = 0.1559 D = 21.6154

Bed: Tyreus-Lyben PID P = 44.825 I = 1.128 D = 128.520

T0: Tyreus-Lyben PID P = 5.2027 I = 0.0815 D = 23.9582

The problem I encountered in all my heat-up and cool-down tests was that the error sum was not eaten up fast enough. It got too big, as expected. That means that whoever built those boundaries was not wrong at all. He just did not see that we need a negative boundary too, in my opinion.

Tyreus Lyben without boundaries: t0 bed tyreus lyben t0 bed tyreus lyben 2 t0 bed tyreus lyben 3 t0 bed tyreus lyben 4

Tyreus Lyben with (nicely chosen) boundaries (reflashed firmware, no change in EEPROM): t0 bed tyreus lyben 1

My conclusion is that we really have to build in a limit on the error sum to make good use of PID control in a printer where short over- or undertemperature might matter. In all tests the temperature had no offset at all after a while (+-0.1°C .. +-0.3°C), but it took too long to get there. The reason might be that the PID control has no scope in which it works. We do have this scope: "+-5°C"

Ok, here are more pictures, which I would otherwise just delete now, but they could help someone who wants to look deeper. All the pictures below are without boundaries! The autotune was also run with no boundaries in the control algorithm (if that matters).

Bed: Classic Ziegler-Nichols PID P = 69.039 I = 7.913 D = 150.582 bed classic ziegler-nichols pid bed classic ziegler-nichols pid 2

T0: Classic Ziegler-Nichols PID P = 6.6415 I = 0.4521 D = 24.3908 t0 classic ziegler-nichols pid t0 classic ziegler-nichols pid 2

Bed: Pessen Integral Rule PID P = 80.545 I = 11.043 D = 220.311 bed pessen integral rule bed pessen integral rule 2 bed pessen integral rule 3

T0: Pessen Integral Rule PID P = 8.0130 I = 0.6857 D = 35.1127 t0 pessen integral rule t0 pessen integral rule 2 t0 pessen integral rule 3

Bed: Some-Overshoot PID P = 32.547 I = 3.658 D = 193.035 bed some overshoot bed some overshoot 2 bed some overshoot 3

T0: Some-Overshoot PID P = 3.7776 I = 0.2581 D = 36.8565 t0 some overshoot t0 some overshoot 2 t0 some overshoot 3

Bed: No-Overshoot PID P = 27.615 I = 3.338 D = 152.326

T0: No-Overshoot PID P = 2.2486 I = 0.1559 D = 21.6154 t0 no overshoot 1 t0 no overshoot 2

Bed: Tyreus-Lyben PID P = 44.825 I = 1.128 D = 128.520 T0: Tyreus-Lyben PID P = 5.2027 I = 0.0815 D = 23.9582 t0 bed tyreus lyben t0 bed tyreus lyben 2 t0 bed tyreus lyben 3 t0 bed tyreus lyben 4

Excel shows:

screenshot_1

Even though I would love to have a "boundary autotune" alongside the PID autotune, I am quite happy with the boundaries I found (~1200 vs. -8).

Greetings

repetier commented 6 years ago

Actually, adding the boundaries was my idea; I mean I came up with it without knowing whether anyone else uses something similar. What your pics show, and what the idea was, is that when you start at a low temperature, I sums up big errors, and once you are at the target temperature it is a big value, much too big, which needs to be reduced over time while overshooting. If cooling is then too slow you get the same effect. So the idea was to limit the I integral to a range that we know we should be in. I mean, adding 1000 to I where the output tops out at 255 really does not make much sense; it only takes longer to get back into the usable range.

What you have now shown is that it is not good to limit it to the target range; it also needs some room to breathe, meaning room to compensate the P and D terms so that the sum stays within a reasonable range.

So your optimized boundaries have the effect of reducing the possible error to a range that is big enough to breathe and thus helps convergence.

I think with some more knowledge it should be possible to combine dead time control for the first swing-in with PID. Dead time control normally has no overshoot in the calibrated range, but it is not as flexible to adjust as PID, so it swings depending on how well the timing matches. But if we start with PID after e.g. 20 seconds of dead time and set I from a starting curve "temperature - I required", we should get a very good start and not much swinging at all. On a temperature change we again do dead time first and then switch. Just an idea to add some knowledge to optimize stability.

Nibbels commented 6 years ago

Ok,

But please don't forget that it actually works :) And it works great, if you know how. Except for this little detail that I think I-drive-min should always have a minus.

I only started complaining about and researching the PID control because my firmware was from 2014. It had i-drive-max=40 plus i-drive-min=40 because of some predefined typo (and those spikes at fast print speeds because of this HAL::allowInterrupt overflow), and this unsteadiness in the sensortype-8 temperature table, which made autotune at 230°C with sensor type 8 a total waste of time.

This is near perfect control: screenshot_1 screenshot_2 screenshot_4

I guess, if I wanted a boundary autotune, I would have to set the boundaries to something like +-max (or the biggest values that get produced without a boundary). Then swing up from below https://github.com/repetier/Repetier-Firmware/blob/01c142dc871a6953eba2714a15e235532d7d6e40/src/ArduinoAVR/Repetier/Configuration.h#L572 PID_CONTROL_RANGE 20 and decrement drive-max over multiple intervals until the target temperature is no longer reached. The same for swinging under with drive-min.

Maybe I will test this some day. Or maybe someone else knows some magic e^iX control-theory formula for that.
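The search loop described here could look roughly like the following. This is only a sketch of the idea, which has not been tested in this thread: `searchUpperLimit` is a hypothetical function, and the `reachesTarget` callback stands in for a complete heat-up experiment on the printer with the given limit.

```cpp
#include <functional>

// Walks the upper limit down from `start` in steps of `step` and returns the
// smallest tested limit for which the target temperature was still reached.
// Returns -1 if `step` is not positive or even the start value fails.
// `reachesTarget` is a hypothetical callback that would run one heat-up test.
int searchUpperLimit(int start, int step, int floor,
                     const std::function<bool(int)>& reachesTarget) {
    if (step <= 0) {
        return -1;
    }
    int lastGood = -1;
    for (int limit = start; limit >= floor; limit -= step) {
        if (!reachesTarget(limit)) {
            break;  // the previous limit was the tightest one that still worked
        }
        lastGood = limit;
    }
    return lastGood;
}
```

The same loop, mirrored for the negative side, would cover drive-min. In practice some safety margin above the value it returns is probably needed, since the last working limit sits right at the edge of where the target is still reached.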

Greetings

PS: I set tempIState to zero whenever I leave the control range. I don't remember whether this really helps; I got the idea from somewhere here: https://github.com/br3ttb/Arduino-PID-Library -> http://brettbeauregard.com/blog/2011/04/improving-the-beginners-pid-introduction/

repetier commented 6 years ago

Setting IState to 0 outside the control range is more or less what I meant. If you did not do that and the control range were 200°C, it would sum up to some very big value by the time you reach the target temperature, and it would take forever to undo that accumulated error, especially without a boundary. So resetting to 0 when coming from below is good, and when coming from above it is not that bad. With some pretune we might find better start values and can later switch with the dead time approach. But that will be some fun for later.

Nibbels commented 6 years ago

To be totally clear: I have already been zeroing tempIState for some months. All the tests were made with that. https://github.com/Nibbels/Repetier-Firmware/blob/825a108f7d1976517dafb6082408b68f7472692c/Repetier/Extruder.cpp#L133 [ff]

            if( act->targetTemperatureC < 20.0f )
            {
                output = 0; // off is off, even if damping term wants a heat peak!
                act->tempIState = 0;
            } 
            else if( error > PID_CONTROL_RANGE )
            {
                output = act->pidMax;
                act->tempIState = 0;
            }
            else if( error < -PID_CONTROL_RANGE )
            {
                output = 0;
                act->tempIState = 0;
            }
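For experimenting outside the firmware, the same three branches can be wrapped in a standalone function. This is a minimal sketch with hypothetical names (`HeaterState`, `handleOutsideControlRange`, `kPidControlRange`); the value 20 for the control range is taken from the PID_CONTROL_RANGE mentioned earlier in this thread.

```cpp
// Hypothetical stand-in for the firmware's heater state.
struct HeaterState {
    float targetTemperatureC;
    float currentTemperatureC;
    float tempIState;
    int pidMax;
};

constexpr float kPidControlRange = 20.0f;  // assumed value of PID_CONTROL_RANGE

// Returns true if the heater is switched off or outside the PID band. In that
// case `output` is forced to full power or zero and the integral state is
// cleared. Returns false (leaving both untouched) if PID should run.
bool handleOutsideControlRange(HeaterState& h, int& output) {
    const float error = h.targetTemperatureC - h.currentTemperatureC;
    if (h.targetTemperatureC < 20.0f) {
        output = 0;          // heater is off
    } else if (error > kPidControlRange) {
        output = h.pidMax;   // far below target: full power
    } else if (error < -kPidControlRange) {
        output = 0;          // far above target: no power
    } else {
        return false;        // inside the band: let the PID decide
    }
    h.tempIState = 0.0f;     // no windup is carried into the PID band
    return true;
}
```

Because the integral state is cleared on every call outside the band, the PID always enters the band with an error sum of zero.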

so far :) If anyone continues this thread I will see it.