Ralim / IronOS

Open Source Soldering Iron firmware
https://ralim.github.io/IronOS/
GNU General Public License v3.0

Improved temperature estimation #360

Open dhiltonp opened 6 years ago

dhiltonp commented 6 years ago

We now have a record of watts put into the system.

Using a history of watt output and temperature, we should be able to very accurately report the raw temperature without lag...

dhiltonp commented 6 years ago

The effective tip temperature is always lower than the sensor.

The more power going into the tip, the worse the sag. We should be able to counter this effect...

Ralim commented 6 years ago

We should be able to.

Also helpful to know that the PID code is always effectively running 1 sample delayed as well. (But in lock step).

At the end of a reading it starts the PID code, which runs and sets a new output, then at the end of the next PWM period, the new value is copied into the output PWM and the PID is triggered again. This results in the PID code running while the PWM is doing the output for the last calculated value. So there is a known constant delay in the control loop as well.
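The lock-step timing described above can be illustrated with a toy loop. This is only a sketch of the idea, not the firmware's code: `run_loop` and the proportional `controller` are hypothetical names, and the readings are made-up numbers.

```python
def run_loop(measurements, controller):
    """Return the output that the PWM plays during each period.

    The value computed from one reading is only latched into the PWM at the
    end of the period, so it drives the heater one period later.
    """
    applied = []
    pending = 0  # output computed but not yet latched into the PWM
    for reading in measurements:
        applied.append(pending)        # PWM plays the previously computed value
        pending = controller(reading)  # controller runs while the PWM is busy
    return applied


# Hypothetical proportional controller toward a 300 C setpoint.
outputs = run_loop([250, 260, 270], lambda temp: 300 - temp)
```

Here the output for the first reading (50) is only applied during the second period, which is the constant one-sample delay.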

dhiltonp commented 6 years ago

I'm seeing some oscillation in the PID controller. I think it's related to the PWM change from 100 to 256: our temp sample can be further from the power output than before, so the energy has more time to dissipate towards the tip from the base.

Adjusting the tip temp based on power and PWM should help.

This is basically adding a feed-forward aspect to the controller.

Ralim commented 6 years ago

Temperature latency should not have changed dramatically (I multiplied the counting speed by 2.55x to compensate for more counts).

But I 100% agree that a feed forward is probably a really good idea.

I could look into increasing the PWM frequency to improve the PID update rate if you wanted?

dhiltonp commented 6 years ago

Huh. That's interesting.

I had to retune the PID controller afterwards - I increased the sample history by 50% and had to change the P damping (mass divisor) from 4 to ~20, and it's still not as good as before.

This temp estimation should help with that, but now I'm confused - I'm not sure what else changed. I'll have to double-check the current version to make sure I didn't mess up my math during the rewrite.

dhiltonp commented 6 years ago

Let's hold off on changing the PWM frequency until we see the feed-forward results.

Ralim commented 6 years ago

Yeah, just let me know. I'll try and have a look at timing on my units when I get a break, to check that I didn't mess up any maths.

dhiltonp commented 6 years ago

It was probably the removal of any temp filtering that did it. I turned it way down and didn't notice anything so I took it all out.

I'll verify this tonight.

dhiltonp commented 6 years ago

Shoot, it's not the removal of filtering.

dhiltonp commented 6 years ago

Ok. I think it may be entirely on my end.

The build from here was quite solid and stabilized at a given temp, as I recall. https://github.com/Ralim/ts100/issues/275#issuecomment-420197231

The current build seems to exhibit the same behavior.

It could easily be my tip (see: #395).

I'm going to have to order a new one.

In the meantime, could you verify that the current version performs well for you? My tip goes 10C higher than requested, then it drops in temp, without ever stabilizing.

I could alter the algorithm to work with my tip (and maybe I should make it more robust to bad tips), but I'd like some external data.

dhiltonp commented 6 years ago

Re-reading, I failed to clarify - the old firmware worked quite well originally.

Now, both the old and new versions have the same pulsing behavior for me. I believe my tip (and my abuse of it in testing thermal performance) is to blame.

Ralim commented 6 years ago

Ah okay, yeah, that does sound more like a tip issue then. There are also cheap compatible Hakko tips, which I have used as sacrificial testers before.

dhiltonp commented 5 years ago

I tried to get temperature data using soldering iron thermocouples and my Fluke 287, but I'm seeing a lag of about 5 seconds due to some combination of time for heat transfer and averaging in the multimeter.

Is your FG100 much more responsive or is this representative?

dhiltonp commented 5 years ago

It looks like the lag is not in the thermocouple or display, but is due to heat conduction:

I set the tip to D24, but didn't calibrate the tip temp in software.

Calibration (63/37 solder, 183C ~= 160C on display): https://www.youtube.com/watch?v=cmNolzz65N0

Rocketing past 183C at full temp - 7s delay on full power: https://www.youtube.com/watch?v=32vVcMoATaM

Targeting 180C on iron (~200C actual) - 6s delay, despite power throttling back: https://www.youtube.com/watch?v=-7b8_GrFg2M

I'm guessing the solder melted first when targeting 180C because the solder was slightly closer to the heating element.


It takes 6-7 seconds after the iron says a temperature is hit for the tip to be that temp. With this in mind, I think that the thermocouple setup is just fine. I wonder if it'll take that same 6-7 seconds for the sensor to know that tip temp is dropping due to added thermal mass. I'll try to determine actual responsiveness/reaction time to changes in mass tonight.

dhiltonp commented 5 years ago

I'd like to play with adjusting the temp compensation. Instead of saying "at xxx raw temp, the corrected temp is xxx", I'm thinking of saying "At xxx power, the temp will be under by xxx".

I think this may compensate both for tip types as well as when the iron is in use.


In this test (at 54s), when the thermocouple is not touching the heat sink, the power output to maintain the iron at 300C is ~4W. The external thermocouple is in the same ballpark.

But around 40s, the power output to maintain 300C is 17W - and that's just maintaining the tip temp at the internal thermocouple. The external thermocouple temp is much lower - around 200C.
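As a rough illustration of "at xxx power, the temp will be under by xxx", the two operating points above can be turned into a linear sag estimate. This is a sketch under strong assumptions: the sag is taken to be linear in the extra power, only these two data points are used, and the idle power of about 4W applies to a 300C setpoint only. The names are hypothetical.

```python
IDLE_POWER_W = 4.0      # power needed to hold 300 C in free air (from the test)
LOADED_POWER_W = 17.0   # power needed to hold 300 C against the heat sink
LOADED_SAG_C = 100.0    # sensor read ~300 C while the tip end was ~200 C

# Assumed linear relationship between extra power and temperature sag.
SAG_C_PER_W = LOADED_SAG_C / (LOADED_POWER_W - IDLE_POWER_W)


def estimate_tip_temp(sensor_temp_c, power_w):
    """Estimate tip-end temperature from sensor temperature and output power."""
    extra_power_w = max(0.0, power_w - IDLE_POWER_W)
    return sensor_temp_c - SAG_C_PER_W * extra_power_w
```

With these numbers the sag works out to roughly 7.7C per extra watt; a real compensation would need more measurements across setpoints and tip types.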


lookupTipDefaultCalValue has linear compensation based on the data provided by @Repled. What value should I use for 0 compensation?

Ralim commented 5 years ago

lookupTipDefaultCalValue is the gain of a y=mx+b line of best fit (i.e. the m). b is the offset, which is calibrated per unit as it is the ADC offset.

So I'm not sure what you mean by 'no compensation'.

Normally: tip_Temp = HandleTemp_degC + ((raw_tip - ADC_Offset) * gain)
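Written out as a function, the conversion looks like the sketch below. The function name and the example numbers are illustrative only; the real gain comes from lookupTipDefaultCalValue and the offset from per-unit calibration.

```python
def tip_temp_c(raw_tip, adc_offset, gain, handle_temp_c):
    """Tip temperature from the raw thermocouple ADC reading.

    gain is the slope m of the best-fit line and adc_offset is the per-unit
    offset b. A thermocouple measures the difference from its cold junction,
    so the handle temperature is added back.
    """
    return handle_temp_c + (raw_tip - adc_offset) * gain


# Made-up numbers: 1000 counts above the offset at 0.25 C per count.
example_c = tip_temp_c(raw_tip=1100, adc_offset=100, gain=0.25, handle_temp_c=25.0)
```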

FG100 is more responsive but not massively sadly. Best option is to have the thermocouple welded to the tip which is hard to do. I did do some testing by basically getting the thermocouple somewhat soldered on and then testing at 0-150C for first control loop trials

dhiltonp commented 5 years ago

I believe there used to be an m based on how the thermocouple should react, not based on real-world testing - I'm not sure how to calculate that is all.

Ralim commented 5 years ago

Thermocouples are mostly linear in the range that we are using them in. There used to be one back in firmware 1.x, but in 2.x I went for measured values. The m that is there should just be modelling the thermocouple gain, which is not compensated for by tip type per se; rather, different tips have different junction styles (how they terminated the wiring), which can lead to different responses.

dhiltonp commented 5 years ago

Huh, ok.

I'll keep that in mind, for now I'll try to estimate tip temp given power history.

hattesen commented 5 years ago

With reference to the discussion about automatic PID tuning in #444 I would like to make some suggestions that would be helpful in solving this issue.

I would ideally like to be able to model the thermal aspects of the interactions of the components in the energy transfer chain (voltage -> power -> heating-element-temp -> thermal-conductivity -> tip-metal-center-temp -> tip-thermal-conductivity -> tip-metal-end-temp), as that would allow me to be able to perform simulations, using various compensation techniques, variables (input voltage, thermal mass of tip etc) and apply tolerances.

Do you have any "real world" measurement data, from which to derive the model? Ideally, as raw measurements of time->temp of a heat-up cycle:

  1. Apply full power to heating element (100% PWM duty cycle)
  2. When the tip temperature reaches a "typical" target temp: switch off power to heating element (0% PWM duty cycle)
  3. Until temperature approaches (half way) ambient temperature
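One reason the half-way point of the cool-down is useful: if the cooling is approximated as a single exponential (a first-order model, which is an assumption and may be too simple for a tip with several thermal masses), the time to get half way to ambient gives the time constant directly, tau = t_half / ln 2. A minimal sketch with hypothetical names:

```python
import math


def time_constant_from_half_time(t_half_s):
    """Time constant of a first-order decay from its half-way time."""
    return t_half_s / math.log(2)


def cooling_temp(t_s, start_c, ambient_c, tau_s):
    """Temperature after t_s seconds of first-order cooling toward ambient."""
    return ambient_c + (start_c - ambient_c) * math.exp(-t_s / tau_s)
```

For example, a tip that takes 20 s to fall from 300C to 160C in a 20C room would have a time constant of about 29 s under this model.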

It would be great to have both thermal-element temperature (derived from the electric resistance of the element via amplifier and ADC), as well as the actual (near tip-end) temperature, which could be obtained using a simple Type-K thermocouple, but just having temperature data from the internal measurement (heating element) of the above cycle, would be of great help, as I can see that the time lag (around 5 seconds) is documented by observation.

Along with the above data, information about the tip (type/name and its mass) as well as input voltage would be required. Additional data to allow modeling changes to tips and input voltage: The mass of the currently available tips. And finally, to save me from reverse engineering the relationship between heating element temperature and its electrical resistance, it would be good to be able to calculate the voltage -> power function at a given temperature.

Do any of you have the above mentioned raw data available, or would you be able to (re)produce it?

I do have some practical experience in solving similar situations, although mostly they are the other way around (i.e. the measured temperature lagging behind the actual temperature). In most cases, I have been able to derive an algorithm that was able to achieve a fairly close match to the actual temperature, at rising, stable and falling temperatures. The algorithm has been based on the measured temperature, as well as knowledge of historic applied power and the known thermal conductivity and thermal mass of the thermal system.

In this case, we would need to know the power that has been applied for a period of at least 5 seconds (time lag), to be able to estimate how much heat energy has been deposited, and is "on its way" to the tip. One suitable recording of past power would be a record of added thermal energy (joules = time_seconds * voltage^2 / ohms), measured at half second intervals, and recorded in a circular buffer holding at least 10 elements (matching the observed 5 second time-lag). It may sound too complicated, but once the mathematical model is in place, simulations can be made, and a fairly simple algorithm can be derived, that provides an accurate estimate of the current tip temperature that we need to be able to provide a responsive, yet stable, PID controller that provides a predictable and stable soldering tip temperature.
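The recording described above could look something like the following sketch. The class and method names are hypothetical, the duty cycle is assumed to scale the on-time within each half-second interval, and the 8 ohm element in the usage note is only a placeholder value.

```python
from collections import deque


class EnergyHistory:
    """Circular buffer of heat energy (joules) added per sampling interval."""

    def __init__(self, interval_s=0.5, slots=10):
        self.interval_s = interval_s
        self.samples = deque(maxlen=slots)  # oldest entry drops off automatically

    def record(self, duty, supply_v, element_ohms):
        """Store the energy delivered during one interval at the given duty (0..1)."""
        on_time_s = duty * self.interval_s
        self.samples.append(on_time_s * supply_v ** 2 / element_ohms)

    def energy_in_window(self):
        """Total energy delivered over the buffered window (5 s by default)."""
        return sum(self.samples)
```

At 24 V into an assumed 8 ohm element at full duty, each half-second slot holds 36 J, so a full 10-slot window reports 360 J that is still "on its way" to the tip.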

A side benefit of this work would be to be able to report an accurate tip temperature to the user, rather than report an internal element temperature that is an indication of where the tip temperature will be in 5 seconds or so.

dhiltonp commented 5 years ago

I don't have any real world data other than those videos (and a few others that aren't accessible). Extracting data from the iron is difficult. Instead of outputting data electronically, data could be displayed and recorded at 120 or 60fps on camera and either manually extracted or perhaps done via OCR.

The iron does maintain milliwatt output (3.5s) and temperature error history (.5s), along with rolling averages. The PID loop updates at 32 Hz.

PID loop: https://github.com/Ralim/ts100/blob/master/workspace/TS100/src/main.cpp#L928
Power: https://github.com/Ralim/ts100/blob/master/workspace/TS100/src/power.cpp
Temp: https://github.com/Ralim/ts100/blob/master/workspace/TS100/src/hardware.c
History: https://github.com/Ralim/ts100/blob/master/workspace/TS100/inc/history.hpp

hattesen commented 5 years ago

@dhiltonp, I'll see what I can get out of the data in the videos. Thanks for the details on the PID setup.

One well-documented approach to solving time delays is the integration of a "Smith Predictor" in the feedback loop. This approach is described in "PID Tuning for Time-Varying Delay Systems Based on Modified Smith Predictor".
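For readers unfamiliar with the idea, here is a generic Smith predictor sketch on a made-up plant. Everything in it is an assumption for illustration: the first-order plant coefficients, the 10-step dead time, the PI gains, and the perfect match between plant and internal model. It is not a model of the iron.

```python
from collections import deque


def simulate_smith_predictor(setpoint, steps, delay_steps=10,
                             a=0.9, b=0.1, kp=2.0, ki=0.2):
    """Simulate a PI loop with a Smith predictor on a delayed first-order plant.

    Plant (hypothetical): y[k+1] = a*y[k] + b*u[k - delay_steps].
    The predictor runs the same model with and without the delay and feeds
    back y + (model_fast - model_delayed), so with a perfect model the PI
    controller effectively sees the delay-free response.
    """
    y = 0.0                 # real (delayed) plant output
    model_fast = 0.0        # internal model without the delay
    model_delayed = 0.0     # internal model with the delay
    plant_pipe = deque([0.0] * delay_steps)  # inputs still in transit
    model_pipe = deque([0.0] * delay_steps)
    integral = 0.0
    history = []
    for _ in range(steps):
        feedback = y + (model_fast - model_delayed)
        error = setpoint - feedback
        integral += error
        u = kp * error + ki * integral
        # Advance the plant and both models by one step.
        plant_pipe.append(u)
        model_pipe.append(u)
        y = a * y + b * plant_pipe.popleft()
        model_delayed = a * model_delayed + b * model_pipe.popleft()
        model_fast = a * model_fast + b * u
        history.append(y)
    return history
```

The output stays at zero for the first `delay_steps` samples and then follows the same trajectory the delay-free loop would have produced, shifted in time. With an imperfect model the benefit shrinks, which is why identifying the plant matters.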

Are you able to provide some of the following data, which would assist me in creating a quick model of the heat transfer:

  1. Average applied power, when a steady state is reached (typical use tip temperature). This would be computed as PWM-duty-cycle * supply-voltage^2 / heating-element-resistance - even a rough estimate would be great.
  2. Mass (weight) of the tips - Individual or as a range.

Those, along with the videos would get me started.

dhiltonp commented 5 years ago

Sustaining a temp around 320C takes 4.5 watts.

I don't have the tip mass available.

hattesen commented 5 years ago

I have had a quick look at the code currently controlling the (heating element) temperature. It looks to me like a pure P(roportional) control algorithm with an added compensation using an average of recent applied power/energy. As far as I can see, there are no I(ntegral) or D(erivative) parts of the control, which surprises me. Although I am unable to fully comprehend the effect of the "average recent power" compensation, I cannot see how this algorithm would ever be able to provide temperature control that is both stable and responsive at the same time, especially when subjected to varying conditions, such as tip (thermal) mass, maximum power (voltage) and a wide range of target temperatures. To guarantee stability (no oscillation), the P(roportional) gain will have to be set so conservatively low that the responsiveness is very low, resulting in slow reaction to disturbances (like cooling of the tip when soldering) as well as a relatively slow approach when nearing the target temperature when heating up.

Adding an I(ntegral) control (applied power compensation) will ensure that the measured temperature will not have a (constant) offset relative to the target temperature.

Adding a D(erivative) control (applied power compensation) will reduce/eliminate oscillation (typically when heating up), allowing the P(roportional) gain to be increased, thereby achieving a more responsive (faster reacting) controller.
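For reference, a textbook positional PID looks roughly like the sketch below. It is a generic illustration with hypothetical names and no anti-wind-up or derivative filtering, not a proposal for the final firmware code.

```python
class PID:
    """Textbook positional PID controller (no anti-wind-up, no filtering)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt   # I: removes steady-state offset
        if self.prev_error is None:
            derivative = 0.0               # no history on the first call
        else:
            derivative = (error - self.prev_error) / self.dt  # D: damps fast changes
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```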

Without a full PID implementation to start with, we really have no knobs to twiddle (parameters to adjust), and while we may still achieve a decent control, we would have to reinvent the wheel, when trying to cope with the consequences of known variations in the heat transfer system, and probably end up having implemented something similar to a PID (proportional–integral–derivative) controller, but without all the benefits of using an industry standard algorithm that has been controlling processes for nearly 100 years.

I would really like to assist in achieving a solid, future proof, control algorithm, that:

As I have stated earlier, I believe that the most efficient way of achieving the above goals is to use a standard PID control algorithm, possibly augmented to compensate for the relatively slow (approx 5 sec) heat transfer from heating element to tip-end. To implement and tune the PID controller achieving the above requirements, without having a full test setup with external monitoring of all internal process variables as well as externally measured tip-end temperature, we really need to model the system being controlled, allowing simulations with near instant feedback.

I propose this plan forward:

  1. Create a feature branch, to allow experimenting with algorithms, features and testing without disturbing the main development path.
  2. Set up a simulation model, using the currently known variables and parameters, and approximate the current system behavior. This can be implemented using a spreadsheet, or tools like MATLAB/Maple. I prefer to use widely available, preferably free, tools to allow future reuse by anyone.
  3. Perform model simulations that will provide one or more algorithmic options as well as documented behavior with changes to system conditions, as well as suggested algorithm parameters, including proposed gain factors kP, kI, kD.
  4. Implement/update the algorithm described by the model using the parameters obtained by simulation
  5. Perform tests using extreme system conditions (max/min tip mass, max/min voltage, max/min target temperature), and compare behavior with results from model simulations.
  6. If simulated and real system behavior differs significantly, adjust the model to achieve a better match and perform a new simulation -> implementation -> test cycle (GOTO 3.)

@dhiltonp and @Ralim what are your views on this plan? I'm quite happy to do the majority of the work myself, but you would need to do testing, and possibly assist in setting up the tool chain (I have not yet set up the development environment).

I have just ordered a second TS100 for testing purposes, which I will hack (somehow), to be able to monitor/record internal process variables as well as actual tip temperature, using one or two thermocouples (K type TP-01 with minimal thermal mass) allowing the actual tip temperature to be measured (mid-tip and end-tip), which will allow us to improve and validate the accuracy of the tip-end-temperature estimation algorithm, and it would also allow easier testing of future controller enhancements, such as #444.

Sponsored hardware

I wonder whether we will be able to obtain sponsored hardware from Miniware, or other sources, for this work. They do benefit from the availability of this project, which runs exclusively on their hardware. Have you ever tried, @Ralim?

dhiltonp commented 5 years ago

Yeah, we disabled the D term as it was too sensitive to noise and disturbance. Any D term that could respond to changing temps would cause a much larger change in temp than was measured. Modeling the thermal behavior could make it usable.

The I term is defined differently than you expect, but is still an integral over a past window. Using milliWattHistory instead of a longer temperature window has 2 effects: it reduces computation, and clips the I term to be greater than 0. The clipping isn't bad as we can only heat, not cool. It means we undershoot less than we otherwise would.

I would categorize the current algorithm as a PI controller, with the thermal model built in to the code instead of pre-calculated. It could be rewritten in a more traditional form and you are welcome to do that - or not!

The PID algorithm from August may be more in line with what you expect, though that has its own issues.

I love the goals you have listed. I'm not too worried about the end implementation, so long as the implementation is clear and provides measurable benefits :)

With respect to workflow, just create a fork - no branch on @Ralim's repo is necessary. I'd love to see the model you come up with!

ldecicco commented 5 years ago

Hi guys, being a control guy myself, I have found the PID code a bit difficult to read, mainly because the implementation seems to be quite "unconventional". Is there an anti-wind-up scheme implemented (I did not find one)? This is crucial when the actuation variable gets saturated.

dhiltonp commented 5 years ago

The milliwatt output is clipped at 0 when temp is dropping, preventing negative wind up, and there is positive wind up but that's mostly ok - some overshoot is desirable because of the thermal lag.

hattesen commented 5 years ago

I honestly believe that a conventional PID control with anti-wind-up on the I term would be quite adequate for this soldering iron. The control is no more complex than what PID controllers have been used for in millions of installations during the past century.

> Yeah, we disabled the D term as it was too sensitive to noise and disturbance. Any D term that could respond to changing temps would cause a much larger change in temp than was measured. Modeling the thermal behavior could make it usable.

My guess is that the D (rise rate) term is noisy due to a combination of the ADC LSB noise, and using a very short delta-time for the measurement. Measuring D rate once a second would be sufficient for it to be useful. Alternatively, one would use a time-weighted average when measuring D.
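One way to realise the "time-weighted average" idea is an exponentially smoothed rate estimate, sketched below. The class name, the 32 Hz sample period and the 1 s smoothing time constant in the usage note are assumptions for illustration.

```python
class FilteredDerivative:
    """Rate-of-change estimate smoothed with an exponential moving average."""

    def __init__(self, dt, time_constant):
        self.dt = dt
        self.alpha = dt / (time_constant + dt)  # smoothing factor between 0 and 1
        self.prev = None
        self.rate = 0.0

    def update(self, value):
        """Feed one sample; return the smoothed rate in units per second."""
        if self.prev is not None:
            raw_rate = (value - self.prev) / self.dt
            self.rate += self.alpha * (raw_rate - self.rate)
        self.prev = value
        return self.rate
```

With `dt = 1/32` and `time_constant = 1.0`, a single noisy sample moves the estimate by only about 3% of the raw jump, at the cost of roughly a second of extra lag in the D term.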

The current "average recently applied power" compensation would have a similar effect to the D term, but instead of measuring temperature rise rate directly, it measures applied power, which would be somewhat proportional. The noise figure would obviously also be enormous for the measured power, unless averaged over a longer term, like it is done currently. I feel it would be better to use the actual measured temperature rise rate rather than to rely on an indirectly tied parameter.

An easy way to avoid I term wind-up is to keep it at zero as long as the output is saturated (100% power). That way, once the P and D terms on their own start to reduce the power, the long term temperature offset starts to be summed up (in I). It is a lot less compute intensive to use the traditional I term than the current rolling-average-of-recent-power. It requires adding a temperature delta (instability is not a problem) to the integral (I) variable.

No need to do any other special "clipping" or parameter management, other than that.

Until it is proven, that the regular PID cannot be tuned sufficiently responsive and stable. At that point, we should find out the root cause.

The only caveat in this whole equation is the reason this issue was created in the first place. The measured temperature is not equal to the temperature of the soldering tip. The actual tip-end temperature (which is the one that should ideally be regulated) lags behind, about 5 seconds, changes to the heating element core, that is used for temperature measurements. However, until a solid, responsive and stable controller algorithm has been achieved, it is futile to think, that we can tweak it to take the thermal lag into account.

ldecicco commented 5 years ago

> I honestly believe that a conventional PID control with anti-wind-up on the I term would be quite adequate for this soldering iron. The control is no more complex than what PID controllers have been used for in millions of installations during the past century.

+1

> > Yeah, we disabled the D term as it was too sensitive to noise and disturbance. Any D term that could respond to changing temps would cause a much larger change in temp than was measured. Modeling the thermal behavior could make it usable.

> My guess is that the D (rise rate) term is noisy due to a combination of the ADC LSB noise, and using a very short delta-time for the measurement. Measuring D rate once a second would be sufficient for it to be useful. Alternatively, one would use a time-weighted average when measuring D.

> The current "average recently applied power" compensation would have a similar effect to the D term, but instead of measuring temperature rise rate directly, it measures applied power, which would be somewhat proportional. The noise figure would obviously also be enormous for the measured power, unless averaged over a longer term, like it is done currently. I feel it would be better to use the actual measured temperature rise rate rather than to rely on an indirectly tied parameter.

+1 also on this.

> An easy way to avoid I term wind-up is to keep it at zero as long as the output is saturated (100% power). That way, once the P and D terms on their own start to reduce the power, the long term temperature offset starts to be summed up (in I). It is a lot less compute intensive to use the traditional I term than the current rolling-average-of-recent-power. It requires adding a temperature delta (instability is not a problem) to the integral (I) variable.

This is actually not so clean (it's not the proper way to implement anti-wind-up). There are several schemes. The simplest one is conditional integration, meaning that you stop integrating (but you do not reset the integral to zero) when the output is saturated. Among the others, back-calculation is the easiest to implement and is elegant, but it adds another gain (K_aw) to tune (even though rules of thumb do exist once you have a decent tuning of the PID gains).
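The two schemes can be sketched as follows for a PI controller. This is a simplified illustration with hypothetical class names; in particular, the back-calculation variant here applies K_aw to the raw error integral, which is one of several equivalent scalings found in textbooks.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


class PIConditional:
    """PI controller that freezes (does not reset) the integral while saturated."""

    def __init__(self, kp, ki, dt, out_min, out_max):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0

    def update(self, error):
        candidate = self.integral + error * self.dt
        unsat = self.kp * error + self.ki * candidate
        out = clamp(unsat, self.out_min, self.out_max)
        if out == unsat:
            self.integral = candidate  # only integrate when not saturated
        return out


class PIBackCalculation:
    """PI controller whose integral is bled off by the clipped excess."""

    def __init__(self, kp, ki, k_aw, dt, out_min, out_max):
        self.kp, self.ki, self.k_aw, self.dt = kp, ki, k_aw, dt
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0

    def update(self, error):
        unsat = self.kp * error + self.ki * self.integral
        out = clamp(unsat, self.out_min, self.out_max)
        # (out - unsat) is zero unless the output is being clipped.
        self.integral += (error + self.k_aw * (out - unsat)) * self.dt
        return out
```

In both cases the integral stays bounded during a long heat-up at full power, so the controller does not have to "unwind" a large accumulated value once the setpoint is reached.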

> No need to do any other special "clipping" or parameter management, other than that.

> Until it is proven, that the regular PID cannot be tuned sufficiently responsive and stable. At that point, we should find out the root cause.

> The only caveat in this whole equation is the reason this issue was created in the first place. The measured temperature is not equal to the temperature of the soldering tip. The actual tip-end temperature (which is the one that should ideally be regulated) lags behind, about 5 seconds, changes to the heating element core, that is used for temperature measurements. However, until a solid, responsive and stable controller algorithm has been achieved, it is futile to think, that we can tweak it to take the thermal lag into account.

There are more sophisticated mathematical tools to take into consideration this issue, but you need to identify the model of the soldering tip system.

dhiltonp commented 5 years ago

It's great that so many people with control experience are chiming in!

I'd like to make sure we're on the same page.

The algorithm in the code you've been looking at is newly released in 2.06.

@ldecicco, @hattesen, @doegox - can you confirm that you have tested the new algorithm in 2.06? While it may not be what you are used to looking at, it is a pretty solid PI controller and regulates the heating element temperature quite well (though I am a little biased, having designed/implemented it). I am open to any improvements you all have - including fully replacing the algorithm.

The discovery of thermal lag and realizing that the element temp can be very far from tip temp came after this new implementation.

If the tip is touching a heat sink, the tip temp can consistently be 100C lower than the heating element temp forever. The 6-7s lag is a simplification as the tip temp asymptotically approaches some value given a fixed power output. Actual testing of that curve should probably be done - I don't recall the values I observed well enough, and measurements perturb the output...

Again, the D term is useless for now. If you take action on the D term, the temp goes crazy, and not just due to noise. The thermocouple can very quickly heat up 30-40C (I think, this is from memory) when very little energy has been delivered to the tip. Then the algorithm cuts power because of the overshoot, and suddenly the temp is back nearly where it was before the D term kicked in. That drop of course would also trigger the D term but:

@hattesen, I suspect you are right about sampling frequency. Currently we are sampling the tip temperature very frequently (in the kHz range? - @Ralim, can you confirm?) and averaging. It may be better to keep a similar on/off ratio (80%/20%), but sample every 1/2 second, giving the sensor .1 second to stabilize after driving the tip. That may make the D term usable.


One note on the I term accumulation - the rolling average is actually very lightweight to compute (maintain a sum, subtract oldest value, add newest value). One thing to worry about (maybe you have an easy solution) in the proposed sum solution is saturation of the I variable. I'm not sure if 64 bit ints are available on the hardware and am unable to check - again, @Ralim, can you verify int64_t is available?
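The running-sum trick and the overflow concern can both be sketched in a few lines. The names are hypothetical, and the 100 W peak used in the usage note is only a placeholder for whatever the real worst case is.

```python
from collections import deque

INT32_MAX = 2**31 - 1


class RollingAverage:
    """Fixed-size window average maintained with a running sum."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0

    def add(self, value):
        self.window.append(value)
        self.total += value                      # add newest value
        if len(self.window) > self.size:
            self.total -= self.window.popleft()  # subtract oldest value

    def average(self):
        return self.total / len(self.window) if self.window else 0


def fits_in_int32(window_size, max_sample):
    """True if the worst-case running sum cannot overflow a signed 32-bit int."""
    return window_size * max_sample <= INT32_MAX
```

A 3.5 s milliwatt history at 32 Hz is 112 samples; even at an assumed 100,000 mW peak, the worst-case sum is about 11.2 million, well inside a 32-bit integer. An unbounded integral has no such fixed ceiling unless it is clamped.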

Ralim commented 5 years ago

@dhiltonp

> @hattesen, I suspect you are right about sampling frequency. Currently we are sampling the tip temperature very frequently (in the kHz range? - @Ralim, can you confirm?) and averaging. It may be better to keep a similar on/off ratio (80%/20%), but sample every 1/2 second, giving the sensor .1 second to stabilize after driving the tip. That may make the D term usable.

Currently the sample is performed at the end of the current PWM cycle, so there is always a lag to the measurement: when it finishes measuring, it starts outputting the result that the prior PID loop computed, so there is always at least one PWM period of delay. I believe measurement is around 30Hz or so.

There is a trade-off here: we can't easily say "Run PWM 10 times and then take the sample", as the ADC is triggered by the end of the PWM pulse. It's doable to slow down the PWM rate (i.e. it counts slower), which will achieve this; however, considering there is at least 1 sample delay, I didn't want to run this too slow, or else the delay can impact the system. This also does not help much in removing noise from the measurements, as noise is mostly bounded by hardware limitations, and the current timings already avoid the ending tail of the op-amp recovery on most units fine.

> One note on the I term accumulation - the rolling average is actually very lightweight to compute (maintain a sum, subtract oldest value, add newest value). One thing to worry about (maybe you have an easy solution) in the proposed sum solution is saturation of the I variable. I'm not sure if 64 bit ints are available on the hardware and am unable to check - again, @Ralim, can you verify int64_t is available?

64 bits is available, but it's a software implementation by the compiler (only a 32 bit integer unit), so it is 4x as slow (from memory). Generally I prefer to stay in the 32 bit world, but it's fairly easily doable if it's a small amount of maths.

Also to note, I have tried my own PID twice, and used Miniware's, and all of them have performed worse than @dhiltonp's implementation. (I'm not great at PIDs, but I found it rather hard to have one that compensated for changing environments as well as the current 2.06 one.)

dhiltonp commented 5 years ago

Thanks, @Ralim.

> Currently the sample is performed at the end of the current PWM cycle, so there is always a lag to the measurement (when it finishes measuring it starts outputting the result that the prior PID loop computed), so there is always at least one PWM series delay. I believe measurement is around 30Hz or so.

So the PWM is basically in lockstep with the PID loop?

Ralim commented 5 years ago

More, the reverse. Timer generates the signal for the ADC, and the ADC completion triggers the PID. :)

hattesen commented 5 years ago

> > An easy way to avoid I term wind-up is to keep it at zero as long as the output is saturated (100% power). That way, once the P and D terms on their own start to reduce the power, the long term temperature offset starts to be summed up (in I). It is a lot less compute intensive to use the traditional I term than the current rolling-average-of-recent-power. It requires adding a temperature delta (instability is not a problem) to the integral (I) variable.

> This is actually not so clean (it's not the proper way to implement anti-wind up). There are several schemes, the simplest one is conditional integral meaning that you stop integrating (but you do not reset it to zero) when the output is saturated. Among others, the back-calculation is the easiest to implement and elegant but it adds another gain (K_aw) to tune (even though rules of thumb do exist once you have a decent tuning of the PID gains).

@ldecicco in this case, the controller will only ever reach 100% power during heat-up (and cool-down), and after a large change in temperature the I(ntegral) term would be useless, which is why I propose clearing the I(ntegral) value whenever the output is at 100%, and also when the output is at 0%, which would occur in cases where the set temperature has been reduced by a large amount (e.g. from 300°C to 200°C). My experience is that it is, by far, the simplest implementation, and that it is sufficient in systems such as this, which operate in steady-state with a power of approximately 10%. But if it turns out to be a limiting factor, I'm all in for using a more advanced anti-wind-up scheme.

hattesen commented 5 years ago

> While it may not be what you are used to looking at, it is a pretty solid PI controller and regulates the heating element temperature quite well

As I have stated previously, I don't think the current algorithm is a PI controller. It is rather a P controller with a derived D term, so a P(D/2) controller. The compensation for the power using recently applied power will reflect the temperature rise rate, in a similar way to the D(erivative) part of a PID controller. It does not compensate for long term temperature offset, which is what the I(ntegral) term of the PID handles.

By not using a standard PID algorithm, we are unable to use any of the PID tuning algorithms that are the result of a lot of research and real-life testing.

I cannot see any reason why a traditional PID controller would not be at least as good at controlling the soldering iron tip temperature. Why previous attempts at tuning the former PID algorithm have failed, I cannot say. It may have been due to lack of experience with PID tuning, or a failure to calculate a proper, stable D(erivative) value for the controller to use.

In any case, I will not be able to contribute much unless we use a PID controller, or any other well-documented control algorithm that has proven its worth (I am not aware of any that approach the PID in versatility).

dhiltonp commented 5 years ago

@hattesen, can you confirm you have tested 2.06?

Again, you are welcome to implement a standard PID controller, and I'm not sure what the holdup is. Do you need help setting up sw4stm32?

dhiltonp commented 5 years ago

Let's say we know approximately how many watts it usually takes to maintain a given temp - maybe 1W for 150C, 3W for 250C, 6W for 350C.

If the output wattage is consistently 15W while we're maintaining 250C, we know that there is an extra draw of 12W on the system (15W minus the usual 3W). That marginal draw should tell us how much we need to boost the power to get the tip temp up to 250C.

We can replace the temp display with the estimated tip temp and verify it matches an external thermocouple as a first pass.
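A rough sketch of that arithmetic, assuming the three example points above (1W at 150C, 3W at 250C, 6W at 350C) and linear interpolation between them. The function names and the `c_per_watt` sag coefficient are hypothetical; the real coefficient would have to come from measurements against an external thermocouple.

```python
# Example maintenance power from the comment above: (temp in C, watts).
BASELINE_W = [(150.0, 1.0), (250.0, 3.0), (350.0, 6.0)]


def baseline_watts(temp_c):
    """Interpolate the power usually needed to hold temp_c in free air."""
    pts = BASELINE_W
    if temp_c <= pts[0][0]:
        return pts[0][1]   # clamp below the table (assumption)
    if temp_c >= pts[-1][0]:
        return pts[-1][1]  # clamp above the table (assumption)
    for (t0, w0), (t1, w1) in zip(pts, pts[1:]):
        if t0 <= temp_c <= t1:
            return w0 + (w1 - w0) * (temp_c - t0) / (t1 - t0)


def excess_draw_watts(output_w, temp_c):
    """Power going into the load rather than into holding temperature."""
    return output_w - baseline_watts(temp_c)


def estimated_tip_temp(sensor_c, output_w, c_per_watt):
    """Sensor reading minus a sag proportional to the excess draw.

    c_per_watt is a hypothetical, to-be-measured coefficient.
    """
    excess = max(0.0, excess_draw_watts(output_w, sensor_c))
    return sensor_c - c_per_watt * excess
```

For the example in the text, `excess_draw_watts(15.0, 250.0)` gives the 12W of extra draw.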


Here's some other stuff to determine:

What are the curves for the ts80 and t12 tip types? Do they exhibit similar temp drops against a heatsink, or are they better behaved than the ts100 tips?

Is the temp offset consistent with different voltage inputs and the same output wattage, or does the extra PWM delay in a high power system cause the temp to read lower or more accurately?

hattesen commented 5 years ago

Here is the current state of my heat modeling of the TS100. It contains all the variables that are required to determine the optimal Kp, Ki and Kd gain factors for a standard PID controller with a little filtering on the Derivative component. The Process Value (PV) used as input to the controller (current temperature) is an estimate of the tip-end temperature, given the parameters of the individual tip (heat capacity, transfer rate of heat from tip-core to tip-end, etc.).

Before the simulations and parameter tuning of this model can be used in earnest, this theoretical model really needs to be compared to the physical TS100 (and TS80) to confirm the estimated parameters of the heat transfer model – ideally for each tip. This ideally requires (at least two) K-type thermocouples attached to the soldering iron tip, or alternatively a FLIR camera, logging the temperature changes occurring during a set of test conditions.

I have attached two screenshots of a simulation containing the following phases:

  1. Heat up from 100°C to 280°C (time: 0 - 11 sec)
  2. Stable temperature at 280°C (time: 11 - 20 sec)
  3. Disturbance. Simulated soldering, which cools the tip-end down for 2 seconds. Note that this causes the (actual) tip end temperature to drop sharply, but this cooling is only slowly detected as the heat energy is drawn from the tip-core, causing extra power to be applied. (time: 20-22 sec).
  4. Recovery from disturbance (time: 22 - 24 sec)
  5. Set Point (target temperature) is changed from 280°C to 350°C (time: 30 seconds)
  6. Heat up until target temperature is reached (time: 30 - 38 seconds)

Chart illustrating the result of the simulation

[Image: ts100_ts80 temperature control using pid control 1]

Chart elements:

The operating conditions used in the simulation:

[Image]

I can quickly provide any number of simulations using different conditions, showing the effects when changing:

Obviously, this model does NOT reflect the current controller implementation, but rather a standard PID controller. However, all the elements required to implement the PID algorithm are available in the model, and a side benefit is an estimate of the tip-end temperature, based on a physical heat model.

I am currently looking into getting my hands on a thermocouple/PIR data logger, as well as a new TS100 which I can use to compare the model with a real life sample.

hattesen commented 5 years ago

Interesting knowledge gained by modeling the thermal control of the TS100

Sampling period

Interestingly, given my current model of the TS100, I have discovered that the sampling period (how often the power output is recalculated) can be safely set to 200 ms without causing any instability. Signs of instability do not show until the period is extended above 400 ms.

Using a sampling period of 200 ms...

[Image: ts100_ts80 temperature control using pid control 5]

Using a sampling period of 500 ms. Notice the circled power spikes – but no noticeable temperature instability.

[Image: ts100_ts80 temperature control using pid control 6]

PWM resolution

The resolution of the PWM (how many duty-cycle levels are supported) is not critical at all. Having just 3 levels (0%, 50% and 100%) does not cause any significant deterioration of the temperature control. Even having only 0% and 100% results in decent control.

Using... (0.4% PWM resolution = 256 levels)

[Image]

[Image: ts100_ts80 temperature control using pid control 2]

Using 0%/50%/100% PWM duty cycles only:

[Image: ts100_ts80 temperature control using pid control 3]

Using 0%/100% PWM duty cycle...

[Image: ts100_ts80 temperature control using pid control 4]

The tip-core temperature oscillates by 5°C, but the tip-end temperature varies by only 0.3°C.
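For anyone wanting to repeat the PWM resolution experiment in their own model, limiting the resolution can be sketched as snapping the requested duty cycle to the nearest of N evenly spaced levels. The helper below is hypothetical (it is not taken from the spreadsheet or the firmware) and only shows the quantization step, not the controller around it.

```python
def quantize_duty(duty, levels):
    """Snap a duty cycle in [0, 1] to the nearest of `levels` evenly spaced values.

    levels=256 gives steps of 1/255 (about 0.4%), levels=3 gives 0%/50%/100%,
    and levels=2 gives plain on/off control.
    """
    if levels < 2:
        raise ValueError("need at least two levels (off and fully on)")
    duty = min(1.0, max(0.0, duty))
    steps = levels - 1
    # int(x + 0.5) rounds half up for non-negative x.
    return int(duty * steps + 0.5) / steps
```

With `levels=3`, a requested duty of 0.3 becomes 0.5 and a requested 0.8 becomes 1.0; the controller then corrects for the error on later samples.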

Kubuxu commented 5 years ago

I think the primary factor here is core/core-thermocouple to tip thermal resistance. Simple measurement for it would be to submerge the working area of a tip in almost boiling water (accurate 100C environment) and drive the heating core with constant power. The observation then is the temperature of the core at a given power.

The test should be conducted for at least 3 power settings (a few more would be even better). This would allow us to model the tip temperature depending on core temperature using a simple model.

I will conduct this test (and many more) when I have more equipment available.
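As a sketch of how the results of such a test could be reduced to a single number: if the simple model is T_core = T_bath + R_th * P, the thermal resistance R_th can be fitted by least squares through the (power, core temperature) pairs with the intercept fixed at the bath temperature. The function names are made up for illustration, and the sample values in the usage note are synthetic, not measurements.

```python
def fit_thermal_resistance(samples, bath_c=100.0):
    """Least-squares fit of R_th [K/W] in T_core = bath_c + R_th * P.

    samples: iterable of (power_w, core_temp_c) pairs at steady state.
    The intercept is fixed at the bath temperature (an assumption of
    this simple model).
    """
    samples = list(samples)
    num = sum(p * (t - bath_c) for p, t in samples)
    den = sum(p * p for p, _ in samples)
    if den == 0:
        raise ValueError("need at least one sample with non-zero power")
    return num / den


def tip_temp_from_core(core_c, power_w, r_th):
    """Estimate tip temperature from core temperature and heat flow."""
    return core_c - r_th * power_w
```

For example, the synthetic points (5 W, 110C), (10 W, 120C), (20 W, 140C) fit R_th = 2 K/W, and the model then puts the tip at 100C for a 140C core at 20 W.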

Kubuxu commented 5 years ago

For more complex tests (dynamic, instead of steady state) we need temperature time series. My idea for data dumping: microcontroller in i2c slave mode dumping data to serial, connected to ts100's i2c bus.

Another trick is using SPICE for thermal modeling. In short: Voltage [V] => Temperature [K], Current [A] => Power [W], Resistance [V/A, Ohm] => Thermal Resistance [K/W], Capacitance [As/V, F] => Thermal Capacitance [Ws/K]

Paper on the topic: https://www.infineon.com/dgdl/Thermal+Modeling.pdf?fileId=db3a30431441fb5d011472fd33c70aa3
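The same electrical analogy can be tried without SPICE. Below is a minimal two-node sketch (heater core and tip end, each a thermal capacitance, joined by a thermal resistance, with a second resistance from tip to ambient), stepped with forward Euler. All component values are placeholders chosen for illustration, not measured TS100 parameters.

```python
def simulate(power_w, seconds, dt=0.05, ambient_c=25.0,
             r_core_tip=2.0, r_tip_amb=40.0, c_core=0.5, c_tip=1.0):
    """Two-node thermal RC model, forward Euler integration.

    power_w     constant heater power (the "current source") [W]
    r_core_tip  thermal resistance core -> tip [K/W]   (placeholder)
    r_tip_amb   thermal resistance tip -> ambient [K/W] (placeholder)
    c_core      thermal capacitance of the core [Ws/K]  (placeholder)
    c_tip       thermal capacitance of the tip [Ws/K]   (placeholder)
    Returns (core_temp_c, tip_temp_c) after `seconds`.
    """
    core = tip = ambient_c
    for _ in range(int(seconds / dt)):
        q_core_tip = (core - tip) / r_core_tip    # heat flow core -> tip [W]
        q_tip_amb = (tip - ambient_c) / r_tip_amb  # heat flow tip -> ambient [W]
        core += dt * (power_w - q_core_tip) / c_core
        tip += dt * (q_core_tip - q_tip_amb) / c_tip
    return core, tip
```

At steady state this settles to T_tip = T_amb + P * R_tip_amb and T_core = T_tip + P * R_core_tip, so the core always reads hotter than the tip while power is flowing.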

Kubuxu commented 5 years ago

@hattesen can you share the spreadsheet? I would like to compare it with a model I'm developing in LTSPICE.

dhiltonp commented 3 years ago

726 improves temp estimation. It's basic and hasn't been tested under instrumentation but initial results look really good :)

discip commented 2 years ago

@dhiltonp & @Ralim Do we still need this, after @sandmanRO's contribution https://github.com/Ralim/IronOS/issues/1038? Or am I missing something here?