gergelytakacs / AutomationShield

Arduino library and MATLAB/Simulink API for the AutomationShield Arduino expansion boards for control engineering education.

FloatShield: MATLAB API and Examples #171

Closed gergelytakacs closed 5 years ago

gergelytakacs commented 5 years ago

This issue will discuss the development of the MATLAB API for the FloatShield and the accompanying examples and test results.

gergelytakacs commented 5 years ago

@PeterChmurciak I have seen your API, which looks really good. I am glad and proud that you managed to get it working with the sensor, good job.

Now it would be cool to see it work! So what I would expect is that you take a look at Heat and try to replicate

Now this may be a teeny-tiny bit tricky, as a USB connection is truly horrible for anything "real-time". You cannot expect to go too fast, but keep experimenting. It would be great to see some test results here. One aspect is that I'd try to finish the PID example for the Arduino API before I attempt to have a go at the MATLAB one. The reason for this is that (logically) the two should have the same tuning and produce comparable results. Now this might not be so easy for the reasons mentioned above (e.g. sampling), so, even though it is worth a shot, it's not a tragedy if you use different sampling and/or tuning across the various interfaces.

PeterChmurciak commented 5 years ago

@gergelytakacs I have managed to create a closed-loop PID example as mentioned in ea39552521d1a2fdcf92f4936bdc07e4cfa8acc0. Here are two results that I got out of this example:

Using section length 1000 with sampling of 25 milliseconds - that means that the system has 25 seconds to adjust to each level in the reference trajectory. The trajectory has 10 levels, so the experiment took about 4 minutes to complete.

4min

And using section length 2400 with sampling of 25 milliseconds - that means that the system has 60 seconds to adjust to each level in the reference trajectory. The trajectory has 10 levels, so the experiment took 10 minutes to complete.

10min
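As a quick sanity check of the timing quoted above, the arithmetic can be sketched in a few lines of Python. The function name is made up for illustration and is not part of the AutomationShield API; only the section lengths, the 25 ms sampling period and the 10 reference levels come from this comment.

```python
def experiment_duration_s(section_length, sampling_ms, levels):
    """Return (seconds available per reference level, total experiment seconds)."""
    per_level_s = section_length * sampling_ms / 1000.0
    return per_level_s, per_level_s * levels


# Values taken from the comment: 25 ms sampling, 10 reference levels.
for section_length in (1000, 2400):
    per_level, total = experiment_duration_s(section_length, 25, 10)
    print(f"{section_length} samples: {per_level:.0f} s per level, "
          f"{total / 60:.1f} min total")
```

So a section length of 1000 gives 250 s in total (a bit over 4 minutes) and 2400 gives 600 s (10 minutes), matching the durations reported.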

Note that the time it takes for the system to stabilise at the reference level is sometimes pretty long. There are some overshoots when a rapid reference change occurs, but I was not able to get rid of them. Also, very rarely, the sensor reading seems to show incorrect values, like on the right end of the second picture. Please tell me if you have any ideas how to improve the results, the PID tuning or the example itself. Also I would like to ask what you mean by an open-loop identification example? HeatShield has three identification examples: Blackbox/Experiment/Graybox. Should it look like one of these?

gergelytakacs commented 5 years ago

@PeterChmurciak

Just to make it clear, I assume these are Arduino IDE programmed results shown in MATLAB.

All right, some of these results seem quite decent, especially when I take a look at certain portions of the graph. I think you should aim to improve these, if you can. At this point, I do not have definite answers on how to do that, only some recommendations and things I would possibly try:

As to your last question: I think the primary goal is for you to learn. So I'd say you look into the MATLAB System Identification Toolbox and all HeatShield examples to learn a bit about system identification. In case of "SIT" just go into MATLAB help and do the examples/tutorials, they are actually very good! That's how I started to learn. Some other info in a nutshell:

PS: Also, you might want to post data (or the FIG) along with screenshots (just drag and drop) so I can inspect it from up close:)

PeterChmurciak commented 5 years ago

@gergelytakacs Actually, they were not. They were results of the MATLAB closed-loop example. The MATLAB closed-loop PID example is identical to the Arduino PID example, with the only difference being the sampling method. As you mentioned in https://github.com/gergelytakacs/AutomationShield/commit/ea39552521d1a2fdcf92f4936bdc07e4cfa8acc0#r34262734 I assume that MATLAB's tic/toc is not a reliable sampling method. I did not know that, and will use the Arduino IDE for getting more serious results. I repeated this experiment using the Arduino IDE, captured with CoolTerm and plotted through MATLAB:

Using section length 1000 with sampling of 25 milliseconds:

4min1000try3

And using section length 2400 with sampling of 25 milliseconds:

10min2400

The results were more or less similar.
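The sampling concern raised above can be checked on any logged run by looking at how far the actual sample periods drift from the nominal one. The following is only a sketch: the function name is hypothetical, and the timestamps are made-up illustrative values, not measurements from this thread.

```python
def sampling_jitter(timestamps_s, nominal_period_s):
    """Summarise how regular the sampling was, given sample timestamps in seconds."""
    if len(timestamps_s) < 2:
        raise ValueError("need at least two timestamps")
    # Time elapsed between consecutive samples.
    periods = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    # Absolute deviation of each period from the nominal sampling period.
    errors = [abs(p - nominal_period_s) for p in periods]
    return {
        "mean_period_s": sum(periods) / len(periods),
        "max_error_s": max(errors),
    }


# Made-up timestamps for a nominal 25 ms loop (illustration only).
stats = sampling_jitter([0.000, 0.025, 0.051, 0.075, 0.108], 0.025)
print(stats)
```

A large maximum error relative to the 25 ms period would suggest that the loop timing, rather than the controller, is responsible for differences between interfaces.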

To answer your question about whether that is sensor noise, this is a measurement of a ball that is not moving (the tube was oriented horizontally so the ball would rest somewhere in the tube without using the fan):

NoiseTest

As you can see, there is only very little sensor noise. What seems to be the cause is horizontal ball movement - because of the turbulent flow and the fact that the ball is much smaller than the tube. This is the ball at a stabilised position:

movingBall

And the measurement looks like this:

BallMovement

The ball oscillates horizontally and because of its round shape the sensor reads different distances. This unwanted noise could probably be solved by using a ball with a tighter fit.

I will try to improve these results as you suggested, first by reducing the reference jumps.

PeterChmurciak commented 5 years ago

@gergelytakacs By reducing the aggressiveness of reference jumps I have managed to get these results:

Using section length 1000 with sampling of 25 milliseconds:

PID3min1000NewTrajecory

And using section length 2400 with sampling of 25 milliseconds:

PID3min2400NewTrajecory

I am using different section lengths to show that the system will stabilise pretty nicely if it has enough time.

These results look pretty nice, but there is still the noise from the dancing ball. What do you think?

gergelytakacs commented 5 years ago

Thank you for your analysis on the sensor noise / ball movement issue. So it's not the sensor, it's the ball.

Now this dynamic is of a higher order - e.g. with a higher frequency. Just imagine a differential equation describing the behavior of the ball, where we manage to control the low order terms (slow) but don't take care of the high order terms (fast). My guess is that this won't be any better without serious changes in algorithm or hardware, but we don't want that at this moment - we just want a nice PID demonstration.

As for which API to use to get the results: I strongly urge you to experiment with the Arduino API PID example, since we know that the sampling (and possibly other things) is reliable there. As soon as we manage to agree how to present the results, you can repeat the experiment for MATLAB, just to have it as a reference.

At the time, I'm not really sure which experiment I will like the most and which ones will be the most presentable, so I'd say keep on experimenting and post some (visual) results here for us to choose from. For the small reference changes in here I think the second one is definitely better. You need to give the dynamics more time to settle. So I recommend sticking with ~2400 samples = roughly a minute (?) for the dynamics to settle. Now as for the "noise", this is a bit ugly. However, the overall position follows the reference nicely. What is interesting is that in your previous comment you have managed to get some sections with significantly reduced noise, and despite the over-undershoot situation I like that. It is interesting that in both trials there is a lot less noise at the beginning... I'd try to make small changes but more at the lower portion of the tube, to see what the controller does. Try to play around with the reference profile to see what produces the cleanest response. TLDR; I cannot decide between large and small changes in reference so far and I'd like you to experiment more.

Remember to document and save all the results (the raw data), so we can later re-use them in your thesis and in a publication.

PeterChmurciak commented 5 years ago

@gergelytakacs Alright, I will use exclusively Arduino API to get my results.

Yes, 2400 samples should be exactly one minute at 25 millisecond sampling. I agree that it looks better with this amount of time to settle.

When trying references at lower portion of the tube as you suggested, I have noticed an anomaly in sensor readings. I have noticed it before but thought that it was an accurate measurement and the ball just behaves like that - when the ball is rising from bottom position, there is a little "jump" in reading:

These are system responses to step 0-100% power, notice the area between red lines:

StepResponses

This is measurement of raw distance between ball and sensor (inverted y axis - max y value means the ball is at the ground) during movement with more or less constant speed, again "jump" when rising:

ContinuousStringPull

To test it more thoroughly I have used dental floss, a paperclip, a ruler and the cotton ball with a little hole in it:

TestingApparatus

Detail: TestingApparatusDetail

I have used the string for controlled movement of the ball. Using this method I have done two tests. In the first, I moved the ball 1 centimeter at a time with an approximately 10 second pause after each movement for the reading to "stabilise":

StringPull1cm10secPause

The ball was being moved to the right - closer to the sensor, so the distance should have been decreasing. However, there is a region where the measurements showed an increase in the distance instead, namely samples between 6000 and 10000.

The second test was the same but with a different waiting period; it used a 30 second pause after each movement:

StringPull1cm30secPause

Again, when getting closer to the sensor with the ball, some anomaly occurs at around 26 centimetres and lasts until the ball gets as close as around 16.5 centimetres. This is weird, and I guess it should not happen?
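A check like the string-pull test above can be automated: if the ball is moved towards the sensor in equal steps, every settled reading should be smaller than the previous one, so any step where the reading grows is suspicious. The sketch below assumes one settled reading per 1 cm step; the function name and the sample values are made up for illustration and are not the data from these plots.

```python
def find_non_monotonic_steps(distances_mm):
    """Return indices of steps where the distance grew although it should shrink."""
    return [
        i
        for i in range(1, len(distances_mm))
        if distances_mm[i] > distances_mm[i - 1]
    ]


# Made-up settled readings (mm) while moving the ball towards the sensor.
readings = [300, 290, 280, 285, 291, 260, 250]
print(find_non_monotonic_steps(readings))  # prints [3, 4]
```

Running the same check on a healthy unit should return an empty list, which gives a simple pass/fail criterion for a sensor and tube assembly.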

To make sure, I have tested the second unit I have at home - the one with the insert. Its step responses looked like this:

StepResponsesUnit2

And measurement of raw distance between ball and sensor during movement with approximately constant speed:

ContinuousStringPullUnit2

It had slightly different measurement range, but that was to be expected because of the insert and other differences in tubes. The "jump" was not there, or at least not very significant. I have done the first dental floss test also on this tube, and its result looked like:

StringPull1cm10secPauseUnit2

That is almost linear, as it should be (I was pulling the string by hand and using ruler as a reference... there was plenty room for error but it should look at least linearish).

The last thing we need in this non-linear system is more non-linearities... It might not be that big of a problem, but it adds unexpected behaviour in a certain region and affects the accuracy of results in this region. It is the region from ~35%, where the rising part suddenly starts to descend:

Tip

up to ~65% (based on the graphs, the reading normalizes when the ball gets as close as around 16.5 cm to the sensor).

This is not problem of library - I have tried both Arduino and Pololu and the results were the same. What this might be is probably... faulty sensor unit - because it happens only on one of two sensors I have at home.

TLDR: Unlucky me possibly picked another faulty unit, again with broken laser sensor.

What this causes is that when the reference is set to 45% and the system gets the ball to 45%, the ball is not really at 45% of the tube height (if you measured it with a ruler or something) but somewhere where the sensor currently sees 45%.
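To see why a biased distance reading shifts the apparent position, consider a simple linear mapping from raw distance to percent of tube height. This is only a sketch under assumptions: the mapping, the function name and the calibration values (20 mm with the ball at the top, 320 mm with the ball on the ground) are hypothetical and not taken from the FloatShield library.

```python
def distance_to_percent(distance_mm, min_mm, max_mm):
    """Map a raw sensor distance to ball height in percent of the tube.

    max_mm is the reading with the ball on the ground (0 %),
    min_mm the reading with the ball at the top (100 %).
    """
    percent = 100.0 * (max_mm - distance_mm) / (max_mm - min_mm)
    # Clamp to the physically meaningful range.
    return max(0.0, min(100.0, percent))


# Hypothetical calibration: 20 mm at the top, 320 mm on the ground.
true_position = distance_to_percent(185, 20, 320)      # correct reading
biased_position = distance_to_percent(200, 20, 320)    # reading off by +15 mm
print(true_position, biased_position)
```

With these assumed numbers a 15 mm error in the reading moves the reported position by 5 percentage points, so the controller would hold the ball at the wrong physical height while the plot still shows perfect tracking.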

I am now thinking about switching the sensors between the two units, removing the insert or coming to school for different one... Again...

gergelytakacs commented 5 years ago

@PeterChmurciak

Thank you for the extensive tests. I agree with you, this seems like some sort of hardware bug, and for your "reference" tests published in a paper or your thesis you should use a pristine faultless unit.

For "normal" student use I don't think this is such a big deal though. But it is very interesting. Another factor is that the sensor has a 25 degree Field of View (FOV) and I'm wondering whether the top 3D printed holder letting the sensor shift around has any effect on this. Anyways, this is just thinking out loud.

TLDR; Use a faultless unit for "reference" tests.

PeterChmurciak commented 5 years ago

@gergelytakacs Today I have tried to fix the problem, and well... It was quite troublesome.

I started by swapping the sensors of the two units I have at home, as I mentioned at the end of my last post. When I swapped them, to my surprise the problem persisted, but weirdly still on the same unit. I swapped them multiple times to make sure but the sensor did not seem to be at fault.

The next suspects then were the connectors between the sensor and the board - maybe they were somehow damaged, so I took apart the units and swapped the cables they used. The problem was still on the same unit - the one without the insert - the one that I wanted to use. After I examined the connections and found no visible problem, I started suspecting the board - possibly there could be something wrong with the shield itself, so I tried all possible (maybe even impossible) combinations of connecting each sensor with each board and checking whether the problem appeared. It seemed inconsistent - sometimes the problem appeared and sometimes there was none at all. It almost drove me crazy...

I put together the units, tested the first - had no problem, tested the second - had problem, so possibly this shield is faulty (the one without insert) - I took them apart again and swapped the 3D printed bottoms to get rid of the insert on my testing unit and put them together. Well... The one that was now without the insert had the problem while the one with insert did not. Unbelievable...

I tried swapping the upper 3D printed parts, the tubes, balls - basically everything important, but nothing helped. I then tried moving the sensor to replicate the problem - try to make the "jump" appear on the monitor and after some fiddling with sensor position and moving the ball up and down, I have managed to create the jump (on the insert-less unit) and also move the sensor to such special position that the jump would not appear. Logically, we would want the sensor to be approximately in the middle of the tube, but this special position was all the way on the periphery of the tube as the 3D printed holder allowed. This illogical position of the sensor removed the jump, but changed the range of readings and probably made the sensor read nonsense.

I then tried replicating it on the other unit (the one with the insert) and as much as I tried, I could not make the jump appear there. What the hell? Does it have something to do with the insert? How can it be? After contemplating for a while in utter confusion and despair, it struck me like lightning - the bottom of the tube, what is there? This:

20190715_232544

The bottom of the tube is basically the fan, and as you can see, it has a tiny little cute reflective sticker right in the middle of it. It surely is lovely, and the laser must think so too, because it likes to pay it a visit sometimes, and when it does, it stops doing its work properly and shows weird things in the Serial Plotter.

Engineering solution lvl9000 (did not want to simply remove the sticker as it has some info on it):

20190715_233845

Using some black tape, I have managed to break this toxic relationship between the sensor and the sticker, and the problem with the inaccurate readings in some areas has miraculously disappeared and could not be replicated manually anymore.

After that I have tested the same PID test as in https://github.com/gergelytakacs/AutomationShield/issues/171#issuecomment-510685182 - used exactly same settings and got:

10min2400

That looked pretty bad to me, but I had used a different ball than before, so I tried using the same one as before (same material but slightly lighter and more round - a more perfect sphere) and got:

10min2ndBall2400

This looked much better and it only shows that the ball and its properties are very important part of this system.

As you can see, the plots look very different than in https://github.com/gergelytakacs/AutomationShield/issues/171#issuecomment-510685182. Now there is much less of that noise that I probably wrongly blamed on ball oscillation - the ball still oscillates now, but on the plots it is not as dramatic as before. The best explanation that I can come up with is that in said position range, between 35% and 65%, the laser beam from the sensor could possibly reflect from the wall or through some other means travel past the ball, reflect from the sticker and interfere with the sensor reading, causing that noise. Now that the fan surface is no longer reflective, the reflected beam noise is highly reduced or non-existent. The unit with the insert does not have this problem because the sticker is hidden under the insert itself.

If this is true, and I surely hope it is (because seeing that my research https://github.com/gergelytakacs/AutomationShield/issues/171#issuecomment-510974987 was mostly useless hurts), then probably all of the FloatShield units might have this problem, and it can be solved similarly.

For the fun of it, I have tried the previous reference trajectory - the more aggressive one (as in https://github.com/gergelytakacs/AutomationShield/issues/171#issuecomment-510572587) and this is what I have got:

OldTrajNewUnit

As you can see, the plots look nicer now. It seems that the system behaves a bit differently, possibly because that underlying problem has disappeared. I am even wondering if the PID constants could now be improved manually a bit, and I will surely try to improve them again.

gergelytakacs commented 5 years ago

@PeterChmurciak

Hahaha, man, this is the best story I've read in days!:) Awesome find. Hats off, this is actually true engineering and troubleshooting, as weird as it sounds. I guess you (and I) will never forget to check for stickers when using lasers!:))) Nice, man, this is an awesome find and solution!:))) By the way, just feel free to remove the sticker, that's fine. You may even use some solvent to remove residue. Also the information about the quality of the ball is excellent, it is good to know. I think you are learning a lot about what (not) to do in R2.

The results you have obtained and shown in your last comment are fantastic, there is no need to be any better at this point! I actually think this is not going to be much better by simple PID, but more about that later. I really hope you do have the raw files available, because I am very happy with how they look. You may create a couple more experiments, try to mess around with PID constants, then this PID thing (for the Arduino API) at this point is as finished as it can be. Again, I must stress this, save raw data and plots, make notes. These are the first seriously presentable results.

Now that you have considerably improved the system, I think it's time to go back to gathering some open-loop identification data; that should be your next priority. So again, I'd like to see a plot with no control, output responding to input without saturation.

Good job, I'm proud of you!

Just some semi-random notes:

Rough short- to medium-term priority list:

PeterChmurciak commented 5 years ago

@gergelytakacs I was trying different PID constant combinations but was not able to get better results, so until we can come up with something better, they will stay like this.

I am currently trying to create the open loop identification example that would show response of system to changing input train, but I am having some trouble.

For example when setting the fan to 58% output while the ball is on the ground, the ball will lift up and then stop around 4cm from the ground and stay there. When setting the fan to 60% output while the ball is on the ground, the ball will lift up to around 4cm from the ground, and then very slowly, with several short stops, rise up all the way to the top. - The input train changes should be then pretty small.

Additionally, when setting the fan to 58% output while the ball is already somewhere around the middle of the tube, the ball will rise up with several short stops all the way to the top - the effect of fan power output on the ball is dependent on the ball current altitude. - The input will produce different results based on the current ball position.

I am therefore not sure how to go about this, as there should be no control, only input and output, while also not achieving position saturation. How do you think we could solve this?

gergelytakacs commented 5 years ago

@PeterChmurciak

PID

Okay, that's fine. There is a whole lot of things to try later, but for almost all of them we'll need a proper model. So that's why I told you to focus on gathering data.

ID Experiments

At this point I do not have a solid idea how you could extract good results. As I told you before, it is actually possible that this can't be realistically done. Theoretically the system is not open-loop unstable, but practically it just may be with all the currents and other chaotic phenomena. But, closed-loop identification might be troublesome later, at a stage when you are trying to fit the data to your model. (You have to include the controller in the dynamics as well! Then how do you separate the actual model from the controller? etc.) Nevertheless, we still may have to enter this unpleasant domain.... So once you say, this isn't going to work, I believe you, and we'll try to find other routes to get a usable model.

One thing you could try is to find a fan output with a reasonably good and stable % of power and see if the thing will stabilize itself there. Then, instead of doing jumps on the input (that look like the reference in the PID case) start to inject random noise onto the input signal. Experiment with its spectrum/distribution and standard deviation (variance) of its amplitude to see what's the largest noise you can inject there. Even if, say 5 s is stable, then it goes unstable - this data portion may prove useful later.
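The noise-injection idea above can be sketched as follows: pick an operating point, add Gaussian noise with a chosen standard deviation, and clip the result to the valid fan range. Everything specific here is an assumption made for illustration: the function name, the Gaussian distribution, the 58% operating point (borrowed from the observation earlier in the thread) and the 1% standard deviation.

```python
import random


def noisy_input(operating_point_percent, std_percent, n_samples, seed=0):
    """Generate an open-loop fan input: constant level plus Gaussian noise.

    Values are clipped to 0-100 % so the signal stays a valid fan command.
    A fixed seed makes the input train reproducible between experiments.
    """
    rng = random.Random(seed)
    return [
        min(100.0, max(0.0, rng.gauss(operating_point_percent, std_percent)))
        for _ in range(n_samples)
    ]


# Hypothetical settings: 58 % operating point, 1 % standard deviation,
# 2400 samples (one minute at 25 ms sampling).
u = noisy_input(58.0, 1.0, 2400, seed=1)
print(min(u), max(u), sum(u) / len(u))
```

Increasing `std_percent` step by step, and noting the value at which the ball leaves the measurable range, would give the "largest noise you can inject" mentioned above. Saving the seed with the raw data lets the same input be replayed later.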

Another weird thing you may want to take a shot at is to create a closed-loop PID control and attempt to use the recorded input from the PID controller as an open-loop input. This will probably not work (since the actual feedback is missing), but it might be worth giving it a shot.

the effect of fan power output on the ball is dependent on the ball current altitude

Exactly. This is nonlinearity in practice for you. If the system were stable, a single input level with some noise (small changes) would be satisfactory as data. Here, especially now that you know this, you should aim for the biggest change achievable in power while running open loop. Since the system is unstable in practice, this may be an impossible task.

Don't let this irritate you too much, if it's not gonna work then okay - you might actually be trying to do the impossible. Still, it's worth a shot at this stage.

PS: This conversation should be ideally in #167 actually.

gergelytakacs commented 5 years ago

@PeterChmurciak

[This is more connected to the original thread topic, not the above comment.]

Once you have decided upon a "reference PID response" in the Arduino API (see this comment) please try to test (and possibly compare) your MATLAB implementation by the same set of settings/sampling/references etc. This will be essentially again something presentable and "finished" and by this you can see if the MATLAB API works as intended. Once you have this, please post a screenshot of the response. If the responses (MATLAB/Arduino) are more or less the same, you can take this as something given and post me the raw data as well. We'll move on to other issues, the basic API part of the MATLAB is ready then...

PeterChmurciak commented 5 years ago

@gergelytakacs I agree with PS: in https://github.com/gergelytakacs/AutomationShield/issues/171#issuecomment-512203210, I will continue that conversation there.

I have somewhat finalised the PID examples, see 18f797f, and changed the trajectory a bit. The MATLAB example is exactly the same, the only difference from the Arduino example being the method used for sampling - interrupts/tic-toc.

The sampling used was 25 milliseconds with section length 2400 - that means one minute for one reference level to stabilise.

Result I got from Arduino example:

Arduino

Result I got from MATLAB example:

Matlab

The raw data: ArduinoMATLAB 17.7.2019.zip

As you can see, the results are pretty good while being very similar, almost identical.
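One hedged way to put a number on how close the two responses are is the root-mean-square difference between the logged outputs. The sketch below assumes both logs use the same sampling and reference trajectory and compares only the overlapping part; the function name and the sample values are made up and are not the data in the attached archive.

```python
import math


def rms_difference(response_a, response_b):
    """Root-mean-square difference between two equally sampled responses."""
    n = min(len(response_a), len(response_b))
    if n == 0:
        raise ValueError("both responses must contain at least one sample")
    squared = [(a - b) ** 2 for a, b in zip(response_a[:n], response_b[:n])]
    return math.sqrt(sum(squared) / n)


# Made-up ball positions in percent (illustration only).
arduino = [50.0, 51.0, 52.0, 53.0]
matlab = [50.0, 52.0, 52.0, 51.0]
print(rms_difference(arduino, matlab))
```

A value that is small compared to the size of the reference steps would support calling the two responses "almost identical". Because the ball dynamics are noisy, some difference is expected even between two runs of the same interface.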

Tell me if they are not good enough or you would like me to change anything.

gergelytakacs commented 5 years ago

@PeterChmurciak

These results are just perfect for the moment, I think we should move on to other tasks now. We shall get back to control issues after R2 is done.

Only one quick thing to ask: Can you do a screenshot of the closed-loop process in the Arduino IDE Serial Plotter? This would be to illustrate that you don't need MATLAB at all to experiment. Now, I know that you can't plot it like these (several minutes) so maybe if a nice transition fits into the screen, that's enough... Try to do a couple of those, so you don't have to get back to this anymore. (PNG format, try ~4:3 ratio (doesn't have to be exact) and preferably large.)

PeterChmurciak commented 5 years ago

@gergelytakacs Sure thing, I have cropped them so it is visible that I am using the Arduino IDE Serial Plotter. I have tried to make them 4:3 while also using the biggest resolution and screen portion I could. I have done two sets of screenshots - one forcibly uses the scale 0 to 100 and the other lets the Serial Plotter adjust the scale automatically. In the automatically scaled one the jumps look bigger, and it is overall more dramatic, while in the forcibly 0-100 scaled one, the jumps are mild, but more authentic.

Arduino PID Screenshots 18.7.2019.zip

Have a look at them and tell me whether they are good enough or whether I should do something differently.

gergelytakacs commented 5 years ago

@PeterChmurciak

These are excellent, thank you! I prefer the auto-scaled, there are good examples there.

Good news! The Arduino IDE and MATLAB PID examples are done!;) I suggest you continue with system identification and / or the Simulink API (as mentioned in other threads).

Noooiiiice work!;)

gergelytakacs commented 5 years ago

This is pretty much closed for now, except for identification though.