verify and show demonstration of WEC feedback controller

ryancoe commented 3 years ago

Compare the applied torque (or current) with the gain-weighted sum of position and velocity; check that

tau = kp*pos + kd*vel
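A minimal Python sketch of this check, assuming the logged position, velocity, and measured torque are available as equal-length lists. The function names and sample values are hypothetical, chosen for illustration only:

```python
def pd_torque(kp, kd, pos, vel):
    """Expected command torque from the PD law tau = kp*pos + kd*vel."""
    return kp * pos + kd * vel


def pd_residuals(kp, kd, pos, vel, tau_measured):
    """Per-sample difference between measured torque and the PD prediction."""
    return [tm - pd_torque(kp, kd, p, v)
            for p, v, tm in zip(pos, vel, tau_measured)]


# Made-up illustrative samples, not data from the experiment.
pos = [0.00, 0.10, 0.20]
vel = [1.00, 0.50, 0.00]
tau = [0.02, 0.03, 0.04]
residuals = pd_residuals(kp=0.2, kd=0.02, pos=pos, vel=vel, tau_measured=tau)
```

If the controller follows the PD law, the residuals should stay within the sensor noise.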

@nickross4444 and Anwi collected data for this previously

The results will be included in Section 7 of the paper

ryancoe commented 2 years ago

Time history plot with three axes:

position
velocity
torque

Explain in text that the torque follows tau = kp*pos + kd*vel

nickross4444 commented 2 years ago

These are the resulting plots. The error is calculated from the amperage reported by the motor controller and the expected amperage based on position and velocity read from the encoder. The high error in the high P range is not concerning, since the P gain may just be too high, but the rest is. I am currently checking whether filtering the data properly clarifies the issue.

ryancoe commented 2 years ago

@nickross4444 - Thanks for posting. Could you put the data and plotting code up in this issue as well?

nickross4444 commented 2 years ago

Most recent files: PDverification.zip

ryancoe commented 2 years ago

@dforbush2 @gbacelli - Could you take a look at this and advise @DeepFriedDerp @nickross4444 how to proceed?

dforbush2 commented 2 years ago

Last update I heard is that filtering did not solve the problem. I was wondering if the jump in current set-point at the gain switches was biasing the error plot high, because the instantaneous error at that point would be large with no time for the electrical dynamics to catch up. That doesn't look to be the problem anymore. I will take a look in detail tomorrow, but generally poor torque tracking suggests that the internal torque control loop in the motor drive is not working well. Not sure if this is something you are tuning or if the parameters the drive is set up to use are just poor.

ryancoe commented 2 years ago

May also be good to recall the findings from the motor torque constant efforts (#59)

https://github.com/SNL-WaterPower/siweed/issues/59#issuecomment-863504663

dforbush2 commented 2 years ago

Did a deeper dive today and some observations:

% error dimensions as presented didn't quite line up right: it's actually a lot worse than displayed, but read on
I wasn't clear about how to implement that filter (sorry @nickross4444), so there was some phase-shift introduced that actually made the error worse in some cases. I used filtfilt() instead, but this doesn't help much: there are only about 10 samples per period (see point of concern, below) so there isn't a lot of resolution to filter within.
Some of the largest torque-error peaks occur right when the gain changes, amounting to an instantaneous change in command torque for which we would expect a large error, as suggested previously. But this does not explain all peaks or even the largest of them. A bigger reason is that some of the desired torque values are very close to zero. So when you take the percent error, even if you are off by a functionally tiny amount (i.e., within sensor noise), it shows up as a large percent error. I recommend using another statistical error measure: the "right" answer is up for debate.
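To illustrate the point about percent error near zero, here is a small Python sketch. The offset and command values are hypothetical; only the trend matters:

```python
def percent_error(command, measured):
    """Percent error relative to the command; undefined when command == 0."""
    return 100.0 * abs(measured - command) / abs(command)


def absolute_error(command, measured):
    """Absolute error in the same units as the torque."""
    return abs(measured - command)


offset = 2e-5  # hypothetical fixed measurement offset
for command in (5e-4, 5e-5, 5e-6):
    measured = command + offset
    # Absolute error stays constant while percent error grows as command -> 0.
    print(command, absolute_error(command, measured),
          percent_error(command, measured))
```

The same small offset gives a percent error that grows without bound as the command approaches zero, while the absolute error stays fixed.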

torqueErrPerc

Here is a plot of absolute error compared to an estimate of sensor noise, estimated when the command torque is zero.

torqueErrAbs

Note that the weaker, but still present, trend in absolute error near zero is probably related to the issue @ryancoe noted: for the cases where Kp = 0 and Kd ~= 0, when the command torque is near zero (i.e., velocity = 0), the measured torque may not be, due to static friction. However, if this were a dominant effect, I would expect the absolute errors for [Kd ~= 0, Kp = 0] to be worse near zero than those for [Kp ~= 0, Kd = 0], where the zeros of command torque do not align with the zeros of velocity.

As seen in the plot here, though, this doesn't totally account for it: there are also larger errors associated with velocity maxima when Kp gains are non-zero. These seem more related to a phase-lag issue in the controller, as the change in command will be especially large due to the high velocity in these cases. These errors correspond to the highest-amplitude torque commands in this data set.

All in all, I am not hugely alarmed by the torque tracking of the drive considering the high impact of friction on the measured torque value. But it can be improved: in particular there is a noticeable lag (~15 deg) in the larger amplitude torque commands. Do we have a way to adjust the internal torque tracking loop of the motor drive?

One point of concern:

Before the controller even moves off of zero gain, the logged time stamp interval basically doubles: does this imply that the control loops in question have had their sample rates halved? In the pic, the bottom data tip is the last time the control gains are both zero.

dforbush2 commented 2 years ago

Notes from a discussion:

the bottom plot is just the GUI time-stamp: this does not affect the torque control loop which runs at a very nearly constant rate (~100 Hz).
the Arduino reads the sensor values synchronously, i.e., at the same instant. Although the reported time stamps may not be the actual instant they were read, both the command torque and sensor torque will be from the same instant.

In short, neither of my leading theories above is the actual problem.

TO-DO @dforbush2 :

parse the absolute error plot and remove the points whose error we can "explain": points below the noise floor, those occurring in the moments after a step change in gain setting, and those that are potentially out of range of the motor torque. Hopefully, any remaining trend will be more illuminating.

TO-DO @nickross4444

on the hardware: try and get as consistent and fast a sample time in the logged values (slight prioritization of consistency over speed) and then perform the following tests:
1) Zero --> step change to positive torque --> zero, to reset --> step change to negative torque. Repeat, changing amplitudes each time. Try to get ~6 different amplitudes, the largest of which exceeds torque limit.
2) Zero --> step change to positive torque --> step change to negative torque --> zero. Repeat, changing amplitude of step size. Repeat again changing step order (i.e., negative to positive).
3) Zero --> ramp torque over ~15 s to a large, but below limit value --> zero. Repeat with oppositely signed torque.

Hopefully this will give us some more information! Leading hypothesis now based on data we have seen: the motor torque slew rate is too slow to provide good tracking for the larger amplitude oscillations. Whether that is a hardware limit or something that is a dynamic introduced by its controller remains to be seen.

nickross4444 commented 2 years ago

@dforbush2 Here are the test results. I believe test 3 ended up exceeding the torque saturation limit, but this can at least help us characterize exactly where that is. Open loop data.zip

ryancoe commented 2 years ago

Can we look at a plot of commanded torque vs. actual torque? Should look something like this. @stespenc83 did something similar in #59.

dforbush2 commented 2 years ago

Hi all,

readTestData.txt

Please see the attached script to run this processing. I wish I knew more of what to make of it... but there is some weirdness. Rather than attach a ton of figures, I'll ask that you run the attached script and make the plots. Change the extension to .m and extract the files from the previous post.

Some observations:

The effect of static friction is evident in all of the step tests.
The sustainable torque limit is 3.5e-3 N-m. The higher the torque over this limit, the lower the time it can be maintained.
There's some asymmetry in the static friction. The alternating steps show this most significantly at the largest steps.
In the ramps, one can see that there is consistently about a 2 to 3 sample delay before the measured torque matches the torque command. This aligns with the sine wave tracking ability in the samples noted above.
The diff(sample) is now much more consistent, at about 32.
The tracking error doesn't seem to be a function of step size, but tends to increase with larger torque ramp rates.

To my eye: if we have the ability to adjust the internal torque tracking loop, these, in combination, suggest that we should increase the integral gain of this control loop.
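A fixed delay alone would give a ramp-tracking error that grows with ramp rate but not with step size, which is one possible reading of the observations above. Here is a small Python sketch of that idea, assuming a pure delay model (an assumption for illustration, not a confirmed model of the drive). The sample period and ramp rates are made up:

```python
def delayed(signal, delay_samples):
    """Return signal delayed by delay_samples, holding the first value."""
    if delay_samples == 0:
        return list(signal)
    return [signal[0]] * delay_samples + list(signal[:-delay_samples])


def ramp(rate, n_samples, dt):
    """Linear ramp starting at zero with slope `rate` per second."""
    return [rate * k * dt for k in range(n_samples)]


dt = 0.01      # assumed sample period in seconds
delay = 2      # assumed delay in samples
for rate in (1e-4, 2e-4, 4e-4):   # hypothetical ramp rates in N-m/s
    cmd = ramp(rate, 50, dt)
    meas = delayed(cmd, delay)
    steady_error = cmd[-1] - meas[-1]
    print(rate, steady_error)      # close to rate * delay * dt
```

Under this model the error during a ramp settles at roughly rate * delay * dt, so doubling the ramp rate doubles the error.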

nickross4444 commented 2 years ago

@dforbush2 Is there a graphical way for us to confirm that the PD controller is behaving as expected, but with a 2-3 sample delay? If so, that is probably good enough for documentation in the paper. I dug a little and didn't see any way we could control the fine parameters of the control loop within the motor controller.

dforbush2 commented 2 years ago

"As expected", in my view, does have some delay, but maybe not 2-3 samples worth. Another explanation here is that there is some compliance and inertia in the system between the motor and the load cell that account for the apparent delay (this would need to be a relatively large inertia and a relatively small compliance, based on the behavior seen in the step tests).

Plot command torque vs time and measured torque vs time on the same axis for some of the ramp tests, or even some of your (non-saturating) sinusoids. You could maybe show a third plot that is plot(time(1:end-2),measured_torque(3:end)) which will plot the measured torque lagged by 2 samples, which should be close to command torque. I'd also be curious what a plot of average error (just error, not percentage) during ramp vs. ramp rate looks like.
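The shifted comparison described above can be sketched in Python as follows. This is an illustrative stand-in, not the attached script, and the ramp data is synthetic with a known 2-sample lag:

```python
def shifted_errors(command, measured, shift):
    """Errors measured[k + shift] - command[k] over the overlapping samples."""
    n = len(command) - shift
    return [measured[k + shift] - command[k] for k in range(n)]


def mean_abs_error(command, measured, shift):
    """Average absolute error after aligning measured torque by `shift`."""
    errs = shifted_errors(command, measured, shift)
    return sum(abs(e) for e in errs) / len(errs)


# Synthetic ramp with a known 2-sample lag (illustrative, not test data).
command = [0.1 * k for k in range(20)]
measured = [0.0, 0.0] + command[:-2]
best_shift = min(range(5), key=lambda s: mean_abs_error(command, measured, s))
```

Pairing `measured[k + shift]` with `command[k]` mirrors the `measured_torque(3:end)` versus `time(1:end-2)` indexing for a shift of 2.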

nickross4444 commented 2 years ago

Here is the resulting plot, which to me looks like we have a two-sample delay. There are some outlying error points not shown, caused by the step change. I think there might be something else going on too causing the error, but I'm not sure. Here is the script used to generate these plots: SampleDelayTest.txt

dforbush2 commented 2 years ago

Great. Is there a way to indicate the noise floor of the sensor in the right hand plots? You can look at torque measurements over an interval where the torque command was zero. Looking at a min/max or some percentile range (5% to 95%, for example) would give us a good idea of the noise floor. Points that fall within that band are explainable due to noise, points that fall outside of it are worthy of attention. Be good to look in both directions, too, (i.e., the downward torque ramp) to ensure that the behavior we are seeing is symmetric.
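A rough Python sketch of the percentile-based noise band, assuming `zero_command_torque` holds torque samples logged while the command was zero. The helper names are hypothetical, and percentile conventions differ between tools, so the values may not match another implementation exactly:

```python
def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


def noise_band(zero_command_torque, lower=5.0, upper=95.0):
    """Return (low, high) bounds containing the middle of the noise samples."""
    return (percentile(zero_command_torque, lower),
            percentile(zero_command_torque, upper))
```

Errors inside the returned band are explainable by noise; errors outside it deserve a closer look.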

nickross4444 commented 2 years ago

I used this percentileCalc.txt script to find the noise band (middle 90%) to be: min: -3.9781e-05, max: 4.3755e-05

I'm not sure why these aren't symmetric. Plotting these values as lines with the previous script (I'll attach the updated version) gives the following:

SampleDelayTest.txt

The other test runs look similar. I also had the script count how many of the data points lie within the noise band, for each shift tested, which gives these results:

Shift amount	% of points in noise band
0	78.1627
1	79.5311
2	80.0028
3	78.1685
4	76.0068

This isn't a great test since it counts all points, including the long sections of 0 command before and after tests, but it does show this:

a shift of 2 samples is numerically the most accurate discrete expression of the system
non-zero commands increase the error rate outside the noise floor (otherwise we would expect 90%, the threshold the noise band was set with)
more than 10% of non-zero torque data points fall outside of the noise band. In other words, some factor other than noise or sample delay is causing >10% of samples to track inaccurately. By how much I have not characterized. Either that, or all points are offset slightly, pushing >10% outside the noise band in addition to the expected 10%
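The per-shift count can be sketched in Python like this. It is a hypothetical helper rather than the attached script, and the band limits are passed in by the caller:

```python
def fraction_in_band(command, measured, shift, band_low, band_high):
    """Percent of aligned samples whose error lies inside the noise band."""
    n = len(command) - shift
    inside = sum(
        1 for k in range(n)
        if band_low <= measured[k + shift] - command[k] <= band_high
    )
    return 100.0 * inside / n
```

Restricting `command` and `measured` to the non-zero-command portion of a run before calling this would avoid the long zero sections dominating the percentage.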

@dforbush2, please let me know if this interpretation seems correct, and what the next action should be.

ryancoe commented 2 years ago

Plot with four axes: position, velocity, gains, torque (both commanded and actual)
Characterize noise by statistical analysis of zero-command torque data:
- Plot a histogram
- Find the mean (hopefully zero) and standard deviation
Cross-correlation of commanded and measured torque to find delay (should be the same two samples that you've found), see, e.g., https://www.mathworks.com/help/signal/ug/cross-correlation-of-delayed-signal-in-noise.html?searchHighlight=finddelay&s_tid=srchtitle_finddelay_2
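A small Python sketch of the two analyses in this list: mean and standard deviation of the zero-command samples, and a delay estimate from cross-correlating command with measured torque. This is a simplified stand-in for a library delay finder; it only searches non-negative lags, on the assumption that the measurement lags the command:

```python
from statistics import mean, stdev


def noise_stats(zero_command_torque):
    """Mean and sample standard deviation of the zero-command torque."""
    return mean(zero_command_torque), stdev(zero_command_torque)


def estimate_delay(command, measured, max_lag):
    """Lag (in samples) that maximises the mean-removed cross-correlation."""
    c_mean, m_mean = mean(command), mean(measured)
    c = [x - c_mean for x in command]
    m = [x - m_mean for x in measured]

    def xcorr(lag):
        # Sum of products over the samples that overlap at this lag.
        return sum(c[k] * m[k + lag] for k in range(len(c) - lag))

    return max(range(max_lag + 1), key=xcorr)
```

For example, a pulse command and a copy of it delayed by two samples should give an estimated delay of 2.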

nickross4444 commented 1 year ago

Finished in previous comments
It seems the mean of the noise is not zero. Commanded torque is read through an analog output, if I remember correctly. My best guess is that there's a small DC offset in the generation or reading of the analog signal. This could potentially be calibrated out. These are histograms of the commanded torque when the command should be 0 (either by PWM commanding the duty cycle that corresponds to 0 torque, or by the enable pin being low; both data sets individually show similar results). The means are 2.3676e-05 and 2.0934e-05, and the standard deviations are 1.6564e-05, 1.7211e-05, respectively. Units are meters.

These were found with this script: pdverification.txt

The methods of cross correlation, finddelay(), and the method previously described in this issue all agree that the data set of steps and ramps has a 2-3 sample delay, and that the kp steps and kd steps data sets have no sample delay. I think it will be best to check the sample delay on the larger data set we will have access to next week, from the demo. I have a very loose theory that the delay was caused by a checksum issue that was fixed.

ryancoe commented 1 year ago

...The means are 2.3676e-05 and 2.0934e-05, and the standard deviations are 1.6564e-05, 1.7211e-05, respectively. Units are meters.

Shouldn't the units for the x axes of those histograms be Nm (we command tau=0 and then look at the actual value of tau)? Once we get straight on that, let's find a way to put those numbers in context relative to some nominal torque value (i.e., 2e-5 is very small, but it really depends on what our typical torque levels are -- maybe compare to max torque or RMS torque from a typical test).

...I think it will be best to check the sample delay on the larger data set we will have access to next week, from the demo. I have a very loose theory that the delay was caused by a checksum issue that was fixed.

Sounds good

nickross4444 commented 1 year ago

You're totally right, the units should be Nm. Good news is my theory was right: newer data sets show no measurable delay, so our max loop delay is at most 1/sampling frequency = ~3 ms (probably much better than that in most cases). Our typical torque levels are on the order of 1e-4 Nm. Here is a similar histogram of an entire test, not just the zero commands. Max torque in this case was almost 9e-4. RMS torque was 3.1716e-04. This is a histogram of the same sample set, but filtered to only the zero commands:
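To put the offset in context relative to typical torque levels, here is a short Python sketch. It uses values quoted in this thread, but pairing them is an assumption, since they may come from different data sets:

```python
import math


def rms(values):
    """Root-mean-square of a sequence of torque samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def relative_to(reference, value):
    """Express value as a percentage of a reference torque level."""
    return 100.0 * value / reference


# Values quoted in this thread (N-m); pairing them is an assumption,
# since they may come from different data sets.
offset_mean = 2.3676e-05
rms_torque = 3.1716e-04
max_torque = 9e-04   # "almost 9e-4"
print(relative_to(rms_torque, offset_mean))  # offset as % of RMS torque
print(relative_to(max_torque, offset_mean))  # offset as % of max torque
```

With these numbers the offset comes out to roughly 7.5% of the RMS torque and under 3% of the near-maximum torque.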

ryancoe commented 1 year ago

Looks great! Can you add this to the paper and then close the issue?

SNL-WaterPower / siweed

verify and show demonstration of WEC feedback controller #102