laberning / openrowingmonitor

A free and open source performance monitor for rowing machines
https://laberning.github.io/openrowingmonitor
GNU General Public License v3.0

Sandbox - Engine_Validation.md #70

Closed Abasz closed 1 year ago

Abasz commented 2 years ago

Hi, I've been reading your engine validation document as I am implementing a rowing monitor for a microcontroller that would be used on an air rower that is rather similar to concept 2 model D. Very good work indeed!

I can see that you decided to use a moving average for the Drag Factor, which makes sense in general; it was my first idea for smoothing volatility as well. However, when I tested a Concept2 with a PM5, I noticed that when you change the damper on the fly (i.e. while in a rowing session), the new, seemingly correct, DF is shown almost instantly on the next stroke (in some cases a few strokes were needed). So I assumed that it cannot use any real moving average, as that would not enable such a quick change to the DF.

So I was researching this topic and came across an article (in a complicated way) that quotes a C2 employee stating that their PMs change the drag factor instantly (contrary to early PMs that needed time to change). From this I concluded that a moving average must not be used, as with one in place DF changes would not propagate quickly.

Have you ever tried changing the damper while rowing to see what that does to the DF calculation on the PM5 and ORM?

One additional note: I am not sure whether you already know this, but the PM5 does not calculate the DF after a free spin of more than 6 seconds; rather, it resets itself to a default DF. This enables this hack (and here is the explanation, with a quoted statement from C2). Based on my testing this may be due to the non-linear nature of the flywheel deceleration. On my testing gear (which is not a C2, so the issue may come from that fact) I noticed that if I use a long deceleration, the DF becomes huge. Also, when I did calculations on different points/ranges of the deceleration currentDts, the slower they were, the worse the data was (a slower flywheel seemingly having a higher DF). I am not a physicist, so I am not sure whether the drag coefficient should be constant for air on such a machine; my results may be due to the higher impact of bearing drag at low speed, to the rower's build quality, or to the air flow in the machine actually being different at slower speeds (in the latter case a DF filter should include a minimum rotation speed, for example, to avoid inconsistent measurements).

So my second question: have you ever tried to calculate a DF on your C2 with ORM over a period longer than these 6 seconds? The reason I am asking is that you propose using linear regression for the DF calculation, but if the measured curve is non-linear at slower rotation speeds, that would in my view invalidate this method.
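As a side note on why a linear fit is plausible at all (a standard derivation, not something taken from the validation document): if the only torque on the flywheel during the recovery is pure air drag, then

$$ I\,\frac{d\omega}{dt} = -k\,\omega^2 \quad\Longrightarrow\quad \frac{d}{dt}\!\left(\frac{1}{\omega}\right) = \frac{k}{I} $$

so $1/\omega$ is linear in time and the drag factor $k$ follows from the slope. Adding a roughly constant bearing-friction torque $c$ changes this to

$$ \frac{d}{dt}\!\left(\frac{1}{\omega}\right) = \frac{k}{I} + \frac{c}{I\,\omega^2} $$

a term that grows as the flywheel slows down, which would show up exactly as the "higher DF at lower speeds" described above, and as a loss of linearity in the tail of a long spin-down.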

Please let me know if you have any questions or comments.

Thanks Abász

JaapvanEkris commented 2 years ago

Hi Abasz,

The sandbox doesn't contain most modifications yet, as backporting it to Lars's branch is quite complex (several modifications have quite subtle interactions and Lars changed code as well) and I'm still testing and improving the code. For example, MovingFlankDetector now represents the entire Flywheel metrics (creating Flywheel.js), but I still have to clean some code to bring it in line with its current dominant use.

I haven't tested changing the damper setting on the fly yet. It is a test I want to do, but first I want to make sure that all dependent calculations are correct. Open Rowing Monitor's setting dampingConstantSmoothing (or dragfactorSmoothing, as it should be called) determines how long the running average is; setting it to 1 works and should result in the behaviour you described. Personally, I would add some smoothing though, as the dragfactor does move a bit. A DF is quite static: it depends on the damper setting, temperature, moisture, etc., so sudden changes (except someone playing with the lever) don't happen that often. I'm not sure how stable the DF and associated metrics will stay with a dragfactorSmoothing of 1, but it is an interesting experiment nonetheless. As I can rerun all tests, I can simulate the behaviour with those settings quite easily.
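For illustration, the running-average idea can be sketched as follows (a minimal sketch; the class and names are mine, not ORM's actual implementation):

```cpp
#include <cstddef>
#include <cassert>
#include <numeric>
#include <vector>

// Illustrative sketch of the dragfactorSmoothing idea: keep the last N
// accepted drag factors and report their average. N == 1 disables smoothing.
class RunningAverage {
public:
    explicit RunningAverage(std::size_t length) : maxLength(length) {}

    // Add a newly accepted drag factor, return the smoothed value
    double push(double value) {
        values.push_back(value);
        if (values.size() > maxLength)
            values.erase(values.begin());  // drop the oldest sample
        return std::accumulate(values.begin(), values.end(), 0.0) / values.size();
    }

private:
    std::size_t maxLength;
    std::vector<double> values;
};
```

With a length of 1 every accepted drag factor passes through unchanged, which should reproduce the instant-update behaviour of the PM5; larger lengths trade responsiveness for stability.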

The 6 seconds on a PM5 result in a general pause, indeed resetting some metrics. In my new implementation, the Flywheel stops maintaining metrics (including the dragfactor) as soon as a pause starts. Lars' version of OpenRowingMonitor maintains 10 seconds by default (setting maximumStrokeTime), but I suggest maintaining 6 seconds, specifically for the reason you give: a spinning flywheel starts to deviate from linear behaviour after longer times.

The huge benefit of linear regression is that it provides a "goodness of fit" indicator. This can be used as a flag for slopes where the fit with the data isn't strong enough. As Nomath has shown (see https://www.c2forum.com/viewtopic.php?t=194719), the flywheel behaves quite linearly, so the chance of getting a good dragfactor quickly is high. My practical experience confirms that. But if the data starts to lose fit, the newly proposed dragfactor will be rejected (hence the many tests). The change in dragfactor typically comes when the speed is so low that other forms of resistance become more dominant (for example the magnetic drag from the sensor, which is a constant force).

Abasz commented 2 years ago

I did some high-level initial testing of the linear regression model (only in Excel so far, on selected curves of real data from the same exercise), and the results are actually pretty good!

My biggest surprise was that, for the few curves I tested (the selection included different stroke lengths, power levels, a full spin-down, etc.), the regression approach was not just less prone to being poisoned by errors from including the edges of the previous and next drive phase (and by residual power from the drive that is not sufficient to cause acceleration but is still present); it also confirmed my hypothesis about which range of the curve should be used (and how much including the edges can poison the DF calculation). Actually, if the middle of the range can be used, I expect that on my system the two methods may not differ significantly (the difference between the standard deviations is not that significant). But if I am not able to clean the DF figures substantially (i.e. some values from the edges are included), the start-end velocity approach starts giving poor results. This may change if more magnets are added, but I am not sure.

I believe that my reed switch and magnet, combined with an interrupt service routine, yield quite good and clean currentDts (apart from the bounce of the reed switch, which needs cleaning), since in all cases the R² is above 99.9%, even after 18 seconds of spin-down (after that things start getting pretty bad). What I still need to confirm or decide is (i) whether adding more magnets could yield cleaner data at the edges of the phases (probably the answer is yes), and (ii) whether more magnets would make the data less clean in general (e.g. due to imperfect placement of the magnets). My conclusion is that with linear regression more magnets are not really necessary (but if that is the case, why would Concept2 switch to a 12-pole magnet from the original 3 (apart from the power generator, of course), effectively having 6 impulses per rotation instead of 3?).

In addition, on the tested curves I was able to identify that 6 seconds (probably +/-1) is a good cut-off time. I attached an Excel spreadsheet if you are interested: RowingMonitor-LinearRegression vs. Start-End Velocity.xlsx

I had one concern: a regression calculation over a significant number of items could take up material processing time on a microcontroller (especially with a huge array of data). So having such a thing in an ISR function seems a bad idea (I haven't tested the speed yet, so this is just my starting hypothesis). ISRs should finish within the microseconds range (i.e. definitely below a millisecond). For instance my ISR, if the initial debounce filter passes, generally finishes within 100 microseconds (acceptable), but in certain cases it needs 500 microseconds (which is pretty bad, hence the less robust filters)…
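One common way to keep an ISR within budget is to do nothing in it beyond storing the timestamp delta, deferring filtering and regression to the main loop. A minimal sketch in plain C++ (illustrative only, not the firmware discussed here; on an ESP32 the handler would additionally be marked IRAM_ATTR and attached with attachInterrupt()):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t BUFFER_SIZE = 64;

// Ring buffer filled by the ISR, drained by the main loop
struct ImpulseBuffer {
    std::array<uint32_t, BUFFER_SIZE> deltas{};
    volatile std::size_t head = 0;
    uint32_t previousTime = 0;

    // ISR body: one subtraction, two stores, one increment
    void handleInterrupt(uint32_t nowMicros) {
        deltas[head % BUFFER_SIZE] = nowMicros - previousTime;
        previousTime = nowMicros;
        head = head + 1;  // the consumer reads entries up to head outside the ISR
    }
};
```

With this split, even the regression over a full recovery can run at leisure in the main loop without risking missed impulses.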

So if a PM were using such an approach, it would need a relatively powerful microcontroller with an efficient floating-point unit! I have done some research on what the PMs could use. To my surprise, I found this site. According to it, the microcontroller being used in new PMs could be considered quite powerful in microcontroller terms (nearly 170 MHz, compared to the ESP32's 240 MHz). SRAM and Flash could also be sufficient. Of course these guys are also better/more efficient programmers than me, so I suppose they can achieve good optimisations for the ISR 😊. At some point I will test how long the regression calculation takes on different sizes of data arrays on the ESP32, to see whether a linear regression method can in theory be used by the PM, or whether the less robust but less resource-heavy start-end angular velocity approach is the more likely one being applied.

JaapvanEkris commented 2 years ago

I suspect Concept2 switched to a 12 pole magnet for three reasons:

The processing time for the linear regression isn't much. When you look at the code of LinearRegressor.js, you see that all you need to do is update some running sums for each impulse in the recovery phase, and then you can calculate the slope. OLS is quick and dirty, reasonably robust, but above all fast. Higher quality algorithms are more time-consuming and could kill the entire rowing engine's performance. But OLS is pretty quick.
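The running-sums formulation can be sketched like this (an illustration of the technique; LinearRegressor.js's actual interface differs):

```cpp
#include <cassert>

// OLS via running sums: each impulse adds O(1) work, and the slope and
// R^2 fall out of five accumulated sums when the drag factor is needed.
class OLSRegressor {
public:
    void push(double x, double y) {
        n += 1; sx += x; sy += y;
        sxx += x * x; syy += y * y; sxy += x * y;
    }
    double slope() const {
        return (n * sxy - sx * sy) / (n * sxx - sx * sx);
    }
    double goodnessOfFit() const {  // R^2 = squared correlation coefficient
        double cov = n * sxy - sx * sy;
        return (cov * cov) / ((n * sxx - sx * sx) * (n * syy - sy * sy));
    }
private:
    double n = 0, sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
};
```

Each impulse costs a handful of additions and multiplications; slope() and goodnessOfFit() are only evaluated once per stroke, when the drag factor is recalculated.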

Abasz commented 2 years ago

After looking at your code I implemented (ported, rather) the regression algorithm and ran a real exercise's data through it (5x2 min intervals at a 2min/500m pace with 2 min rest, plus warmup and cooldown). For some reason I originally thought that the regression algorithm should keep an array of the currentDts and then do an accumulation at DF calculation time (which would be a one-time computation expense). I did not think of implementing this as running sums, which spread the computation across the individual rotations. So you are absolutely right that it's not expensive.

The results of the test are pretty good, even with no running average (just filtering based on degreeOfFit). The standard deviation of the regression model is much better than that of the start-end velocity approach, even when using only the middle of the recovery phase (an approach which, as you said, puts strain on the robustness of the currentDt filtering algorithm).

What I noticed is that there are increases in the DF (below is a chart comparing the regression and start-end velocity approaches). These occur at the times when I stopped for rest (complete spin-down of the flywheel) and then rowed slowly (minimum effort, low stroke rate). The increases are present in both the regression and the start-end velocity results. The interesting part is that the degreeOfFit on these slowly rotating curves with the increased DF is still above 0.9996 (that was the degreeOfFit filter I applied); some were even around 0.9999. Also, looking at the chart, the warmup (which is at a slower pace) has a slightly higher average DF, while the intervals have a lower one. So I concluded that these DF values should be correct, and either the air flow changes at slower spin rates, creating higher drag at the same damper setting, or other sources of drag (unbalanced flywheel wobble, the magnet for the sensor as you mentioned, bearing drag, etc.) increase their effect (the latter is more likely). Thus, in my view there is not much to do about these (e.g. no need to smooth them out): if the drag is higher, this should simply be reflected in power, distance, etc.

[image: comparison chart of regression vs. start-end velocity drag factors]

JaapvanEkris commented 2 years ago

Thanks for the feedback.

A stupid question: do you use the filtered (clean) or the raw CurrentDt? Because in my testing, the cleaned data had extremely high degrees of fit, which made it impossible to filter out bad data. When I used the raw (unfiltered) currentDts, the degree of fit was much more useable as a filter criterion.

Abasz commented 2 years ago

The short answer is that I use the filtered values. But let me show why this does not matter in my case. First you need to understand the filter mechanism (which is rather simple):


```cpp
// ISR fragment; "now" is the interrupt timestamp in microseconds
auto currentRawDeltaTime = now - previousRawRevTime;
previousRawRevTime = now;

// Stage 1: debounce filter for the reed switch
if (currentRawDeltaTime < Settings::ROTATION_DEBOUNCE_TIME_MIN * 1000)
    return;

auto currentCleanDeltaTime = now - previousCleanRevTime;

auto deltaTimeDiffPair = std::minmax<volatile unsigned long>(currentCleanDeltaTime, previousDeltaTime);
auto deltaTimeDiff = deltaTimeDiffPair.second - deltaTimeDiffPair.first;

previousDeltaTime = currentCleanDeltaTime;

// Stage 2: disregard rotation signals that are not sensible (the absolute
// difference between the current and the previous delta exceeds the current delta)
if (deltaTimeDiff > currentCleanDeltaTime)
    return;
```
Short explanation:

  1. A simple debounce: its purpose is to filter the reed switch bounce (reed switches work in a way that the switch bounces back and forth a couple of times before settling, and the rotation interrupt is re-triggered within a millisecond or two). Here I use the time (in microseconds) when the interrupt was hit, and if it came too quickly I disregard it.

  2. A comparison of the difference: the idea is that I first create a currentDt based on the time since the last valid rotation. Then I check the difference between this currentDt and the last valid currentDt (cleanDeltaTime and previousDeltaTime in my code), and if that difference is greater than the currentDt itself, something must be wrong (some residual bounce). Otherwise I accept the value as valid.

The potential issue with this is that the algorithm never discards any time: if a raw currentDt is disregarded, it is added to the next one. With this it could happen (and factually does, but this is an accepted limitation that is dealt with in another way) that in the drive phase two rotations are counted as one, if someone makes a very strong stroke (and the decrease of the currentDts during the flywheel acceleration comes in very big steps). This cannot be an issue for the recovery, as the deceleration is always gradual and follows an exponential curve on my machine. However, this limitation is factually not an issue, for two reasons:

  1. I do not use the drive phase's individual currentDts for anything; only their total time is relevant, and that remains correct (the issue of merging two rotations into one only occurs when the flywheel is very slow or stopped and then quickly accelerated with a very hard pull);
  2. at some point in the middle/end of the drive phase there is a self-correction (when the decrease of the currentDts becomes less steep), so by the end of the drive, when the data becomes relevant for the stroke detection mechanism, the currentDts will be clean and quite good.

For this to work it is important that the incoming data is clean. Based on the raw data I had, the sensor with only one magnet (so no inaccuracies from the magnets' relative positions or slight variations in their magnetic effect on the sensor) and the ESP32 produce fairly clean data (apart from the initial bounce). Of course a Hall effect sensor would probably be better (as used by Concept2), but the issue with those is that it is hard to find a good one that operates on as low a supply as 3-5 volts (or at least I was not able to find one). In the beginning I experimented with those, but I failed. I concluded that a higher quality reed switch (with max 400 Hz switching capability and good resistance against shaking) is a much, much simpler solution. Also, my device works from a battery, and reed switches do not consume power (not that a Hall effect sensor uses too much, but still more than zero).

So, back to your question about using the raw currentDts. Below you can find a chart of a couple of strokes of fully unfiltered raw currentDts (i.e. without even the stage 1 filter). You can see that the data looks good, but is poisoned by the bounce from the reed switch: [image: fully unfiltered raw currentDts]

Below is a chart after applying the first-stage filter (with the debounce set to 15 milliseconds): [image: currentDts after the stage 1 debounce filter]

You can see that at high rotation speeds there is no issue; the curves are nice. When the flywheel slows down there is residual bounce again, but the data is already very clean with just a simple first-stage debounce filter.

In my opinion this means that my setup produces clean data by default, and there should be no difference when using the raw values for the regression. This may change, of course, if I add magnets. In my view, the primary source of the clean data is that I have only one impulse per turn. This means that the potential noise caused by imperfect placement of the magnets (i.e. one being slightly further from the other, so that one signal is systematically longer and the next one shorter), and by my sensor not switching on at exactly the same point as the magnet passes (I assume there can be a few, 10-20 or maybe more, microseconds of difference), becomes negligible, as it only occurs once per rotation and does not accumulate.

To show this, I actually disassembled my flywheel and created a stencil for 6 magnets with a cutting plotter in cardboard (a 3D-printed magnet holder would probably be much better, I might try that one :)) to help with the accuracy of the magnet placement. Of course the placement is not perfect (visible in the measurements below). You can see data (fully cleaned currentDts) recorded for 6, 3 and 2 magnets (note: there is no moving average to smooth out the volatility, these are only filtered by the simple algorithm above; of course ROTATION_DEBOUNCE_TIME_MIN has been decreased appropriately for the number of magnets): [images: currentDts recorded with 6, 3 and 2 magnets]

It is immediately apparent how the number of magnets affects the volatility of the data when the magnets are not perfectly placed. With more magnets the potential inaccuracies accumulate and smoothing is needed. Of course Concept2 has more accurate equipment to place the magnets and the sensors, so for them this is probably fine, but still. Would you be able to show me filtered and unfiltered data of a few strokes from the Concept2?
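One mitigation worth noting (my own illustration, not something proposed in the thread): systematic placement error cancels over a full revolution, because the magnets' angular offsets always sum to one complete turn regardless of spacing. Summing each group of N consecutive deltas therefore yields exact per-revolution times, at the cost of temporal resolution:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Collapse impulse deltas into per-revolution times: individual deltas are
// biased by magnet placement, but their sum over a full turn is exact.
std::vector<double> perRevolutionTimes(const std::vector<double>& deltas,
                                       std::size_t magnetsPerRev) {
    std::vector<double> result;
    for (std::size_t i = 0; i + magnetsPerRev <= deltas.size(); i += magnetsPerRev) {
        double sum = 0;
        for (std::size_t j = 0; j < magnetsPerRev; ++j)
            sum += deltas[i + j];  // one full revolution's worth of deltas
        result.push_back(sum);
    }
    return result;
}
```

This recovers the one-impulse-per-turn cleanliness for the drag regression, while the individual impulses remain available for stroke detection.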

I would be very interested in some raw Concept2 signals, as Nomath's posts on the Concept2 forum contained no really continuous data. It would be interesting to see what signals that sensor produces. Would you mind uploading some raw currentDts of a couple of strokes in an Excel file here?

Thanks Abász

Abasz commented 2 years ago

@JaapvanEkris I've been observing the changes you are making to the drag factor calculation algorithm, specifically that you are using time to do the sampling. I experimented with that and it turned out to be unreliable, as it depends on the speed of the flywheel (e.g. with a slower flywheel with fewer impulses, 0.1 sec may not even contain an impulse). Anyway, I found that using the number of impulses is more reliable, e.g. discarding the first 5 impulses and using, let's say, the next 10-20 (again, this needs to be adjusted based on the number of impulses per rotation). Alternatively you can use wheel turns, like discarding 3 turns and then sampling 20. In my view this yields more consistent results, as the sampling becomes independent of the flywheel speed.
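The impulse-count based sampling can be sketched as follows (parameter names and defaults are illustrative, not from either codebase):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Select the impulses used for the drag regression by count, not by time:
// skip the first few impulses after the recovery starts, then take a fixed
// number, independent of how fast the flywheel happens to spin.
std::vector<double> sampleForDrag(const std::vector<double>& recoveryDeltas,
                                  std::size_t skipImpulses = 5,
                                  std::size_t useImpulses = 15) {
    if (recoveryDeltas.size() <= skipImpulses)
        return {};  // not enough impulses to sample anything
    auto first = recoveryDeltas.begin() + skipImpulses;
    auto last = recoveryDeltas.size() >= skipImpulses + useImpulses
                    ? first + useImpulses
                    : recoveryDeltas.end();
    return {first, last};
}
```

A time-based window would cover a varying number of impulses depending on flywheel speed; this way the regression always sees the same sample size.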

JaapvanEkris commented 2 years ago

Hi Abasz, good point! I'm still in early alpha code here. I already made it a parameter (so you can change the 0.1 seconds from the config file), but making it dependent on the number of impulses might also be an approach. I'll play with it and see what happens :)

Abasz commented 2 years ago

@JaapvanEkris I was wondering whether you would be able to share some of your test data recordings with me? I would like to run those through my algorithm to see the results and the potential differences.

I would start with a steady row, and an interval if available.

thanks in advance.

JaapvanEkris commented 2 years ago

Hi @Abasz

I tested several linear models, both for the drag calculation and for angular velocity and angular acceleration. So far, for drag, good old OLS outperforms (in terms of stability of the calculated dragfactor) both Theil-Sen and all quadratic (thus non-linear) regression methods, at least on my machine. For angular velocity and angular acceleration, Incomplete Theil-Sen outperforms OLS and Quadratic Theil-Sen. But that is with a lot of datapoints: approx. 300 per stroke. That number of datapoints might justify the assumption of linearity.
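For reference, the complete Theil-Sen estimator can be sketched as the median of all pairwise slopes (a generic illustration of the technique, not ORM's code; the incomplete variant mentioned above pairs each point with only one other point to cut the O(n²) pair cost):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Complete Theil-Sen slope: the median of the slopes of all point pairs,
// robust against a fair share of outliers. For even pair counts this takes
// the upper median, a common simplification.
double theilSenSlope(const std::vector<double>& x, const std::vector<double>& y) {
    std::vector<double> slopes;
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = i + 1; j < x.size(); ++j)
            if (x[j] != x[i])
                slopes.push_back((y[j] - y[i]) / (x[j] - x[i]));
    std::nth_element(slopes.begin(), slopes.begin() + slopes.size() / 2, slopes.end());
    return slopes[slopes.size() / 2];
}
```

With one wildly wrong point among several clean ones, the median of the pair slopes still lands on the clean trend, which is exactly what makes it attractive for noisy impulse data.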

In https://github.com/JaapvanEkris/openrowingmonitor/tree/Raw_Implementation_As_Is I have uploaded and updated several files: first of all C2_RowErg_2022-05-28_0929_Rowing_30_Minutes_Drag_89_Raw.csv (raw impulse data) and C2_RowErg_2022-05-28_0929_Rowing_30_Minutes_Drag_89_RowingData.csv (processed outcome data per stroke). I also added https://github.com/JaapvanEkris/openrowingmonitor/blob/Raw_Implementation_As_Is/config/config.js, as you need these settings to get close to my data (please note, some settings are at the very end of the file). I also updated Flywheel.js, RowingEngine and RowingStatistics to the last known version.

Calculating the DF for longer than 6 seconds doesn't work for us either with the current settings: ORM will assume the rower has stopped rowing and stop the metric collection. You will also pass the threshold for credible data quite quickly, so it will flatline. Looking at the raw data, I do know that further down the tail the signal becomes noisier, and the assumption of linear behaviour is at risk.

Let me know how it goes and if you need any help!

Kind regards,

Jaap

Abasz commented 2 years ago

Thanks for the data. I've been playing around with it; I was quite surprised how dirty the impulse timings are, though (there are big spikes and drops out of the blue). I've been reading about RPi interrupts, and it seems there are no true hardware interrupts (rather, interrupts are managed by software, i.e. Linux). This potentially means that measurement timings may be delayed with a variable latency (though this does not necessarily explain the drops). I theorize that the drops may be caused by queued-up interrupt triggers.

Anyway, after doing some tweaking on the array length and the error threshold I was able to replicate the number of strokes.

My drag factors are within 3% error on average, which I deemed good, especially since I am using the smoothed data for calculating the drag factor, and my drive and recovery times differ from the ones determined by your algorithm (in this sense yours is probably more accurate, as it uses the raw data while mine uses the smoothed data). This means that the starts and ends of the stroke phases are not identical to yours, but thanks to the regression model (and its robustness against errors) it gets pretty close. Below is a chart for comparison:

[image: drag factor comparison chart]

On the power side (apart from those drops that relate to the noise I cannot properly clean, due to the way my impulse filter is implemented) it's pretty good. The DF is consistently below yours.

I was wondering whether you have the raw drag factors for this data? Or could you push your changes to the "as_is" branch, so I can run the raw data you provided and get this information?

Finally, do you maybe have an interval session to replay, one where the stroke intensity varies? I am asking because my conclusion on my machine (it's a cheap Chinese Concept2 clone :)) is that the drag factor fluctuates more in interval sessions (I theorize this is due to bearing drag and the increasing effect of the unbalanced flywheel).

Thanks

JaapvanEkris commented 2 years ago

Hi Abasz,

Noise in the data is a big thing. That is why I'm moving to a more robust Theil-Sen estimator for both angular velocity and angular acceleration, as it behaves much better. If you look at the current code, you will find a solution that handles noise extremely well at the cost of some accuracy (in the maximum force exerted on the flywheel, to be precise). I agree that the noise could come from Linux itself, or from the Node.js/NPM stack, which isn't designed with real-time behaviour in mind. I am considering switching to another Linux configuration (with better real-time behaviour) and potentially even moving the interrupt handler to C. But I'd rather work on making the algorithm noise-proof first, as many machines have similar issues (misaligned magnets).

Stroke detection is still a work in progress for me. It is pretty decent, but as I'm progressing to power curves I do see situations where I expect the stroke detection to switch from drive to recovery and it doesn't. It could be a parameter thing; it could also be that I'm missing some criterion that defines a recovery.

A thing to look at is the filter you apply. I used to apply a moving average on most measurements, which reduces the impact of outliers but doesn't kill them. Now I have switched to a moving median, which removes the most extreme outliers quite well. I use it in most filters now, with quite some success.
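The difference between the two filters can be sketched with an illustrative helper (not ORM's code):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Median of the most recent window of samples. Unlike a moving average, an
// extreme outlier in the window does not influence the result at all.
// Takes the window by value so the partial sort does not disturb the caller.
double movingMedian(std::vector<double> window) {
    std::nth_element(window.begin(), window.begin() + window.size() / 2, window.end());
    return window[window.size() / 2];
}
```

For a window of {10, 11, 500} the median reports 11, whereas a moving average would report roughly 174, dragging the outlier into every downstream metric.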

Regarding the drag factor: look at getting stable stroke measurements first. I noticed that slight variations in stroke detection have a measurable effect on the drag factor, as the flanks are included. So getting into the right part of the flank is essential there. What I also do is measure the goodness of fit, which is cheap when using OLS, and reject the newly calculated drag factor when the fit is too bad. That removes most outliers for me. Currently I use 0.88 as the minimal R² before I accept a new DF.
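The acceptance rule amounts to something like this (names and structure are illustrative):

```cpp
#include <cassert>

// Gate newly regressed drag factors on goodness of fit: a proposal only
// replaces the current value when its R^2 clears the threshold; otherwise
// the previous drag factor is retained.
struct DragState {
    double dragFactor;
    double minimumFit;  // e.g. 0.88, per the comment above

    void propose(double newDrag, double r2) {
        if (r2 >= minimumFit)
            dragFactor = newDrag;
    }
};
```

This keeps a single noisy recovery (a messy stroke, a bumped chain) from corrupting power and distance for the strokes that follow.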

The current as-is branch is up to speed with both code and settings. I'm slowly migrating to full Quadratic Theil-Sen to decently capture both angular velocity and angular acceleration. It has some side-effects and issues, but in essence it is pretty good. I'll see if I can capture the logs from a rowing session; the raw values are mentioned there.

Unfortunately I'm currently injured, which blocks me from interval training (steady state only for a couple of weeks). What I can do is provide you with a much faster 500 meters (a PR attempt).

JaapvanEkris commented 2 years ago

Hi @Abasz,

At https://github.com/JaapvanEkris/openrowingmonitor/blob/Raw_Implementation_As_Is/recordings/C2_RowErg_2022-05-28_0929_Rowing_30_Minutes_Drag_89.log you'll find the log with all drag decisions in it.

One thing I did notice in my quest for the perfect drag calculation is that the R² becomes much more valuable if you feed it raw data. Cleaned data tends to deliver artificially high R² values. When you use raw data, the bad data can easily be filtered out.

Abasz commented 1 year ago

Hi @JaapvanEkris,

I can see that you have made some commits to the Sandbox branch; would you mind pushing your current as-is implementation to Raw_implementation_as_is? I am asking because the Sandbox branch does not have all the changes and seemingly still runs the original ORM drag factor and stroke detection algorithms.

thanks

JaapvanEkris commented 1 year ago

Hi @Abasz ,

I'm moving the code to a more final version, and bringing the branches (mine and Lars') together in the sandbox. I'm currently working on a clean install, to start integrating them on top of a low-latency kernel. This involves moving or deleting complete functions, so it needs to be done with extreme care. I ran into an issue last weekend as the SD card for the clean install died on me. It is a bit more complex as I need to rewrite all the test code as well. But the sandbox will be the place where the magic happens in the future (before it stabilizes and I create a pull request to Lars' mainline).

Jaap

Abasz commented 1 year ago

Hi Jaap,

Finally, I was able to get some time on a C2 Model D with a PM5 performance monitor. I did an interval session to confirm a theory of mine that partially confirms, and partially tries to challenge, your conclusion. The session was pretty simple: 5 min warmup, 3x1 minute with 1 min recovery, finishing with approx. 3 min cooldown.

A few words on the setup

I read that you use an optocoupler for stepping down the voltage. While this is an OK solution, it has drawbacks (as you noted somewhere as well). So I decided to use a simple MOSFET level shifter. I used a BS170 with 3 resistors (this was the transistor I had at home). I needed 3 because the BS170 has a Vgs breakdown voltage of +/-20 V and I was afraid that the C2 produces some occasional high voltages, so I added a voltage divider to halve the incoming signal. This yields a switching voltage of 4 V (in contrast to the 12 V your optocoupler has), enabling the capture of some slower rotation speeds. The MOSFET is a good and pretty simple solution; it is used for I2C signal level shifting, which is in the MHz range and where accuracy is utterly crucial. So effectively this should produce a proven, consistent signal.

For processing I use an ESP32 chip on a FireBeetle board, with a digital interrupt. I simplified the code to only record the rotation delta times to the console (i.e. without any filtering or smoothing), so that they can be processed in various ways later. This is to avoid any missed impulses due to the interrupt taking too long. I attached the file containing the raw delta times, as well as the delta times slightly cleaned by an algorithm (the latter are non-averaged deltas, I only cleaned out the bounces that occurred).

I also recorded the session to the C2 logbook; I attached the csv file. I also watched the drag factor while doing the interval session, and my most important observation was that the drag factor fluctuated between 79-81 (the lowest I saw was 78, the highest 83). From visual observation the DF seemed to correlate with the speed of the flywheel, i.e. the 1-minute recovery (low effort rowing) showed a higher drag (80-81) than the high intensity parts (mostly a consistent 79 with an occasional 80). The logbook shows an average DF of 79.

I think this is important, as your approach uses drag smoothing, which is clearly not used by the PM5. Also, it confirms that the friction of the bearings (whose effect is higher at slower rotations) could have a material impact on the DF. Hence, I believe that the difference between your ORM implementation and the PM5 that you noticed (from which you concluded that the inertia may be 0.1016 instead of 0.1001) could potentially come from the fact that at certain rotation speeds you underestimate the DF compared to the PM (as a result of the smoothing). But I am not able to directly challenge this; please see the explanation below.

The Data

On the chart below you can see that there are quite a lot of bounces, but after a simple clean the data can be made reasonably consistent (again, lower rotation speeds have longer bounce times that are harder to clean). [image: raw vs. cleaned delta times]

The first thing I noticed (especially on the deceleration curve) is that there is a recurring difference between the data points measured within one rotation. This is probably caused by the non-perfect alignment of the magnets (i.e. one magnet being closer to the next), so there are points where the impulse delta is shorter and then on the next one longer. I have the same issue on my clone machine, and this is the main reason I use only 3 magnets on it. This makes detecting stroke phases pretty problematic (as you mentioned a couple of times, data quality is important). Nevertheless, I used a flank of 5 for stroke detection (averaging 6 points, 5 from the array plus the new one), which mostly cleaned this up. My algorithm detected 222 strokes (the PM5 detected 226).

Drag factor

I use the raw (i.e. unaveraged) delta times for calculating the DF (based on my tests I could go down to 0.8 on the goodness of fit). Also, I use the full deceleration curve (i.e. I do not take just a portion after the start and before the end). On the chart below I fitted the power and the DF on two scales, and it clearly shows that at lower power the DF is higher (on my machine this difference is even more apparent, which I believe comes from the fact that its bearing system and balancing are not as good as the C2's). I do not use decimals (I round them), as the PM5 does not show decimals either and it makes visual comparison easier.

[image: power and drag factor chart]

Conclusions

Abász

JaapvanEkris commented 1 year ago

Hi Abasz,

Good to hear from you. I made some progress as well, as you might have seen in the logs from the Sandbox.

> I think this is important, as your approach uses drag smoothing, which is clearly not used by the PM5. It also confirms that the friction of the bearings (whose effect is greater at slower rotations) could have a material impact on the DF. Hence, I believe the difference you noticed between your ORM implementation and the PM5 (from which you concluded that the inertia may be 0.1016 instead of 0.1001) could come from the fact that at certain rotation speeds you underestimate the DF compared to the PM (as a result of the smoothing). I am not able to challenge this directly, though; please see the explanation below.

> The first thing that I noticed (especially on the deceleration curve) is a recurring difference between the data points measured within one rotation. I think this is probably caused by imperfect alignment of the magnets (i.e. one magnet sitting closer to the next than the others), so there are points where the impulse delta is shorter and the next one correspondingly longer. I have the same issue on my clone machine, and this is the main reason I use only 3 magnets on it. This makes detecting stroke phases pretty problematic (as you have mentioned a couple of times, data quality is important). Nevertheless, I used a flank of 5 for stroke detection (averaging 6 points: 5 from the array plus the new one), which mostly cleaned this up. My algorithm detected 222 strokes (the PM5 detected 226).

I moved to another approach: I now use quadratic regression of distance over time. As the Theil–Sen regression model is reasonably robust against noise, these small wobbles become irrelevant. The first derivative is the speed, the second derivative is the acceleration. I am now optimising the algorithm to fit the behaviour of the Concept2, but I'm quite impressed by how much this reduced the noise in the data. The As-Is contains the current implementation; the Sandbox contains several additional experimental versions.
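The idea can be sketched as follows (the actual ORM code uses a robust Theil–Sen style regressor; this sketch uses plain ordinary least squares just to show the principle). Fitting θ(t) = a·t² + b·t + c over (time, angular distance) points gives the speed as the first derivative (2a·t + b) and the acceleration as the second (2a):

```javascript
// Illustrative least-squares quadratic fit of angular distance against
// time. The fitted theta(t) = a*t^2 + b*t + c yields speed and
// acceleration analytically, which is far less noisy than differencing
// raw deltas. Solved via the normal equations with Cramer's rule:
//   | s4 s3 s2 | |a|   |t2|
//   | s3 s2 s1 | |b| = |t1|
//   | s2 s1 n  | |c|   |t0|
function quadraticFit(ts, thetas) {
  const n = ts.length;
  let s1 = 0, s2 = 0, s3 = 0, s4 = 0, t0 = 0, t1 = 0, t2 = 0;
  for (let i = 0; i < n; i++) {
    const x = ts[i], y = thetas[i];
    s1 += x; s2 += x * x; s3 += x ** 3; s4 += x ** 4;
    t0 += y; t1 += x * y; t2 += x * x * y;
  }
  // 3x3 determinant of a row-major array
  const det = (m) =>
    m[0] * (m[4] * m[8] - m[5] * m[7]) -
    m[1] * (m[3] * m[8] - m[5] * m[6]) +
    m[2] * (m[3] * m[7] - m[4] * m[6]);
  const D = det([s4, s3, s2, s3, s2, s1, s2, s1, n]);
  const a = det([t2, s3, s2, t1, s2, s1, t0, s1, n]) / D;
  const b = det([s4, t2, s2, s3, t1, s1, s2, t0, n]) / D;
  return {
    a, b,
    speedAt: (t) => 2 * a * t + b, // first derivative
    acceleration: 2 * a            // second derivative (constant)
  };
}
```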

> I use the raw (i.e. unaveraged) delta times for calculating the DF (based on my tests I could go down to 0.8 on the goodness of fit). Also, I use the full deceleration curve (i.e. I do not take just a portion after the start and before the end). On the chart below I plotted the power and the DFs on two scales, which clearly shows that at lower power the DF is higher (on my own machine this difference is even more apparent, which I believe comes from the fact that its bearing system and balancing are not as good as the C2's). I do not use decimals (I round them), as the PM5 does not show decimals either and it makes visual comparison easier.
>
> [image: power and drag factor chart]

I currently use the raw data as well, with well-optimised stroke detection (which is actually based on the dragfactor: it detects a recovery when the slope of currentDt is close enough to the dragfactor), and I typically encounter a GoF of around 0.97. The cut-off is around 0.83, but I only hit that 2 to 3 times per session.
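A hedged sketch of that slope-vs-dragfactor test (all constants here are placeholders, not the tuned ORM values): during a recovery, currentDt should rise with a slope of roughly dragFactor·Δθ/I, so comparing the locally measured slope against that expectation within a tolerance band flags the recovery phase.

```javascript
// Sketch of dragfactor-based recovery detection: a positive currentDt
// slope close to the slope predicted by the known drag factor indicates
// the flywheel is freewheeling (recovery); anything else is the drive.
// inertia, impulsesPerRotation and tolerance are illustrative values.
function isRecovery(measuredSlope, dragFactor, inertia = 0.1001,
                    impulsesPerRotation = 6, tolerance = 0.3) {
  const dTheta = (2 * Math.PI) / impulsesPerRotation;
  const expectedSlope = (dragFactor * dTheta) / inertia;
  return measuredSlope > 0 &&
    Math.abs(measuredSlope - expectedSlope) <= tolerance * expectedSlope;
}
```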

What I do notice is that your power is quite high (impressive!): are you reaching that on the PM5 as well?

  • Changes in the DF from one stroke to the next are not necessarily a flaw in the algorithm, as the PM5's changes as well. The age of the machine probably has a material impact on this (e.g. bearings getting older). But this is compensated for by the calculation method: since the power is calculated with the last DF, it accounts for these anomalies. There is one thing that could potentially be an issue: when you do a slow rotation (the DF increases slightly) followed by a strong pull, the power of that stroke would be overestimated.

Here, Concept2 has a flaw in its algorithm that actually underestimates the power of such a stroke: their calculation omits the power needed to accelerate the flywheel. The University of Ulm discovered that one. I still haven't decided how to handle this (I would like to correct it in ORM, but that would make it deviate from the PM5).
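The difference can be sketched as follows (the constants are placeholders): a Concept2-style stroke power keeps only the drag term k·ω̄³, while the corrected version also accounts for the energy put into (or taken out of) the flywheel's rotation over the stroke.

```javascript
// Sketch contrasting drag-only stroke power with the corrected version
// that includes the flywheel's change in rotational kinetic energy.
// omegas: angular velocities at each sample boundary (rad/s)
// dts: duration of each interval between samples (s), length omegas.length-1
function strokePower(omegas, dts, k, inertia = 0.1001) {
  let dragEnergy = 0, time = 0;
  for (let i = 1; i < omegas.length; i++) {
    const w = (omegas[i] + omegas[i - 1]) / 2; // mean omega over the interval
    dragEnergy += k * w ** 3 * dts[i - 1];     // energy dissipated by drag
    time += dts[i - 1];
  }
  // Net change in rotational kinetic energy over the stroke:
  // 0.5 * I * (omega_end^2 - omega_start^2)
  const flywheelEnergy = 0.5 * inertia *
    (omegas[omegas.length - 1] ** 2 - omegas[0] ** 2);
  return {
    dragOnly: dragEnergy / time,                    // Concept2-style
    corrected: (dragEnergy + flywheelEnergy) / time // includes flywheel term
  };
}
```

Over a full stroke cycle at steady state the flywheel term averages out, which is presumably why the omission matters mainly for strokes that change the flywheel speed a lot.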

Abasz commented 1 year ago

I have been following the developments on the regression idea. Also, I have uploaded my ESP32 implementation and referenced your project, indicating that mine is a limited port. Let me know if you have any issue with this.

In terms of the power:

[image: power comparison chart]

The sum of the PM5 power is higher than the cumulative power recorded by my algorithm, but the peaks are slightly lower. I actually come from flat-water kayak racing (I no longer race though :)), so delivering power was never really a problem (also, these are not that long intervals and this is a short session).

The one thing I just noticed while looking at the data again is that the PM5's power never actually dropped to 0, even though I know I stopped after the interval (or at least did not take a stroke for more than 7 seconds). So the difference in the peaks at the start probably comes from some of the power being allocated to the previous stroke. I will try to look into this in the coming weeks. For reference, my algorithm sets the current average stroke power to 0 when it transitions to the stopped state.

Also, the shift is a result of the PM5 having detected more strokes.

JaapvanEkris commented 1 year ago

In some comparisons, I've noticed that the PM5 detects strokes badly: strokes that last 0.5 sec, preceded/followed by a 2 sec stroke (while I am rowing Steady State at 22 SPM). It might be my technique, but when comparing, it is something to look out for.

The PM5 does some odd things indeed. I notice it with the dragfactor as well: I can approximate their times quite consistently with an inertia of 0.1016 (within 0.03%), but the displayed dragfactor is consistently 1 point lower.

I am going to look at the power calculations next (I was focused on speed, as that is easier to validate).

JaapvanEkris commented 1 year ago

Let's close this pull request as the "Backend_Redesign" supersedes it and is a true self-contained pull request.

laberning commented 1 year ago

Closed since this is superseded by PR #84