Open moorepants opened 9 years ago
So this all seems to work. It gives the same results as the Octave file on the test data but when I run it for the plot in our perturbed data paper I'm seeing some noise in the joint torques in the Python version that I don't see in the Octave version. I must have some slight differences in the filtering/differentiation code. Here is a screenshot:
@tvdbogert Can you comment on the filtering/differentiation? You have a custom filter that also differentiates. I switched to using a 2nd order Butterworth filter with filtfilt, and central differencing.
I changed the order of operations: I now filter the marker positions, compute the derivative, filter the velocities, and compute the derivative, instead of filtering the marker positions, computing the derivative, and then computing the derivative again. With that change I get results more similar to the Octave file.
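As a sketch of that order of operations with standard library tools (the cutoff, order, and sampling rate below are illustrative, not necessarily what the PR uses):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(x, cutoff, fs, order=2):
    """Zero-lag low-pass: Butterworth applied forward and backward."""
    b, a = butter(order, cutoff / (fs / 2.0))
    return filtfilt(b, a, x)

fs = 100.0                                   # assumed sampling rate (Hz)
t = np.arange(0.0, 4.0, 1.0 / fs)
# 1 Hz "marker" trajectory plus a deterministic 30 Hz disturbance
pos = np.sin(2.0 * np.pi * t) + 0.01 * np.sin(2.0 * np.pi * 30.0 * t)

pos_f = lowpass(pos, 6.0, fs)                # filter the marker positions
vel = np.gradient(pos_f, t)                  # differentiate
vel_f = lowpass(vel, 6.0, fs)                # filter the velocities
acc = np.gradient(vel_f, t)                  # differentiate again
```

Note that filtfilt runs the filter forward and backward, so it has zero phase lag.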
FYI: the images are swapped here with respect to the above. But the hip joint torque still differs more than I'd hoped.
Good to know that Octave performs so poorly on this code.
I looked in your speedup inverse dynamics branch but could not locate the python code. Please give a link if possible.
I think I know what is going on. My rtfilter.m processes data with time stamps, one sample at a time. It can handle gaps in the data. If you use standard digital butterworth filters, you need a constant sampling rate. The spikes may be generated if you do the differentiation with the assumption of constant sampling rate. If a frame drops out, you then suddenly get only half of the true derivative. The second derivative then gets spikes. This may happen in a certain part of the gait cycle when the hip marker drops out sometimes. Filtering again after the differentiation will get rid of it, but that just hides the problem and in fact distorts the signal.
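To illustrate the effect, differentiate timestamped data with one dropped frame, once assuming a constant rate and once using the actual time stamps (a toy signal, not real marker data):

```python
import numpy as np

fs = 100.0
t_full = np.arange(0.0, 1.0, 1.0 / fs)
keep = np.ones(t_full.size, dtype=bool)
keep[50] = False                      # simulate one dropped frame
t = t_full[keep]
x = np.sin(2.0 * np.pi * t)

# Wrong: central differences that assume a constant 1/fs spacing
v_naive = np.gradient(x) * fs
# Right: central differences using the actual (irregular) time stamps
v_actual = np.gradient(x, t)
```

Around the gap the naive version is off by roughly 50% (the central difference spans 0.03 s but is divided as if it were 0.02 s), and that error turns into large spikes after a second differentiation.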
In fact, you should never filter more than once in this sequence of steps. Filtering and differentiation are both linear operations on the signal, and you can change their order. So filter only once, at the beginning or at the end. My filter has the advantage that the filtering and differentiation are combined into one step, which reduces the distortion from finite differencing.
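That linearity claim is easy to check numerically; away from the boundaries (where filtfilt's edge padding and the one-sided end differences differ slightly) the two orderings agree:

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0
dt = 1.0 / fs
t = np.arange(0.0, 10.0, dt)
x = np.sin(2.0 * np.pi * 0.5 * t) + 0.1 * np.sin(2.0 * np.pi * 20.0 * t)

b, a = butter(2, 5.0 / (fs / 2.0))

# Order 1: low-pass filter, then central-difference derivative
filter_then_diff = np.gradient(filtfilt(b, a, x), dt)
# Order 2: derivative first, then the same low-pass filter
diff_then_filter = filtfilt(b, a, np.gradient(x, dt))
```

Both are linear time-invariant operations in the interior, so only boundary handling differs.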
The for loop in myfiltfilt.m is probably the bottleneck for Octave. It may be possible to vectorize that, I would have to take a real good look at rtfilter.m. The problem is that the filter coefficients depend on the time between two samples. It may be possible to code that without loops. This code was written for real-time (I originally wrote it in C, Motek has that code in D-Flow) so it processes one sample at a time.
I would like to keep the Matlab code around, with the Python equivalent in a separate folder. In that case, the Python code should produce exactly the same output.
Here are the options as I see them:

1. Translate the existing filter code into Python (myfiltfilt.m and rtfilter.m). It's not a lot of code. If that still runs too slow, consider the other options.
2. Make a compiled version of myfiltfilt.m (which then also includes rtfilter.m), and call that from Python (or Matlab or Octave). I will e-mail you the C source. Not sure how much work that is.
3. Give up on the filter that can handle variable sampling rate and gaps. Then we need to interpolate the gaps, use a standard Butterworth filter for a constant sampling rate, and then do finite difference derivatives, in both the Matlab and Python versions. To me that is clunky because I know better: if you have large gaps and do linear interpolation, you get zero accelerations during that time, and higher order polynomial interpolations become unreliable.
Whatever we do, the Matlab/Octave and Python code should give identical results.
> I looked in your speedup inverse dynamics branch but could not locate the python code. Please give a link if possible.
inverse dynamics: https://github.com/csu-hmc/GaitAnalysisToolKit/pull/128/files#diff-5e1f4c39e8343cdd4b4dedb9eda0a0efR1084
mimics leg 2d: https://github.com/csu-hmc/GaitAnalysisToolKit/pull/128/files#diff-5e1f4c39e8343cdd4b4dedb9eda0a0efR149
numerical differences: https://github.com/moorepants/DynamicistToolKit/blob/master/dtk/process.py#L575
filter: https://github.com/moorepants/DynamicistToolKit/blob/master/dtk/process.py#L462
> I think I know what is going on. My rtfilter.m processes data with time stamps, one sample at a time. It can handle gaps in the data. If you use standard digital butterworth filters, you need a constant sampling rate. The spikes may be generated if you do the differentiation with the assumption of constant sampling rate. If a frame drops out, you then suddenly get only half of the true derivative. The second derivative then gets spikes. This may happen in a certain part of the gait cycle when the hip marker drops out sometimes. Filtering again after the differentiation will get rid of it, but that just hides the problem and in fact distorts the signal.

That may be it, but the gaitanalysis.motek.DFlowData has always removed the missing markers (by interpolation) before the data ever got to leg2d.m.
> In fact, you should never filter more than once in this sequence of steps.

I thought that too, but oddly filtering twice changes the result in this case. I must have a slight error somewhere.
> The for loop in myfiltfilt.m is probably the bottleneck for Octave.
Yes, it definitely is. See the timings here: https://github.com/csu-hmc/GaitAnalysisToolKit/issues/30
That loop would ideally be in a low level language. I'm not sure if vectorizing it would help that much.
> I would like to keep the Matlab code around, with the Python equivalent in a separate folder.
We can do that. I currently have a unit test that tests my code against yours:
The results for the rawdata.txt file are virtually identical with the two methods, but not exact. I ended up using an R^2 for comparing the results due to slight differences in the end points. When I plotted the results I was getting visually identical results. I'm not sure why I get the "same" results with that data and not our perturbed data.
> Translate the existing filter code into Python (myfiltfilt.m and rtfilter.m). It's not a lot of code. If that still runs too slow, consider the other options.
Yes, this could be done. But I'm pretty sure that code needs to be written in C or something for speed purposes. A direct translation will likely have some speed gains relative to Octave but it won't be the 1000x speedup I'm seeing using the constant sample rate filtering here.
> Make a compiled version of myfiltfilt.m (which then also includes rtfilter.m), and call that from Python (or Matlab or Octave). I will e-mail you the C source. Not sure how much work that is.
This is likely the best solution.
> Give up on the filter that can handle variable sampling rate and gaps. Then we need to interpolate the gaps and then use a standard butterworth filter for constant sampling rate. And then do finite difference derivatives. Do this both in the Matlab and Python versions. To me that is clunky because I know better. If you have large gaps, and do linear interpolation, you get zero accelerations during that time. Higher order polynomial interpolations become unreliable.
This is what is implemented now, in this PR.
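The zero-acceleration artifact from linear interpolation is easy to reproduce with a toy signal (the gap location and size are made up):

```python
import numpy as np

fs = 100.0
t = np.arange(0.0, 1.0, 1.0 / fs)
x = np.sin(2.0 * np.pi * t)

# Knock out a 20-sample gap and fill it by linear interpolation
valid = np.ones(t.size, dtype=bool)
valid[20:40] = False
x_filled = np.interp(t, t[valid], x[valid])

# Second derivative (acceleration) via repeated central differencing
acc = np.gradient(np.gradient(x_filled, t), t)
# Inside the gap acc is ~0 even though the true acceleration there is large.
```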
> Whatever we do, the Matlab/Octave and Python code should give identical results.
Agreed.
See if my revised Matlab/Octave code helps at all. It eliminates the function call from the for loop.
If you already interpolated before the data gets to leg2d, my explanation for the spikes is not valid. My filter would have produced the same spikes on such data. Hope you can track that down.
Also, if you already interpolated, we can safely do a standard Butterworth filter (in Octave this would be "butter" with "filter" or "filtfilt") and the Python equivalent. It's not elegant, but for small gaps of a few frames this makes no difference really.
In leg2d you then need to do a check:
if max(diff(timestamps)) ~= min(diff(timestamps))
    error('sampling rate is not constant')
end
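A Python version of that guard might look like this (hypothetical helper name; a tolerance is used because floating point time stamps are rarely exactly equal):

```python
import numpy as np

def check_constant_rate(timestamps, rtol=1e-6):
    """Raise if the sampling interval varies beyond a small tolerance."""
    dt = np.diff(timestamps)
    if not np.allclose(dt, dt[0], rtol=rtol):
        raise ValueError('sampling rate is not constant')
```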
My test data in rawdata.txt has missing samples, so that would not be a good test anymore.
One thing you may not have noticed in myfiltfilt.m: if you run a 2nd order Butterworth filter twice (to eliminate lag), it needs to be designed with a slightly higher cutoff frequency than what you really want. The factor is 0.802. That way, the transfer function of the double Butterworth filter has a magnitude of 1/sqrt(2) at the cutoff frequency (and not 1/2).
A C version would be good to have.
> See if my revised Matlab/Octave code helps at all. It eliminates the function call from the for loop.
Testing now.
> Hope you can track that down.
I'll investigate more.
> My test data in rawdata.txt has missing samples, so that would not be a good test anymore.
For my unit test I pre-interpolate the missing samples and pass that into both the Python and Octave code to see if I get the same results.
> One thing you may not have noticed in myfiltfilt.m. If you run a 2nd order Butterworth filter twice (to eliminate lag), it needs to be designed with a slightly higher cutoff frequency than what you really want. The factor is 0.802. That way, the transfer function of the double Butterworth filter has a magnitude of 1/sqrt(2) at the cutoff frequency (and not 1/2).
I didn't know this. Maybe that's the difference. I'll fix my code.
> A C version would be good to have.
I can implement the C version if you are fine with including your method in this package. Just let me know.
> If you run a 2nd order Butterworth filter twice (to eliminate lag), it needs to be designed with a slightly higher cutoff frequency than what you really want. The factor is 0.802. That way, the transfer function of the double Butterworth filter has a magnitude of 1/sqrt(2) at the cutoff frequency (and not 1/2).
@tvdbogert Where do I find the theory for this? I've never heard of it and would like to implement a generalized version.
I found some reference here for 2nd order filters:
Do you know if there is a general formula for any order Butterworth filter?
In the code I cite the Winter book which explains this, but it's not widely known apparently.
If you apply a filter H(w) twice, the transfer function is H(w)^2.
The magnitude of the Nth order Butterworth transfer function is:
|H(w)| = 1/sqrt(1+(w/w0)^(2n))
w0 is the cutoff frequency, where transfer magnitude is 1/sqrt(2).
applied twice:
|H(w)| = 1/(1+(w/w0)^(2n))
If you define the cutoff frequency as the frequency where the transfer magnitude is 1/sqrt(2), and you want the cutoff of the double filter to be w0, you need to solve:
1/(1+(w0/w0corrected)^(2n)) = 1/sqrt(2)
Where w0corrected is the design parameter you need to put in the original single BW filter to achieve this. The solution is:
w0corrected = (sqrt(2) - 1)^(1/2n) * w0 = 0.802 * w0 (when n=2)
If you don't do this, and you apply a 6 Hz lowpass filter twice, you end up with a filter that attenuates 6 Hz signals by a factor of 2, rather than 1/sqrt(2), which is what you intended.
> That's a very nice notebook. Good to see that he used Pezzack's benchmark data, which was one of the first things I put on the ISB website in 1996.
He's creating a whole note set (or textbook) here: https://github.com/demotu/BMC
> He gives a formula for any filter order n, but I get a slightly different one (see my github comment above). Both give the same answer for n=2, but I think only mine is correct for other filter orders.
I've adjusted my butterworth function with the correction factor here: https://github.com/moorepants/DynamicistToolKit/pull/28
Note that your equation above is slightly incorrect, it should be:
w0corrected = w0 / (sqrt(2) - 1)^(1/2n) = w0 / 0.802 (when n=2)
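The corrected formula can be sanity checked with scipy (illustrative values; the result is only approximately 1/sqrt(2) because the bilinear transform warps the frequency axis slightly at this sampling rate):

```python
import numpy as np
from scipy.signal import butter, freqz

n = 2          # filter order
fc = 6.0       # desired cutoff of the double (filtfilt) filter, Hz
fs = 100.0     # sampling rate, Hz

# Raise the design cutoff by 1/0.802 so that the twice-applied
# filter attenuates to 1/sqrt(2) at fc
fc_design = fc / (np.sqrt(2.0) - 1.0) ** (1.0 / (2.0 * n))

b, a = butter(n, fc_design / (fs / 2.0))
w, h = freqz(b, a, worN=[2.0 * np.pi * fc / fs])
mag_double = np.abs(h[0]) ** 2   # forward + backward pass squares |H|
```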
Ok, I now get virtually identical results with these changes:
That's good enough agreement in results! I assume you're using the conventional digital Butterworth filter, which eliminates the need for a C version. As long as you're interpolating the raw data onto a fixed sampling rate. Which is OK if the gaps are small enough.
And yes, I had my correction factor backwards, thanks for the correction.
> I assume you're using the conventional digital Butterworth filter
Yep, I'm traditional like that :)
> which eliminates the need for a C version.
I'll implement the C version of your filter if I get an itch. It is a more elegant solution.
> Which is OK if the gaps are small enough.
Your filter actually had issues with large marker gaps. The data in #132 exposed this. I assume there has got to be a limit to the marker gap that makes either method fail.
Addresses issues #30 and some of #49.
The Octave file leg2d.m was quite slow. It takes 5 to 6 minutes on my machine to compute the inverse dynamics of an 8 minute (100 Hz) chunk of data. This has bothered me for a long time because it takes hours to process all of our data when it shouldn't, so I decided to finally fix it.

I decided not to spend time improving the performance of leg2d.m and its dependent functions because I'd rather purge the Octave stuff from this library (the Octave/Python bridge has too much overhead and complicates installation, especially on Windows). So I've simply ported the leg2d.m file to a Python function in gait.py and also switched to using library filters (scipy.signal) and the finite difference functions in DynamicistToolKit for speed gains. From the simple port I'm seeing close to a 1000 fold performance increase: it now takes less than a second to process 8 minutes of data. It may make sense to eventually break lower_extremity_2d_inverse_dynamics() into several smaller functions or convert it to a class.