DrCoffey / DeepSqueak

DeepSqueak v3: Using Machine Vision to Accelerate Bioacoustics Research
BSD 3-Clause "New" or "Revised" License

Export Call stats different than displayed stats #157

Closed cmlenell1 closed 2 years ago

cmlenell1 commented 2 years ago

Hi,

I noticed that call length and principal frequency differ slightly between the displayed numbers and the exported numbers. For example, this last call was 10 ms with a principal frequency of 105.8, but the exported stats list the duration as 12.8 ms and the principal frequency as 97.86. How do I fix this?

[Image: Different call stats]

DrCoffey commented 2 years ago

Hey, this is actually a much trickier issue than it seems. The call stats you see are calculated from the contour, which is in turn generated from the spectrogram. Any change in how the spectrogram is rendered will slightly change the contour and thus the output stats. I wanted the visualization of the spectrogram to be flexible, but when it comes time to output the stats, we have to use the same spectrogram settings for every single call so that the stats are generated fairly. So in short, the output stats all use a single set of spectrogram settings that are good for most calls. I want to find a better way to do this, but it is difficult and I haven't had the time. The numbers should be highly correlated though.
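One way to see why a change in spectrogram settings shifts the stats is a toy model of call duration. The sketch below is in Python and is not DeepSqueak's MATLAB code; the function `measured_duration_ms` and the rule it uses (a frame belongs to the contour when its time stamp falls inside the call) are assumptions made purely for illustration.

```python
import math

def measured_duration_ms(onset_ms, offset_ms, hop_ms):
    """Duration recovered from frame time stamps spaced hop_ms apart.

    Toy model (an assumption, not DeepSqueak's algorithm): a frame is part
    of the contour when its time stamp lies inside [onset, offset], and the
    duration is the span between the first and last such frame.
    """
    first = math.ceil(onset_ms / hop_ms)    # index of first frame inside the call
    last = math.floor(offset_ms / hop_ms)   # index of last frame inside the call
    return max(last - first, 0) * hop_ms

# The same hypothetical 10 ms call (3.1 ms to 13.1 ms) on three time grids
durations = {hop: measured_duration_ms(3.1, 13.1, hop) for hop in (0.5, 1.0, 3.0)}
print(durations)  # {0.5: 9.5, 1.0: 9.0, 3.0: 6.0}
```

The underlying call never changes, yet the measured duration depends on the grid. Computing every exported stat with one fixed grid keeps that bias the same for all calls.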

-Kevin

cmlenell1 commented 2 years ago

Hi again, I think something is wrong with the new version of DeepSqueak. The sinuosity of a call will randomly be extremely off: during analysis it is ~1.6, but the export was ~7.0. The previous version exported it correctly. We can't achieve reliability anymore because of this issue, and there's no pattern to when it occurs. [Image]

DrCoffey commented 2 years ago

Ok, I will take a really close look at the exporting code. It should be using the exact same function as the display code, but with slightly different spectrogram parameters. I'm not sure why that would cause such big swings in the output, so there may be something going on that I am missing. If you could share the exact file where you are seeing this it might expedite the process.

DrCoffey commented 2 years ago

Ok, I think I figured out what is happening. Sinuosity is calculated as the XY path length the call travels, as if frequency and time samples were positions on a square grid. Path length is then normalized by the straight-line distance between the first and last contour points:

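    % Contour points as rows of [time, frequency]
    pts = [stats.ridgeTime(:), stats.ridgeFreq_smooth(:)];
    % Straight-line distance between the first and last contour points
    leng = norm(pts(end,:) - pts(1,:));
    % Euclidean length of each segment between consecutive contour points
    totleng = sqrt(sum(diff(pts,1,1).^2, 2));
    % Sinuosity = total path length / end-to-end distance
    stats.Sinuosity = sum(totleng)/leng;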
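
Read literally, the snippet above divides the summed segment lengths by the distance between the two end points of the contour. Writing the contour points as $(t_i, f_i)$ for $i = 1, \dots, n$ (symbols introduced here for illustration, not taken from the code), that is:

$$
S = \frac{\sum_{i=2}^{n} \sqrt{(t_i - t_{i-1})^2 + (f_i - f_{i-1})^2}}{\sqrt{(t_n - t_1)^2 + (f_n - f_1)^2}}
$$

A perfectly straight contour gives $S = 1$, and any detour adds to the numerator without changing the denominator.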

This means that calls with rapid changes in frequency have high sinuosity. If the contour misses a single point during a frequency swing, it can dramatically lower the total path length of the call while dropping only one sample. Here is an example where missing 2 samples cuts sinuosity in half.

Before: [Image: before]

After: [Image: after]
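
The same effect can be reproduced with a small synthetic contour. This is a minimal Python sketch, not DeepSqueak's MATLAB code: the `sinuosity` helper mirrors the ratio of path length to end-to-end distance computed in the snippet above, and the contour values are made up, in arbitrary grid units.

```python
import math

def sinuosity(times, freqs):
    """Path length of the contour divided by its end-to-end distance."""
    pts = list(zip(times, freqs))
    path = sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))
    chord = math.dist(pts[0], pts[-1])
    return path / chord

# Hypothetical contour with a single sharp frequency swing
t = [0, 1, 2, 3, 4, 5, 6]
f = [0, 0, 0, 4, 0, 0, 0]
full = sinuosity(t, f)                 # about 2.04

# Same contour, but the one sample at the peak of the swing was not extracted
t_missed = [0, 1, 2, 4, 5, 6]
f_missed = [0, 0, 0, 0, 0, 0]
missed = sinuosity(t_missed, f_missed)  # 1.0

print(round(full, 2), missed)
```

Dropping one sample roughly halves the value here. The same arithmetic runs in reverse: a contour that picks up one extra swing reports a much higher sinuosity than one that skips it.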

I think there is also too much smoothing happening to the contour. We have been working on improving the contour extraction and I think the over-smoothing was left in as an oversight. I will fix this soon, and hopefully improve the overall contour extraction in the near future so this isn't a persistent issue.

Although this isn't ideal, at least the contour extraction and sinuosity calculation are applied with identical parameters to all calls, so it should still be fine for between-group comparisons. It will just underestimate sinuosity on some calls that it can't extract completely.

cmlenell1 commented 2 years ago

Thanks for the quick response. The problem isn't that sinuosity is too low; it's too high for exported calls. For example, call 207 has a sinuosity of 1.6 but it exported at 7.8. See below: [Image: Capture]

DrCoffey commented 2 years ago

Yeah, I think the exported version is picking up the downward swing, which increases the call's path length a lot. I'll debug the exporting function next to make sure; it could be something else.

DrCoffey commented 2 years ago

@cmlenell1 Ok, I figured out a method to optimize spectrogram parameters for all of the different functions. The display stats, output stats, clustering stats, and batch-reject-by-threshold stats will all use the same automatically optimized spectrogram parameters. You will still be able to change the display parameters, but they won't affect the contour or stats.

I need another day or 2 to test everything, but I should get the new version pushed by Monday.

I have been meaning to deal with this for a long time, so thanks for the motivation!

DrCoffey commented 2 years ago

@cmlenell1 Alright, I pushed the fix. Let me know if you are still having problems. If not I'll close the issue.

cmlenell1 commented 2 years ago

I took a quick look and it looks a lot better! I'll take a closer look on Monday. Thanks for prioritizing this!

cmlenell1 commented 2 years ago

The numbers look good! Thanks again!!

cmlenell1 commented 2 years ago

Okay, thank you!!

Charles Lenell, PhD, CCC-SLP
