DrCoffey / DeepSqueak

DeepSqueak v3: Using Machine Vision to Accelerate Bioacoustics Research
BSD 3-Clause "New" or "Revised" License

mean power (dB/Hz) #173

Closed: TaxiCalbee closed this issue 2 years ago

TaxiCalbee commented 2 years ago

Hi, I've recently received some behavioral data from a colleague, analyzed with DeepSqueak, showing genotype differences in mean power. From my understanding, the software traces a contour of each vocalization on a frequency vs. time plot, with power (in dB) shown as a heat map, where colors indicate the decibel values along the contour.

Is the mean power the mean decibel value across all points of the contour? Also, the values my colleague got were negative dB/Hz. Does that mean the vocalizations are decreasing in decibels as time goes on? I'm having trouble wrapping my head around this concept, so any insight into what mean power is and what the biological significance of this parameter would be is greatly appreciated!

TaxiCalbee commented 2 years ago

I just read an older post asking why the mean power is negative, and I think I understand that it's because the maximum level the recording system can capture is assigned as the 0 dB reference? Therefore, sounds that are -5 dB would be louder than those that are -55 dB?

VoiceScientist commented 2 years ago

Yes, -5 dB is 50 dB greater than -55 dB. Decibels are always relative to some reference level. In digital recording systems, 0 dB (i.e., 0 dBFS, https://en.wikipedia.org/wiki/DBFS) is the maximum amplitude that can be represented, so the dB values will always be at or below 0; if you have a value of 0, then you've maxed out your system and likely have some clipping of the signal.
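To make the reference-level idea concrete, here is a minimal sketch of the dBFS arithmetic. The function name `dbfs` and the two amplitudes are made up for illustration and are not taken from DeepSqueak; full scale is assumed to be normalized to 1.0.

```python
import math

def dbfs(amplitude, full_scale=1.0):
    """Level of an amplitude relative to full scale, in dB (0 dB = full scale)."""
    return 20.0 * math.log10(abs(amplitude) / full_scale)

# Two hypothetical peak amplitudes, expressed as fractions of full scale.
loud_amp = 10 ** (-5 / 20)    # about 0.56 of full scale
quiet_amp = 10 ** (-55 / 20)  # about 0.0018 of full scale

loud_db = dbfs(loud_amp)
quiet_db = dbfs(quiet_amp)
difference_db = loud_db - quiet_db

print(f"{loud_db:.1f} dB vs {quiet_db:.1f} dB, difference {difference_db:.1f} dB")
print(f"full scale: {dbfs(1.0):.1f} dB")
```

Any amplitude below full scale gives a negative value, and a less negative value is the louder one.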

The second part of your question is trickier: what is the biological significance of the amplitude of the signal? Physiologically, the amplitude is somewhat related to subglottal air pressure. Riede has an exquisite in vivo paper that measured subglottal pressure during USVs but didn't explicitly examine amplitude: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3214115/

However (and this is important), the recorded amplitude is highly dependent on non-physiologic experimental variables, such as the distance and angle the animal is from the microphone during the recording (which varies, assuming your animals are free to move during the experiment), the microphone properties, the gain on the mic amplifier, and possibly settings in your recording software. All these variables can be somewhat controlled. At the very least, I recommend having a consistent mic placement and calibrating your mic before each recording - and definitely calibrating if you use multiple mics. I use Avisoft's calibrator: https://www.avisoft.com/playback/calibrated-40-khz-reference-signal-generator/ To account for movement, I recommend looking at average values of a large sample of USVs, not individual USVs. Depending on your experimental groups, you might be able to assume they all move around to the same degree (better yet, you could track them during the experiment with video recording).

TaxiCalbee commented 2 years ago

Thanks, I follow the negative dB. But I am still confused about what DeepSqueak's output of 'mean power (dB/Hz)' actually is. If it were amplitude (i.e. intensity, volume, loudness), I would expect it to be in dB, not dB/Hz, and it was referred to in another post as power spectral density. Does the unit dB/Hz mean that the measure is the decibel value divided by the frequency of the USV contour at each point, averaged over the contour? In that case, could someone explain whether this measure has a simple real-world, layperson meaning, since it is not simple amplitude? Or does the unit dB/Hz indicate that the power measure was normalized by the spectral resolution used to digitize the signal, in which case it is a measure of amplitude?

VoiceScientist commented 2 years ago

I had the same question a few years ago (#17) and @MxMarx explained that it was a measure of mean power of the USV. My understanding of why it is the power spectral density (PSD) and measured in dB/Hz (not just dB) is that the PSD is calculated from the spectrum and takes into account the spectral resolution of the FFT. Here is an explanation I found to be helpful: https://community.sw.siemens.com/s/article/what-is-a-power-spectral-density-psd
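As a rough illustration of where the "/Hz" comes from, the sketch below computes a one-sided periodogram PSD for a synthetic tone using only the Python standard library. It is not DeepSqueak's code. The sample rate, FFT length, tone amplitude, and the helper name `one_sided_psd` are all assumptions chosen for the example.

```python
import cmath
import math

def one_sided_psd(x, fs):
    """Periodogram PSD (power per Hz) for bins 0..N/2 of a real signal."""
    n = len(x)
    psd = []
    for k in range(n // 2 + 1):
        # Plain DFT of bin k; slow, but fine for a short illustrative signal.
        xk = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        p = abs(xk) ** 2 / (fs * n)
        # Double every bin except DC and Nyquist to fold in negative frequencies.
        if k != 0 and k != n // 2:
            p *= 2
        psd.append(p)
    return psd

fs = 250_000.0   # assumed sample rate in Hz
n = 256          # assumed FFT length
df = fs / n      # frequency resolution (bin width) in Hz
amp = 0.1        # assumed tone amplitude, as a fraction of full scale
tone_bin = 51    # tone placed exactly on a bin, close to 50 kHz

x = [amp * math.sin(2 * math.pi * tone_bin * t / n) for t in range(n)]
psd = one_sided_psd(x, fs)

peak_bin = max(range(len(psd)), key=lambda k: psd[k])
peak_psd_db = 10 * math.log10(psd[peak_bin])        # dB/Hz
peak_bin_db = 10 * math.log10(psd[peak_bin] * df)   # dB, power in that bin
total_power = sum(psd) * df                         # integrates back to mean-square power

print(f"bin width: {df:.1f} Hz")
print(f"peak PSD: {peak_psd_db:.1f} dB/Hz, power in peak bin: {peak_bin_db:.1f} dB")
```

The two printed levels differ by `10 * log10(df)`, which is the division by bin width expressed in dB. Summing the PSD times the bin width recovers the tone's mean-square power (`amp**2 / 2`).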

TaxiCalbee commented 2 years ago

I see, so the Hz in dB/Hz is not actually the frequency of the mouse vocalizations, but rather the frequency resolution that the microphone is recording from? If I'm understanding correctly, the mean power (dB/Hz) is normalizing the sound amplitude of the vocalizations to the frequency resolution of the recording microphone which is important because the same sound can result in different amplitudes if the frequency resolutions are different (as seen in this link: https://community.sw.siemens.com/s/article/what-is-a-power-spectral-density-psd). Would that be the right interpretation? Is there any scenario where the Hz in dB/Hz would be the frequency of the mouse vocalizations?

Thanks for the responses, they have been very helpful!

VoiceScientist commented 2 years ago

I think the "Hz" in the dB/Hz comes from the frequency resolution of the spectrogram creation (in DeepSqueak) and has nothing to do with the USV itself or the hardware or recording settings. It is a function of the acoustic analysis - the fast Fourier transform (FFT) within DeepSqueak. I'm operating at the edge of my knowledge here, so if anyone else has thoughts/clarifications, I'd welcome them!
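One way to see that the "Hz" belongs to the analysis rather than the call is to analyze the same broadband noise at two FFT lengths. This is a sketch under assumptions, not DeepSqueak's implementation: the sample rate, FFT lengths, seed, and the helper `mean_levels_db` are hypothetical, and the noise is synthetic.

```python
import cmath
import math
import random

def mean_levels_db(x, fs, n):
    """Average per-bin power (dB) and PSD (dB/Hz) over non-overlapping n-point segments."""
    df = fs / n  # bin width in Hz for this FFT length
    total, count = 0.0, 0
    for start in range(0, len(x) - n + 1, n):
        seg = x[start:start + n]
        for k in range(1, n // 2):  # skip the DC and Nyquist bins
            xk = sum(seg[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            total += 2 * abs(xk) ** 2 / n ** 2  # one-sided power in this bin
            count += 1
    bin_power = total / count
    return 10 * math.log10(bin_power), 10 * math.log10(bin_power / df)

random.seed(0)
fs = 250_000.0  # assumed sample rate in Hz
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]  # synthetic white noise

bin_db_64, psd_db_64 = mean_levels_db(noise, fs, 64)     # wide bins
bin_db_128, psd_db_128 = mean_levels_db(noise, fs, 128)  # bins half as wide

print(f"per-bin power: {bin_db_64:.1f} dB (N=64) vs {bin_db_128:.1f} dB (N=128)")
print(f"PSD:           {psd_db_64:.1f} dB/Hz (N=64) vs {psd_db_128:.1f} dB/Hz (N=128)")
```

Halving the bin width lowers the per-bin power by roughly 3 dB, while the PSD stays about the same. That is the sense in which dividing by the bin width makes levels comparable across analysis settings.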

DrCoffey commented 2 years ago

@VoiceScientist & @MxMarx have done a better job explaining it than I ever could! Practically speaking, if you are looking at calls recorded on the same microphone with the same settings, louder calls have a higher power even if negative.

I don't put a lot of stock in the power calculation though. Animals could just be closer to a microphone or aiming directly at it, and it would seem like they were producing louder calls. I think it is a little too hard to control.

VoiceScientist commented 2 years ago

@DrCoffey I agree you have to take the power calculation with a grain of salt. I investigated that issue in the past using video tracking and I found the distance to the mic in a small enclosure (standard sized rat housing cage) didn't make much difference, but the orientation (angle to the mic) had more of an influence. USVs are highly directional. I justify comparing group differences in amplitude by averaging a large number of USVs from each animal. That said, your caveat is warranted.
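The averaging argument can be illustrated with a small simulation. Everything here is synthetic and assumed: the "true" level, the uniform 0 to 20 dB orientation loss, and the function names are placeholders, not measured values or DeepSqueak outputs.

```python
import random
import statistics

def animal_mean_db(n_calls, true_level_db=-40.0, max_loss_db=20.0, rng=random):
    """Mean recorded level over n_calls, each reduced by a random orientation loss."""
    levels = [true_level_db - rng.uniform(0.0, max_loss_db) for _ in range(n_calls)]
    return statistics.mean(levels)

def spread_of_means(n_calls, repeats=200, seed=0):
    """Standard deviation of the per-animal mean across simulated animals."""
    rng = random.Random(seed)
    means = [animal_mean_db(n_calls, rng=rng) for _ in range(repeats)]
    return statistics.stdev(means)

spread_few = spread_of_means(10)    # 10 calls per animal
spread_many = spread_of_means(400)  # 400 calls per animal

print(f"spread of animal means with 10 calls:  {spread_few:.2f} dB")
print(f"spread of animal means with 400 calls: {spread_many:.2f} dB")
```

With more calls per animal, the per-animal mean varies much less between simulated animals that share the same true level. The sketch averages dB values directly for simplicity and assumes orientation is random with respect to group. A systematic difference in orientation between groups would not average out.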

TaxiCalbee commented 2 years ago

The number of USVs recorded per day in our dataset is in the mid hundreds, so hopefully that would reduce differences due to variability in orientation to the mic. We also see the group differences particularly during early postnatal days (P1-5), which is when I expect the pups to not move very much. I'll have to confirm with my colleague about the orientation, whether that was controlled for at the start of the recordings or not, and whether it changed during.

Yes, the power is just one parameter where we found group differences. There were others as well which is nice because it makes me more confident that there is a genuine difference. At any rate, now I definitely have a better understanding of "mean power" and thanks a lot to @VoiceScientist and @DrCoffey for your help!

DrCoffey commented 2 years ago

Ya, I think with enough animals/vocalizations you can get around the variability issue. Good luck with the rest of the analysis!