DrCoffey / DeepSqueak

DeepSqueak v3: Using Machine Vision to Accelerate Bioacoustics Research
BSD 3-Clause "New" or "Revised" License

Individual Tonalities (Take Two) #125

Closed: gcalongi closed this issue 3 years ago

gcalongi commented 3 years ago

Reinstated the tonality slider; allowed tonality to be adjusted on an individual-call basis; created a dialog to apply global tonality/amplitude settings when a new detections .mat file is loaded; and adjusted how power is calculated and communicated to the user.

DrCoffey commented 3 years ago

Hey Gabi,

When I download your fork and try to load one of the detection files I get the following error:

Error using tabular/dotParenReference (line 76)
Unrecognized table variable name 'EntThresh'.

Error in initialize_display (line 101)
if handles.data.calls.EntThresh(1) == 0 && handles.data.calls.AmpThresh(1) == 0

Error in loadcalls_Callback (line 28)
initialize_display(hObject, eventdata, handles);

Error in gui_mainfcn (line 95)
feval(varargin{:});

Error in DeepSqueak (line 29)
gui_mainfcn(gui_State, varargin{:});

Error in matlab.graphics.internal.figfile.FigFile/read>@(hObject,eventdata)DeepSqueak('loadcalls_Callback',hObject,eventdata,guidata(hObject))

Error while evaluating UIControl Callback.

The variables don't exist yet, so they need to be initialized for older files that don't contain them.
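A minimal sketch of that initialization, assuming the missing columns should simply be backfilled with the current global defaults (defaultEntThresh and defaultAmpThresh are placeholder names, not from the source):

```matlab
% Sketch only: backfill per-call threshold columns for detection files
% saved before these variables existed (defaultEntThresh/defaultAmpThresh
% are hypothetical names for the current global settings).
if ~ismember('EntThresh', handles.data.calls.Properties.VariableNames)
    handles.data.calls.EntThresh = repmat(defaultEntThresh, height(handles.data.calls), 1);
end
if ~ismember('AmpThresh', handles.data.calls.Properties.VariableNames)
    handles.data.calls.AmpThresh = repmat(defaultAmpThresh, height(handles.data.calls), 1);
end
```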

DrCoffey commented 3 years ago

Hey Gabi, this looks great, but to merge this request with the main branch I would like to see the following changes.

  1. Remove the power warning from the opening sequence. This can go in the documentation.
  2. When loading an old detection file (with no individual thresholds), don't ask whether the global settings should be applied; just apply them, save the file, and show a message that the file has been updated (see the sketch after this list).
  3. Make sure newly detected files include the global thresholds for Ent and Amp, so step 2 doesn't have to run when you load a newly detected file.
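A rough sketch of what that silent update could look like (variable and field names here are assumptions, not taken from the source):

```matlab
% Sketch only: when an old detection file is loaded, apply the global
% thresholds to every call, resave the file, and notify the user once.
nCalls = height(Calls);
Calls.EntThresh = repmat(globalEntThresh, nCalls, 1);   % hypothetical globals
Calls.AmpThresh = repmat(globalAmpThresh, nCalls, 1);
save(detectionFilePath, 'Calls', '-append');            % update the .mat in place
msgbox('Detection file updated with default per-call thresholds.');
```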
gcalongi commented 3 years ago

I believe I have addressed all of your concerns! Let me know if you notice any further issues or want any additional adjustments.

DrCoffey commented 3 years ago

Hey Gabi,

When I load a detection file, DeepSqueak displays every single call in reverse order, which is very slow. Is this happening on your end too?

-Kevin

gcalongi commented 3 years ago

Yes - this was originally intentional on my part: it was the quickest way (programmatically speaking) to apply the global entropy and amplitude thresholds when loading an old file. On my end the process isn't prohibitively slow, so I never changed it (and it shouldn't happen at all once files have been updated to the new format). Now that my understanding of the code has improved, though, I went ahead and adjusted it - everything happens behind the scenes with a progress bar so the user knows something is busy. It should be quicker now!
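For reference, the behind-the-scenes pass could look roughly like this (names are placeholders and the per-call work is only indicated by a comment):

```matlab
% Sketch only: update every loaded call without rendering it in the focus
% window, with a waitbar so the user knows the app is busy.
h = waitbar(0, 'Applying global thresholds to loaded calls...');
nCalls = height(handles.data.calls);
for i = 1:nCalls
    handles.data.calls.EntThresh(i) = globalEntThresh;   % hypothetical globals
    handles.data.calls.AmpThresh(i) = globalAmpThresh;
    % ...any threshold-dependent per-call values would be recomputed here...
    waitbar(i / nCalls, h);
end
close(h);
```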

DrCoffey commented 3 years ago

It now seems to be caught in an infinite loop in calculate stats. I think this can happen when iteratively lowering the threshold, but I thought I had fixed that. Is there a reason you are running calculate stats on every call when loading a file? My thought was that you would just add two variables (tonality threshold and amplitude threshold) to the detection file and set them to the default values; then calculate stats only needs to run when you actually display a call or change the slider. I can take a closer look at this code when I get a chance, but it might take me a little while. I would like to add this feature to the main branch, but I might end up coding it myself so I really understand what is happening.

gcalongi commented 3 years ago

Yes - I noticed there was a merge back on Jul 23 that eliminated the break in CalculateStats if iter > 10, and I was wondering about that. I imagine that's where you're getting stuck.
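For context, the guard being discussed has roughly this shape (the loop body is a paraphrase with placeholder names, not the actual CalculateStats code):

```matlab
% Sketch only: cap the threshold-lowering loop so it cannot spin forever
% when no points survive the current entropy threshold.
goodPoints = find(tonality > entThresh);   % placeholder names throughout
iter = 0;
while isempty(goodPoints)
    entThresh = entThresh - 0.05;          % iteratively lower the threshold
    goodPoints = find(tonality > entThresh);
    iter = iter + 1;
    if iter > 10
        break                              % bail out rather than loop indefinitely
    end
end
```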

The reason I run it when loading an old file is to get all of the loaded calls onto the same statistical footing automatically. The way it used to work, as soon as the focus window pulled up the very first loaded call, it automatically applied a default Entropy and Amplitude Threshold, even if the user was just scrolling through detections and not intending to change anything. That meant only calls that were manually viewed had the default global settings applied; any call that never got focused kept the original calculations, which did not include any applied thresholds. This seemed misleading to me: you end up with a Calls table in which some calls are calculated one way and some another, and if that structure gets saved and used for post-processing, you are comparing apples to oranges. So I thought it made more sense to only change the Calls table when the user makes an explicit change (by adjusting the box, adjusting the slider, or applying different thresholds through the menu) - but in that case, every call needs to start with those thresholds applied as soon as it is first loaded from an old file.

Furthermore, if I'm going to introduce Entropy and Amplitude Thresholds as values that can vary by individual call, then every call needs to start somewhere anyway, and the only way to derive the variables that those thresholds affect is to run CalculateStats on all incoming calls.

This won't matter in the future, since the thresholds are now integrated into SqueakDetect and the defaults will be applied at the call classification stage. This step just retroactively applies the Entropy and Amplitude thresholds to files that weren't run that way, so their calls can be standardized.

DrCoffey commented 3 years ago

@gcalongi I see. In the new version all statistics were removed from the detection file, and it really shouldn't be used for analysis outside of DeepSqueak. The call statistics are generated on demand using calculate_stats, either for display, clustering, or creating a stats output; those output "excel" files should be used for post-hoc analysis. So ideally the new per-call thresholds would just sit in the detection file as defaults until they are accessed for display. I'll try to look over your implementation more closely in the coming days and get back to you.

gcalongi commented 3 years ago

@DrCoffey Stripping away the power stuff is fine, but EntThresh and AmpThresh still have to be recorded for every individual call in order to save those individualized settings and use them in post-processing steps like the clustering method, which uses contours based on the tonality and amplitude levels. Since they differ for every call in every file, they need to travel with the calls in the detections.mat files rather than live only in the general settings, which contain just one global value each for tonality and amplitude.

Also, I believe the hang-up in the "lowering threshold" loop was actually due to a dimension issue: max(I) was taken to get the max at each time point, but occasionally I's first dimension is 1, which caused amplitude to contain a single value (the overall max). I fixed that in an additional commit, which will hopefully help avoid getting caught in that loop.
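The fix amounts to forcing the reduction dimension, along these lines (the surrounding CalculateStats context is assumed):

```matlab
% Before (sketch): max over the first non-singleton dimension, so a 1-row
% matrix collapses to a single overall maximum.
% amplitude = max(I);
% After: always reduce along dimension 1, giving one value per time point
% even when size(I, 1) == 1.
amplitude = max(I, [], 1);
```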

DrCoffey commented 3 years ago

Hey Gabi, I have been working on this a bunch today and found a number of little issues with the power situation (like adding new boxes and merging old and new files) which are now fixed.

More importantly, I have also decided I am not going to add individual tonality adjustment to the main branch. Instead I am going to focus on optimizing automatic contour extraction. Here are the two main reasons:

  1. We often work with datasets that include tens of thousands of calls. It isn't practical to manually adjust that many thresholds, and adding a manual step would slow analysis down dramatically.

  2. We are about to start work on real-time detection and classification. For classification to happen in real time, contour extraction must be automatic, since calls can be produced at a rate of several per second. A manual step would defeat the purpose.

In the end the goal is the same: the best possible contour extraction for each call. I think I have found a method to do it automatically, and I will be sure to share it with you as soon as I'm ready!

gcalongi commented 3 years ago

@DrCoffey Hey again - I went ahead and stripped out the individual-tonalities work while preserving some small tweaks that came out of the power-calculation clean-up, some formatting, and a fix for a bug in CalculateStats involving the amplitude variable. If you're okay with these final minor changes and accept this pull, that will help me keep my master in step with yours, and I can keep the individual-tonalities capability on a separate branch in my repository.

Let me know if there's anything else you want me to do on this front, otherwise I'll just look forward to whatever you end up doing with the new contouring!

DrCoffey commented 3 years ago

@gcalongi I'll take a look at this and see what is different. I think I caught that same amplitude issue and have also fixed a number of small bugs recently. If this isn't easy to merge, perhaps we can do a quick zoom chat and I can just implement your fixes directly to the master. I'll let you know early next week.

DrCoffey commented 3 years ago

Hey Gabi, I looked through all of the edits and I have them fixed on my end. I am going to close this pull request because I don't want to merge the two versions that have similar but slightly different fixes. Check out the current Master and see if I missed anything you changed.