sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0

Certain filename patterns prevent importing SLEAP H5 files #361

Open NSGregory opened 1 month ago

NSGregory commented 1 month ago

Describe the bug

simba.utils.errors.NoFilesFoundError: SIMBA NO FILES FOUND ERROR: SimBA could not locate a video file in your SimBA project for data file TPT_TOP.v001.016_G1F3_G1F4_cfa_2wk.analysis

Filenames with the underscore character ("_") in the name of the SLEAP project generate the above error after importing the videos and then trying to import the SLEAP H5 file.

To Reproduce

Steps to reproduce the behavior:

  1. Import videos with "_" in the filename for the project
  2. Attempt to import the analysis files from SLEAP

Desktop (please complete the following information):

"TPT_TOP.v001.016_G1F3_G1F4_cfa_2wk.analysis" does not work TPTTOP.v001.016_G1F3_G1F4_cfa_2wk.analysis works perfectly

Presumably this is related to the fact that SLEAP separates the project name from the individual video names with a "_"

sronilsson commented 1 month ago

Hi @NSGregory! Thanks for reporting - I have to check this.. but if you have multiple animals, then we need to pair each track with the correct animal across videos, so SimBA needs to bring up the video associated with your imported SLEAP data file.

The code looks in the project_folder/videos directory for the video file associated with the data file being imported, using the file-names to pair them. Do you have a video file in project_folder/videos named TPT_TOP or TPTTOP?
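The pairing step described above can be sketched roughly as follows. This is a minimal illustration, not SimBA's actual implementation: the function name `find_video_for_data_file` and the `VIDEO_EXTENSIONS` set are assumptions made up for this example.

```python
from pathlib import Path
from typing import Optional, Union

# Assumption: the set of accepted video extensions is illustrative only.
VIDEO_EXTENSIONS = {".mp4", ".avi"}


def find_video_for_data_file(video_dir: Union[str, Path], video_name: str) -> Optional[Path]:
    """Return the video in video_dir whose file stem equals video_name, else None.

    Hypothetical helper: pairs a cleaned data-file name with a video by name only.
    """
    for path in sorted(Path(video_dir).iterdir()):
        if (
            path.is_file()
            and path.suffix.lower() in VIDEO_EXTENSIONS
            and path.stem == video_name
        ):
            return path
    return None
```

With a lookup like this, any leftover project prefix in the cleaned name means no stem matches, which would be consistent with the "could not locate a video file" error.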

NSGregory commented 1 month ago

The videos were in the correct location when I ran into the issue

The video name is G1F3_G1F4_cfa_2wk.mp4
The project name is TPT_TOP.v001.slp

The naming convention for the H5 file appears to be {project name}.{video number in project}_{video name}

So actually changing the H5 file to the wrong name made it work.
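The naming convention described above can be made concrete with a small parser. This is only a sketch under the assumption that the convention holds as observed in this thread; `SleapAnalysisName` and `parse_sleap_analysis_name` are hypothetical names, not part of SLEAP or SimBA.

```python
import re
from typing import NamedTuple, Optional


class SleapAnalysisName(NamedTuple):
    project: str      # e.g. "TPT_TOP.v001"
    video_index: str  # e.g. "016", kept as text to preserve zero padding
    video_name: str   # e.g. "G1F3_G1F4_cfa_2wk"


# Assumed layout: {project}.{digits}_{video name}[.analysis][.h5]
# The greedy project group anchors on the LAST ".digits_" occurrence.
_PATTERN = re.compile(
    r"^(?P<project>.+)\.(?P<index>\d+)_(?P<video>.+?)(?:\.analysis)?(?:\.h5)?$"
)


def parse_sleap_analysis_name(filename: str) -> Optional[SleapAnalysisName]:
    """Split a file name into its parts, or return None if it does not fit."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return SleapAnalysisName(match["project"], match["index"], match["video"])
```

Because the split point is the ".{digits}_" separator rather than the first underscore, underscores in the project name do not affect the result. A video name that itself contains a ".{digits}_" sequence would still be split in the wrong place.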

sronilsson commented 1 month ago

Ah got it - thanks, let me try this out.

sronilsson commented 1 month ago

And yes - I can see the issue in the logic of the SLEAP filename cleaning function that tries to tease out the video name: it breaks when there are underscores in the project name.

>>> clean_sleap_file_name("TPT_TOP.v001.016_G1F3_G1F4_cfa_2wk.analysis")
'TOP.v001.016_G1F3_G1F4_cfa_2wk'
>>> clean_sleap_file_name("TPTTOP.v001.016_G1F3_G1F4_cfa_2wk.analysis")
'G1F3_G1F4_cfa_2wk'

sronilsson commented 1 month ago

@NSGregory what do you think of something like this instead? Would it work on your end, or can you see any flaws?

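def clean_sleap_file_name(filename: str) -> str:
    # Assumes names like TPT_TOP.v001.016_G1F3_G1F4_cfa_2wk.analysis
    if (".analysis" in filename) and ("_" in filename) and (filename.count('.') >= 3):
        filename_parts = filename.split('.')
        # The third dot-separated part holds "{video number}_{video name}".
        video_num_name = filename_parts[2]
        if '_' in video_num_name:
            # Drop the video number; keep everything after the first underscore.
            return video_num_name.split('_', 1)[1]
        else:
            return filename
    else:
        # Not recognized as a SLEAP analysis name; return it unchanged.
        return filename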

It works on my test cases and your video file names, but admittedly I have not seen a lot of SLEAP H5 file names, and there may be some other cases I don't know about.
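A quick way to exercise the proposed function is to run it over a handful of names. The sketch below copies the function from the comment above. The first two inputs are the file names from this thread; the last two are made-up names (assumptions, not real SLEAP exports) chosen to probe the edges.

```python
def clean_sleap_file_name(filename: str) -> str:
    # Copied from the proposal in this thread.
    if (".analysis" in filename) and ("_" in filename) and (filename.count('.') >= 3):
        filename_parts = filename.split('.')
        video_num_name = filename_parts[2]
        if '_' in video_num_name:
            return video_num_name.split('_', 1)[1]
        else:
            return filename
    else:
        return filename


CASES = {
    # File names reported in this thread.
    "TPT_TOP.v001.016_G1F3_G1F4_cfa_2wk.analysis": "G1F3_G1F4_cfa_2wk",
    "TPTTOP.v001.016_G1F3_G1F4_cfa_2wk.analysis": "G1F3_G1F4_cfa_2wk",
    # Hypothetical: a name without the SLEAP pattern passes through unchanged.
    "my_video": "my_video",
    # Hypothetical: a dot inside the video name cuts the result short.
    "proj.v001.000_my.video.analysis": "my",
}

for name, expected in CASES.items():
    result = clean_sleap_file_name(name)
    assert result == expected, (name, result)
```

The last case suggests that a video name containing a dot would be truncated, so it may be worth checking whether such names occur in practice.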

sronilsson commented 1 month ago

Hi @NSGregory - when you get a chance, if you update simba with pip install simba-uw-tf-dev --upgrade, how does the import look on your end with the file names it previously struggled with?

NSGregory commented 1 month ago

Hi Simon, I am traveling, but I will give it a try as soon as I can. Thanks for the quick response.

NSGregory commented 1 month ago

The change works for loading the tracking data. I got a new warning when starting up simba though, not sure if it's related. Didn't seem to impact function at all but I didn't try any other functions yet.

(simba) C:\Users\Nick>simba
C:\Users\Nick\anaconda3\envs\simba\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
C:\Users\Nick\anaconda3\envs\simba\lib\site-packages\numpy\.libs\libopenblas.NOIJJG62EMASZI6NYURL6JBKM4EVBGM7.gfortran-win_amd64.dll
C:\Users\Nick\anaconda3\envs\simba\lib\site-packages\numpy\.libs\libopenblas.WCDJNK7YVMPZQ2ME2ZZHJJRJ3JIKNDB7.gfortran-win_amd64.dll
  stacklevel=1)

sronilsson commented 1 month ago

Thanks @NSGregory - if you hit some error let me know, or try in a fresh conda environment. I pinned a lot of dependency versions yesterday (including numpy which the warning is about) so there may be duplicate versions installed in your python environment for some reason.
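One way to check for the leftover files the warning refers to is to list the OpenBLAS DLLs in the package's `.libs` folder. This is a rough sketch: `list_openblas_dlls` is a hypothetical helper, and the `.libs` location is taken from the paths in the warning above, so it may not apply to other numpy builds or platforms.

```python
import importlib.util
from pathlib import Path
from typing import List


def list_openblas_dlls(package: str = "numpy") -> List[Path]:
    """List libopenblas*.dll files in <package>/.libs, without importing the package.

    Returns an empty list if the package or its .libs folder is not found.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return []
    # Assumption: the DLLs sit in a ".libs" folder next to the package's __init__.py.
    libs_dir = Path(spec.origin).parent / ".libs"
    if not libs_dir.is_dir():
        return []
    return sorted(libs_dir.glob("libopenblas*.dll"))


if __name__ == "__main__":
    for dll in list_openblas_dlls():
        print(dll)
```

More than one entry in the output would match the situation the warning describes.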

NSGregory commented 1 month ago

A clean install fixed the warning.