DeepLabCut / napari-deeplabcut

a napari plugin for labeling and refining keypoint data within DeepLabCut projects
GNU Lesser General Public License v3.0

Unable to load h5 from napari into DLC #30

Closed GxHam closed 1 year ago

GxHam commented 2 years ago

Hi,

Thank you so much for creating napari. It really speeds up the labeling of images before DLC training. However, it seems that the h5 file it produces cannot be read by DLC. I realized the issue might be that the DLC h5 contains the folder names, whereas the h5 produced by napari only has the file names (see the snapshot of the csv below). Is there a way to correct this?

Left: h5 from DLC; right: h5 from napari. [image]

Also, just opening a folder containing a .h5 from DLC does not seem to work properly. I need to open the h5 separately as a file before the labels from the DLC GUI appear in napari. This might be the same issue as above, though.

Regards, Gao Xiang Ham

GxHam commented 2 years ago

For anyone else encountering this issue, I've worked around this by manually adding the folder-related columns into the csv from napari, and converting it back into a .h5 using deeplabcut.convertcsv2h5(config_path, userfeedback=True).
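A rough sketch of the manual csv edit described above, using only the standard library. The csv layout assumed here (three header rows named scorer, bodyparts and coords, followed by one row per image) and the toy values are assumptions for illustration, and `add_folder_columns` is a hypothetical helper, not part of DLC or napari-deeplabcut. Check the result against a csv written by DLC itself before converting it.

```python
import csv
import io


def add_folder_columns(rows, folder, n_header=3):
    """Insert 'labeled-data' and the folder name before each file name.

    Assumption: the first n_header rows are header rows; they get two
    blank cells so that they stay aligned with the widened data rows.
    """
    out = []
    for i, row in enumerate(rows):
        if i < n_header:
            out.append([row[0], "", ""] + row[1:])
        else:
            out.append(["labeled-data", folder] + row)
    return out


# Toy csv in the assumed file-name-only layout (all values are made up).
napari_csv = (
    "scorer,me,me\n"
    "bodyparts,nose,nose\n"
    "coords,x,y\n"
    "img001.png,1.0,2.0\n"
)
rows = list(csv.reader(io.StringIO(napari_csv)))
fixed = add_folder_columns(rows, "video1")

# Write the widened table back out as csv text.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(fixed)
fixed_csv = buffer.getvalue()
```

The widened csv would then go through deeplabcut.convertcsv2h5(config_path, userfeedback=True) as described above.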

ablot commented 1 year ago

The issue still seems to be present. As mentioned here: https://forum.image.sc/t/deeplabcut-2-3rc1-a-extract-frame-b-not-backward-compatible-c-create-training-data-fails/73520, it appears to come from write_hdf. There is a comment at L21: https://github.com/DeepLabCut/napari-deeplabcut/blob/ee44e0fb55ee26f824d18d47711259f44a565cd7/src/napari_deeplabcut/_writer.py#L21

jeylau commented 1 year ago

@ablot let me take another look at it and I'll get back to you!

ablot commented 1 year ago

The fix from @GxHam still works.

So it seems that the napari csv/hdf files are missing the initial labeled-data column and the second column with the folder name.

jeylau commented 1 year ago

@ablot, what's the version of napari-deeplabcut that you're using? I cannot reproduce the error.

ablot commented 1 year ago

I followed the instructions on this page: https://deeplabcut.github.io/DeepLabCut/docs/PROJECT_GUI.html and used DLC 2.3rc3. This installed napari-deeplabcut 0.0.8.

Should I update either of them?

jeylau commented 1 year ago

That looks right to me. Could you please describe simple steps that reproduce the bug? I tried loading both old and new data files, and both are written with the right number of columns; there must be something I'm missing.

ablot commented 1 year ago

I've seen the new test_reader, but I think my issue is with the writer. If I label data with the GUI and press Ctrl+S, the output csv lacks the first two columns and the h5 file looks like this: [image]

Is that expected? The index is a MultiIndex but with a single level (file name). That crashes in the next step, when DLC tries to access level[1] to format the path (which my version of the GUI does not save).

jeylau commented 1 year ago

The new tests for the reader are unrelated; I did test the writer. Do you label the data from scratch, or are you loading past annotations?

ablot commented 1 year ago

I started from scratch with a fresh install yesterday. I've just checked with a brand new project (created and extracted via the GUI) and I get the same output after saving labels (df.index is a MultiIndex with just the file name).

jeylau commented 1 year ago

That's odd; I really cannot replicate it. I'll write tests for the writer plugin, so maybe we can understand what's wrong on your machine 😕

ablot commented 1 year ago

Just to be sure, what is the expected output? Should the dataframe saved as an h5 file have a MultiIndex like ('labeled-data', parent_folder, file name), or is that the old style and it should just be the file name?

The current version of the writer uses: df.index = [meta["paths"][i] for i in df.index], which for me are just the file names.

jeylau commented 1 year ago

Yes, you're correct: a tuple like ('labeled-data', parent_folder, file name). What's intriguing to me is that it seems to fail here https://github.com/DeepLabCut/napari-deeplabcut/blob/e0caf38eeb1d09ef3c7dfcf49252ae92717e907a/src/napari_deeplabcut/_reader.py#L92 When starting to label, you're loading a folder nested within labeled-data, right?

ablot commented 1 year ago

Ah, that's it. For some reason filepath has a dirty mixture of file separators (I'm on a Windows computer). Example filepath parsed in _reader.py: C:/Users/blota/test-2022-12-02/labeled-data/eye_cam_2022-11-25T15_38_34\img22330.png

ablot commented 1 year ago

Replacing L92 by: relpath = Path(filepath).parts[-3:] seems to fix the issue if you are fine with using pathlib.
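To illustrate why pathlib helps here, a small sketch using the example path above. PureWindowsPath is used so that the snippet behaves the same on any OS (on Windows, Path resolves to the Windows flavour). The naive split on "/" is only a hypothetical stand-in for separator-sensitive parsing, not necessarily what _reader.py does at L92.

```python
from pathlib import PureWindowsPath

# Example path from this thread, mixing "/" and "\" separators.
filepath = (
    r"C:/Users/blota/test-2022-12-02/labeled-data/"
    r"eye_cam_2022-11-25T15_38_34\img22330.png"
)

# Hypothetical separator-sensitive parsing: splitting on "/" alone leaves
# the folder and the file name glued together in the last component.
naive = tuple(filepath.split("/")[-3:])

# Windows-flavoured pathlib treats both "/" and "\" as separators, so the
# last three parts are ('labeled-data', folder, file name).
relpath = PureWindowsPath(filepath).parts[-3:]

print(naive)
print(relpath)
```

With the mixed separators, the naive version yields the project folder, 'labeled-data', and a single glued "folder\file" string, while the pathlib version yields the three-part tuple expected for the index.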

jeylau commented 1 year ago

> Ah that's it. For some reason filepath has a dirty mixture of filesep (I'm on a windows computer). Example filepath parsed in _reader.py: C:/Users/blota/test-2022-12-02/labeled-data/eye_cam_2022-11-25T15_38_34\img22330.png

That's really strange; great catch though! Your fix looks good! Would you like to open a PR yourself, or would you rather I do it?

ablot commented 1 year ago

No, it's fine, you can add it. It'll be faster that way.

jeylau commented 1 year ago

Thanks a lot for helping me troubleshoot!

jeylau commented 1 year ago

Fixed with 23463dedc0bbd58be052b51ea48bb1d8140719bf