visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at an unparalleled scale.

[Bug]: Labels not loaded into gallery view #185

Closed dnth closed 1 year ago

dnth commented 1 year ago

What happened?

I ran the sample notebook to analyze an image classification dataset and found that the labels are not loaded into the similarity gallery view.

Here's a screenshot of the error:

image

However, fd.similarity() returns the labels:

image

What did you expect to see?

Labels are shown in the gallery view.

What version of fastdup were you running on?

0.926

What version of Python were you running on?

Python 3.10

Operating System

Google Colab

Reproduction steps

Run https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/analysing-image-classification-dataset.ipynb


dbickson commented 1 year ago

Hi @dnth, I think you need to add label_col='label' to tell the report to include the labels as well. Please try it out and let us know if this works.

dnth commented 1 year ago

I ran fd.vis.similarity_gallery(label_col='label') and got another error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/fastdup/__init__.py", line 2300, in create_similarity_gallery
    ret = do_create_similarity_gallery(similarity_file, save_path, num_images, lazy_load, get_label_func,
  File "/usr/local/lib/python3.10/dist-packages/fastdup/galleries.py", line 1777, in do_create_similarity_gallery
    df = find_label(get_label_func, df, 'from', 'label', kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fastdup/galleries.py", line 104, in find_label
    assert False, f"Found str label {get_label_func} but it is neither a file nor a column name in the dataframe {df.columns}"
AssertionError: Found str label label but it is neither a file nor a column name in the dataframe Index(['distance', 'from', 'to'], dtype='object')
-1

dnth commented 1 year ago

I noticed this happens not only with the similarity gallery but with all the other galleries as well.
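
For readers following the traceback above, here is a minimal sketch of the kind of check the assertion message describes: a string label is accepted only if it is a file path or a column of the dataframe. This is an illustration only, not fastdup's actual code. The helper name resolve_label_source and the use of ValueError are assumptions made for the example. The column list is copied from the error message.

```python
import os


def resolve_label_source(label, columns):
    """Classify a string label argument as a file or a dataframe column.

    Hypothetical helper for illustration; it is NOT fastdup's implementation.
    """
    if os.path.isfile(label):
        return "file"
    if label in columns:
        return "column"
    raise ValueError(
        f"Found str label {label} but it is neither a file "
        f"nor a column name in the dataframe {list(columns)}"
    )


# Columns reported in the traceback: there is no 'label' column.
similarity_columns = ["distance", "from", "to"]

try:
    resolve_label_source("label", similarity_columns)
except ValueError as err:
    print(err)

# If the dataframe carried a 'label' column, the same lookup would succeed.
print(resolve_label_source("label", similarity_columns + ["label"]))
```

Under these assumptions, the failure is consistent with the similarity dataframe reaching the gallery code without the label column, rather than with a wrong argument name.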

dnth commented 1 year ago

@dbickson I'm on v0.928 and I'm still encountering this

image

image

dbickson commented 1 year ago

Fixed in 0.930

dbickson commented 1 year ago

Should be fixed in 1.3 on Mac and 1.4 on Ubuntu.