visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.

[Bug]: fastdup.search error: Failed to merge similarity/outliers with atrain_features.dat.csv file #184

Closed: roi2405 closed this issue 1 year ago

roi2405 commented 1 year ago

What happened?

After using fastdup.search, using fastdup.create_duplicates_gallery raised the error specified in the title.

What did you expect to see?

An html file that contains the duplication gallery.

What version of fastdup were you running on?

0.925

What version of Python were you running on?

Python 3.10

Operating System

Google Colab

Reproduction steps

Search an image in an image gallery. Create a duplicates gallery with the dataframe you get from the search function.

Relevant log output

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/fastdup/__init__.py", line 1064, in create_duplicates_gallery
    ret = do_create_duplicates_gallery(similarity_file, save_path, num_images, descending, lazy_load, get_label_func, slice, max_width, get_bounding_box_func,
  File "/usr/local/lib/python3.10/dist-packages/fastdup/galleries.py", line 333, in do_create_duplicates_gallery
    df = merge_with_filenames(df, filenames)
  File "/usr/local/lib/python3.10/dist-packages/fastdup/utils.py", line 322, in merge_with_filenames
    assert len(df), "Failed to merge similarity/outliers with atrain_features.dat.csv file"
AssertionError: Failed to merge similarity/outliers with atrain_features.dat.csv file

Attach a screenshot [Optional]

[Screenshot: Capture4]

The relevant code is:

    fastdup.init_search(5, work_dir=work_dir, license='XXXXXX')
    df = fastdup.search(image_to_search, None, verbose=True)
    fastdup.create_duplicates_gallery(df, ".", input_dir=input_dir, work_dir=work_dir)

The error occurs after executing fastdup.create_duplicates_gallery(df, ".", input_dir=input_dir, work_dir=work_dir). The error message is specified in the relevant log output above.
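For readers following the traceback: the assertion fires when the merge between the similarity rows and the filename table returns zero rows. The snippet below is a minimal sketch of that failure mode in plain Python, not fastdup's actual implementation. The function name `merge_with_filenames_sketch` and the column names (`from`, `to`, `distance`, `index`, `filename`) are assumptions made for illustration, and whether the dataframe returned by the search call really carries mismatched keys is a guess.

```python
# Illustrative only: a pure-Python stand-in for an inner join that comes back empty.
# The function name and column names are hypothetical, not taken from fastdup.

def merge_with_filenames_sketch(similarity_rows, filename_rows):
    """Inner-join similarity rows to filenames on a numeric image index."""
    index_to_name = {row["index"]: row["filename"] for row in filename_rows}
    merged = []
    for row in similarity_rows:
        # Keep a row only when both endpoints resolve to a known filename.
        if row["from"] in index_to_name and row["to"] in index_to_name:
            merged.append({
                "from": index_to_name[row["from"]],
                "to": index_to_name[row["to"]],
                "distance": row["distance"],
            })
    # Same style of guard as the assertion shown in the traceback.
    assert len(merged), "Failed to merge similarity/outliers with atrain_features.dat.csv file"
    return merged


filenames = [
    {"index": 0, "filename": "a.jpg"},
    {"index": 1, "filename": "b.jpg"},
]

# Keys that match the filename table: the join succeeds.
ok = merge_with_filenames_sketch([{"from": 0, "to": 1, "distance": 0.97}], filenames)

# Keys that do not match (here, paths instead of integer indices): the join
# is empty and the assertion fires.
try:
    merge_with_filenames_sketch(
        [{"from": "query.jpg", "to": "a.jpg", "distance": 0.97}], filenames
    )
    mismatch_failed = False
except AssertionError:
    mismatch_failed = True
```

If something like this is what happens, checking which key columns the search result carries before passing it to the gallery function would help narrow the problem down.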

Contact Details [Optional]

roi2405@gmail.com

dbickson commented 1 year ago

Hi @roi2405, it seems you are mixing v0 and v1 code. The fastdup run should be done with v0 code, namely with fastdup.run(input_dir='...', work_dir='...', turi_param='ccthreshold=0.9'). We will fix the code to support v1 as well.

dbickson commented 1 year ago

Fixed now, will be released in 0.927