visual-layer / fastdup

fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

Cannot pass `get_label_func` in `fd.vis.duplicates_gallery` #96

Closed dnth closed 1 year ago

dnth commented 1 year ago

Python version - 3.10
fastdup version - 0.214
OS - Ubuntu 20.04

I'm trying to pass `get_label_func` to `fd.vis.duplicates_gallery`:

fd.vis.similarity_gallery(get_label_func=lambda x: x.split('/')[-2])

I encountered the following error

Traceback (most recent call last):
  File "/home/dnth/anaconda3/envs/fastdupv1/lib/python3.10/site-packages/fastdup/sentry.py", line 114, in inner_function
    ret = func(*args, **kwargs)
  File "/home/dnth/anaconda3/envs/fastdupv1/lib/python3.10/site-packages/fastdup/fastdup_visualizer.py", line 224, in similarity_gallery
    create_similarity_gallery(df_sim, work_dir=self._controller.work_dir, save_path=save_dir, lazy_load=lazy_load,
TypeError: fastdup.create_similarity_gallery() got multiple values for keyword argument 'get_label_func'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[43], line 1
----> 1 fd.vis.similarity_gallery(get_label_func=lambda x: x.split('/')[-2])

File ~/anaconda3/envs/fastdupv1/lib/python3.10/site-packages/fastdup/sentry.py:120, in v1_sentry_handler.<locals>.inner_function(*args, **kwargs)
    118 except Exception as ex:
    119     fastdup_capture_exception(f"V1:{func.__name__}", ex)
--> 120     raise ex

File ~/anaconda3/envs/fastdupv1/lib/python3.10/site-packages/fastdup/sentry.py:114, in v1_sentry_handler.<locals>.inner_function(*args, **kwargs)
    112 try:
    113     start_time = time.time()
--> 114     ret = func(*args, **kwargs)
    115     fastdup_performance_capture(f"V1:{func.__name__}", start_time)
    116     return ret

File ~/anaconda3/envs/fastdupv1/lib/python3.10/site-packages/fastdup/fastdup_visualizer.py:224, in FastdupVisualizer.similarity_gallery(self, save_path, label_col, draw_bbox, num_images, max_width, lazy_load, slice, ascending, threshold, show, load_crops, **kwargs)
    222 # create gallery
    223 jupyter_html = 'JPY_PARENT_PID' in os.environ and show
--> 224 create_similarity_gallery(df_sim, work_dir=self._controller.work_dir, save_path=save_dir, lazy_load=lazy_load,
    225                           get_bounding_box_func=self._get_bbox_func() if draw_bbox and not load_crops else None,
    226                           get_label_func=self._get_label_func(label_col),
    227                           num_images=num_images, max_width=max_width, threshold=threshold,
    228                           id_to_filename_func=self._get_filneme_func(load_crops), descending=not ascending,
    229                           get_display_filename_func=self._get_disp_filneme_func(),
    230                           get_extra_col_func=None, jupyter_html=jupyter_html, slice=slice, **kwargs)
    231 self._clean_temp_dir(save_dir, html_src_path, html_dst_path, lazy_load=lazy_load)
    232 if show:

TypeError: fastdup.create_similarity_gallery() got multiple values for keyword argument 'get_label_func'
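
The traceback suggests the wrapper sets `get_label_func` itself and then forwards `**kwargs`, so a caller-supplied `get_label_func` ends up passed twice. A minimal sketch of that pattern, using hypothetical stand-in functions (`create_gallery` and `similarity_gallery` below are illustrations, not the fastdup implementations):

```python
def create_gallery(df, get_label_func=None, **kwargs):
    """Hypothetical stand-in for fastdup.create_similarity_gallery."""
    return get_label_func


def similarity_gallery(label_col=None, **kwargs):
    """Hypothetical stand-in for FastdupVisualizer.similarity_gallery."""
    # The wrapper passes get_label_func explicitly AND forwards **kwargs.
    # If the caller also put get_label_func into kwargs, the callee
    # receives the same keyword twice and Python raises TypeError.
    return create_gallery(None, get_label_func=lambda p: label_col, **kwargs)


try:
    similarity_gallery(get_label_func=lambda x: x.split('/')[-2])
    error_message = None
except TypeError as ex:
    error_message = str(ex)

print(error_message)
```

Under this reading, the error comes from the keyword collision inside the wrapper rather than from the lambda itself.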

dbickson commented 1 year ago

The new v1 API allows setting a command line argument named `annotations`, where you pass a pandas DataFrame with a column named "label" that holds the label for each image (the filename column is `img_filename`). This is a bit of a change from the previous `get_label_func`. Let us know if this is usable or if you would like to see something else.
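
As a rough sketch of the DataFrame described above, the labels that `get_label_func` used to derive from each path can be precomputed into a "label" column. The image paths here are hypothetical, and the commented-out fastdup calls are an assumption about how the DataFrame would be passed; check the linked docs for the exact column names your version expects.

```python
import pandas as pd

# Hypothetical image paths laid out as <root>/<label>/<file>.
image_paths = [
    "data/cat/001.jpg",
    "data/cat/002.jpg",
    "data/dog/001.jpg",
]


def get_label(path):
    """Same rule as the lambda in the report: the parent folder is the label."""
    return path.split('/')[-2]


# One row per image, with the column names mentioned in the comment above.
annotations = pd.DataFrame({
    "img_filename": image_paths,
    "label": [get_label(p) for p in image_paths],
})
print(annotations)

# Assumed usage with the v1 API (not executed here):
# import fastdup
# fd = fastdup.create(work_dir="work_dir", input_dir="data")
# fd.run(annotations=annotations)
# fd.vis.similarity_gallery()
```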

dnth commented 1 year ago

Is there sample code showing how to get the labels using the v1 API?

dbickson commented 1 year ago

Here is the example; look for `annotations`: https://visual-layer.readme.io/docs/analyzing-labeled-images. Let us know if it is not clear.

dbickson commented 1 year ago

Closing this one. `get_label_func()` can still be used via the 0.2 API: https://visual-layer.readme.io/docs/v02xx-api