visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.

outliers() is missing outlier filename column #154

Closed dbickson closed 1 year ago

dbickson commented 1 year ago

When I try to list the outliers, I get an error: "AttributeError: 'DataFrame' object has no attribute 'img_filename_outlier'". A screenshot is attached.

I am using version 0.917. The outlier_df DataFrame (outlier_df = fd.outliers()) clearly has no "img_filename_outlier" column, so I think this might be a compatibility problem.

Only the following columns are present: "outlier", "nearest", "distance", "index", "filename_nearest", "error_code_nearest", "is_valid_nearest".

Thanks

Best,

Zilun

Screenshot from 2023-04-20 00-11-31

dnth commented 1 year ago

I can confirm this issue on fastdup version '0.921'.

dbickson commented 1 year ago

The solution is to change img_filename_outlier to filename_outlier. @dnth can you please update the docs accordingly?

dnth commented 1 year ago

@dbickson with the current version (0.921), fd.outliers() does not produce a DataFrame with a column named filename_outlier or img_filename_outlier.

Changing img_filename_outlier to filename_outlier would therefore result in the same error.
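
For anyone hitting this across versions, a defensive lookup avoids the AttributeError by checking which column name is actually present before using it. The sketch below is a minimal illustration, not part of fastdup: the helper name `find_outlier_filename_column` is hypothetical, the candidate names are the two mentioned in this thread, and the sample column list is the one reported above for 0.917.

```python
from typing import Iterable, Optional, Sequence

# Candidate column names mentioned in this thread. Which one exists (if any)
# depends on the installed fastdup version, so this list is an assumption.
CANDIDATES = ("img_filename_outlier", "filename_outlier")


def find_outlier_filename_column(
    columns: Iterable[str],
    candidates: Sequence[str] = CANDIDATES,
) -> Optional[str]:
    """Return the first candidate present in `columns`, or None if none is."""
    available = set(columns)
    for name in candidates:
        if name in available:
            return name
    return None


# Columns reported in this thread for version 0.917.
cols_0917 = [
    "outlier", "nearest", "distance", "index",
    "filename_nearest", "error_code_nearest", "is_valid_nearest",
]

# Neither candidate is present, which matches the error described above.
print(find_outlier_filename_column(cols_0917))  # None
```

With a real result, the call would be `find_outlier_filename_column(outlier_df.columns)`. A `None` return means the installed version exposes neither name, so renaming the attribute in user code cannot help.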


dnth commented 1 year ago

In version 0.906, fd.outliers() produces a DataFrame with the img_filename_outlier column.


dbickson commented 1 year ago

Thanks @dnth for clarifying! We will fix this

dbickson commented 1 year ago

Fixed in 0.922
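
To check whether an environment already has the fix, the installed version can be compared with 0.922. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the package is assumed to be installed under the distribution name `fastdup`, and version strings are assumed to be purely numeric and dot-separated, like the ones quoted in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

FIXED_IN = "0.922"  # version reported as fixed in this thread


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a numeric dotted version such as '0.921' into (0, 921)."""
    return tuple(int(part) for part in text.split("."))


def has_fix(installed: str, fixed_in: str = FIXED_IN) -> bool:
    """True if `installed` is at or above the version that carries the fix."""
    return parse_version(installed) >= parse_version(fixed_in)


def installed_fastdup_version() -> Optional[str]:
    """Return the installed fastdup version, or None if it is not installed."""
    try:
        return version("fastdup")
    except PackageNotFoundError:
        return None


# Versions mentioned in this thread.
print(has_fix("0.917"))  # False
print(has_fix("0.921"))  # False
print(has_fix("0.922"))  # True
```

In practice, `installed_fastdup_version()` supplies the string passed to `has_fix`. Comparing integer tuples rather than floats keeps the check correct if a later release adds a third version component.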