visual-layer / fastdup

fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.

[Bug]: Confusing Warning & Info Message #257

Closed ohade closed 10 months ago

ohade commented 10 months ago

What happened?

1. Warning about non-existent tar/zip files. I believe this is caused by the video files.

pic1 pic2 pic3

--------------------------------------------------------------------------------

2. Calculation in Summary report 1:

image


Explain:

a.1. If 100% is 11,678 pictures, how can 88% also be 11,678?
a.2. If 88% are valid, why isn't the remaining 12% invalid (it shows 0%)?
b. 3,799 out of 11,678 is 32.5%, not 28.63%.
c. 7,879 out of 11,678 is 67.47%, not 59.37%.
d. 831 out of 11,678 is 7.11%, not 6.26%.
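
The arithmetic in items b to d can be checked with a short script. This is only a sketch and not fastdup code: the helper names `pct` and `implied_total` are hypothetical, and the counts and printed percentages are copied from the report quoted above. It also back-computes the total that each printed percentage would imply. That implied total is an inference from the numbers, not a confirmed explanation of how the report is computed.

```python
# Counts and printed percentages copied from the summary report in this issue.
TOTAL_REPORTED = 11_678

ROWS = {
    "b": (3_799, 28.63),
    "c": (7_879, 59.37),
    "d": (831, 6.26),
}


def pct(count: int, total: int) -> float:
    """Percentage of `total` that `count` represents."""
    return 100.0 * count / total


def implied_total(count: int, printed_pct: float) -> int:
    """Total that would make `count` equal to `printed_pct` percent."""
    return round(count / (printed_pct / 100.0))


# Hypothetical summary table: expected vs. printed percentage per row.
results = {
    label: {
        "expected_pct": pct(count, TOTAL_REPORTED),
        "printed_pct": printed,
        "implied_total": implied_total(count, printed),
    }
    for label, (count, printed) in ROWS.items()
}

for label, row in results.items():
    print(
        f"{label}: expected {row['expected_pct']:.2f}%, "
        f"printed {row['printed_pct']:.2f}%, "
        f"implied total {row['implied_total']}"
    )
```

If the implied totals cluster around one value larger than 11,678, the percentages are probably being computed against a larger file count than the one shown as 100%, which would also account for the 88% figure in item a.1.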

--------------------------------------------------------------------------------

3. Calculation in Summary report 2:

image

What did you expect to see?

Wrote inline

What version of fastdup were you running on?

1.36

What version of Python were you running on?

Python 3.10

Operating System

macOS Ventura 13.2.1

Reproduction steps

Wrote inline above

Relevant log output

added above

Attach a screenshot [Optional]

added above

Contact Details [Optional]

ohadedelstain@gmail.com

dbickson commented 10 months ago

Hi @ohade, thanks for reporting; the printouts are indeed confusing. About the tar/zip file warning: we treat mp4 and avi files as compressed files, because we extract frames from them and save the extracted frames locally. Now imagine a folder that contains frames and also a video whose extracted frames have the same or similar names: the analysis of the raw video and of the already-extracted frames may get mixed. That can result in unclear behavior, so we work on either the split frames or the video, but not on a mix of both. The warning is totally unclear; we will try to improve it.
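
To see ahead of a run whether an input folder mixes videos with image frames, a check along these lines may help. It is a minimal sketch that uses only the Python standard library and does not call fastdup: the function name `classify_input_dir` and the list of image extensions are assumptions made for illustration, and only the mp4 and avi extensions come from the explanation above.

```python
from pathlib import Path

# Video extensions named in the comment above.
VIDEO_EXTS = {".mp4", ".avi"}
# Assumption: a small illustrative set of image extensions, not fastdup's list.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def classify_input_dir(root: str) -> str:
    """Return 'mixed', 'video', 'images', or 'empty' for the files under root."""
    suffixes = {
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    }
    has_video = bool(suffixes & VIDEO_EXTS)
    has_image = bool(suffixes & IMAGE_EXTS)
    if has_video and has_image:
        return "mixed"
    if has_video:
        return "video"
    if has_image:
        return "images"
    return "empty"
```

A result of `"mixed"` would indicate the situation described above, where raw videos and image frames sit in the same input tree.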

The counting printout is buggy; we will fix it. Some non-image files are getting into the count and messing it up, maybe the videos. We will check.

ohade commented 10 months ago

Got it, thanks

dbickson commented 10 months ago

Now fixed in 1.38