fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.
input_dir accepts a Python list of filenames, so you can sample with whatever policy you like and pass the resulting list as the argument to fastdup, which will then work on only those files.
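As a rough sketch of that workaround, the snippet below draws a reproducible random sample of image paths using only the standard library. The helper name `sample_image_paths`, the extension set, and the `seed` parameter are illustrative assumptions, not part of fastdup. The commented-out call at the end shows how the list might be handed over; its exact shape is an assumption and should be checked against the fastdup version you have installed.

```python
import random
from pathlib import Path

# Hypothetical set of extensions to treat as images; adjust to your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tif", ".tiff", ".webp"}


def sample_image_paths(root, n, seed=0):
    """Return up to n randomly chosen image paths under root, as strings."""
    # Sort first so the same seed gives the same sample regardless of
    # the order in which the filesystem lists the files.
    paths = sorted(
        str(p)
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    rng = random.Random(seed)
    # Sampling without replacement; cap n at the number of files found.
    return rng.sample(paths, min(n, len(paths)))


# Assumed usage (not verified against a specific fastdup release):
# import fastdup
# files = sample_image_paths("/path/to/images", 1000, seed=42)
# fastdup.run(input_dir=files, work_dir="fastdup_out")
```

Any other policy (stratified by folder, most recent files, and so on) can replace the `rng.sample` step, since only the resulting list of filenames is passed on.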
Feature Name
Num_images ordering
Feature Description
num_images likely returns the first N images in the dataset, but this is not stated anywhere in the documentation. Two possible directions to improve:
Simple: specify that num_images returns the first N examples
Less simple: allow for selection patterns such as random sampling
Contact Information [Optional]
No response