jfilter / split-folders

🗂 Split folders with files (i.e. images) into training, validation and test (dataset) folders
MIT License

Split folders to work without a "class" hierarchy #22

Closed r02b closed 2 years ago

r02b commented 3 years ago

Since splitting data into (train, validation, test) sets is relevant to all data types, not just ones that are related to different classes, having the option to use split-folders on a general folder, i.e. one that contains the actual data and does not follow the subdir ('class1', 'class2', ...) hierarchy, would make this package relevant to a much larger crowd.

jfilter commented 2 years ago

If you want to split the files in a single folder, do the following:

  1. create an input folder, e.g., input
  2. create another folder, e.g., dummy within input and place all files in dummy.
  3. now split the folder like this: splitfolders --ratio .8 .1 .1 -- input
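Steps 1 and 2 above can also be scripted. The following is only a minimal sketch using the Python standard library; the helper name `wrap_in_dummy_class` and the folder name `my_files` are hypothetical and not part of split-folders, and the sketch copies files rather than moving them so the original folder stays untouched.

```python
import shutil
from pathlib import Path


def wrap_in_dummy_class(flat_dir, input_dir, class_name="dummy"):
    """Copy every file in flat_dir into input_dir/class_name and return that folder.

    Hypothetical helper (not part of split-folders): it only builds the
    one-class hierarchy described in steps 1 and 2.
    """
    target = Path(input_dir) / class_name
    target.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(flat_dir).iterdir()):
        # Skip subdirectories; only plain files are copied.
        if path.is_file():
            shutil.copy2(path, target / path.name)
    return target
```

After calling, for example, `wrap_in_dummy_class("my_files", "input")`, step 3 is unchanged: `splitfolders --ratio .8 .1 .1 -- input`.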

If you have an array of data, take a look at the following function from scikit-learn: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
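As a rough illustration of that route, `train_test_split` only produces two parts per call, so an 80/10/10 split takes two calls. This is a sketch under assumptions: the function name `split_three_ways`, the file names and the seed value are made up for the example.

```python
from sklearn.model_selection import train_test_split


def split_three_ways(items, seed=1337):
    """Split items roughly 80/10/10 into train, validation and test lists."""
    # First hold out 20% of the items, then cut that holdout in half.
    train, holdout = train_test_split(items, test_size=0.2, random_state=seed)
    val, test = train_test_split(holdout, test_size=0.5, random_state=seed)
    return train, val, test


# Hypothetical file names standing in for the contents of a flat folder.
files = [f"sample_{i:03d}.png" for i in range(100)]
train, val, test = split_three_ways(files)
print(len(train), len(val), len(test))  # 80 10 10
```

The same pattern works on a list of file paths, which can then be copied into train, val and test folders without any class subfolder.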

PatrickKudo commented 1 year ago

I don't think creating dummy folders is an effective solution.