Particle1904 / DatasetHelpers

Dataset Helper program to automatically select, rescale, and tag datasets (composed of images and text) for machine learning training.
MIT License
170 stars, 9 forks

Feature Request: Extract Subset move/remove subset #5

Closed: histpay2 closed this issue 8 months ago

histpay2 commented 11 months ago

I would like to request the ability to move or remove files from the input folder of the Extract Subset function, so that I can separate certain image subsets from the input folder. Let's say I wanted to separate monochrome images from the rest: in its current state, the tool only allows me to copy the subset of images filtered by that tag to an output folder.

Thank you for your hard work!

Particle1904 commented 8 months ago

Can you clarify this? I tried my absolute best not to do ANY destructive operations. That's why you always end up with a copy of the images after pretty much any kind of processing. Extract Subset, for example, makes a COPY of the filtered images in a different folder while keeping the originals intact. The same goes for all the other features: they always make a copy, even when processing the data, such as resizing or cropping.

As far as I remember (I haven't worked on the tools for 4 months), you can do some filtering on the Extract Subset page, like "monochrome AND grayscale, !dog", to copy all images that have BOTH "monochrome" and "grayscale" but NOT "dog".
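To illustrate how a filter like the one above could be evaluated, here is a minimal Python sketch. It is not the tool's actual implementation or grammar: the function name `matches` is hypothetical, and it assumes that commas separate clauses that must all hold, that `AND` joins required tags, and that a leading `!` excludes a tag.

```python
def matches(expression: str, tags: set[str]) -> bool:
    """Return True if `tags` satisfies every term of `expression`.

    Assumed grammar (an illustration only, not the tool's real parser):
    clauses are separated by commas, terms inside a clause by " AND ",
    and a leading "!" means the tag must be absent.
    """
    for clause in expression.split(","):
        for term in clause.split(" AND "):
            term = term.strip()
            if not term:
                continue  # ignore empty pieces such as a trailing comma
            if term.startswith("!"):
                # Negated term: fail if the excluded tag is present.
                if term[1:].strip() in tags:
                    return False
            elif term not in tags:
                # Required term: fail if the tag is missing.
                return False
    return True


query = "monochrome AND grayscale, !dog"
print(matches(query, {"monochrome", "grayscale", "cat"}))
print(matches(query, {"monochrome", "grayscale", "dog"}))
```

Under these assumptions, the first call matches (both required tags are present and "dog" is absent), while the second does not.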

The reason the tools don't do ANY type of destructive operation is that I personally lost 200 images from a dataset of 570 images in the early stages of the tools. I don't want anyone else to experience that... I lost days of work.

I just don't feel comfortable at all with deleting or removing files for the user.

histpay2 commented 8 months ago

I think that's understandable. Mainly, I wanted to keep images with particular tags from having to be handled at all, since I've had to go through and remove many images manually (I've probably deleted hundreds of thousands by now). I've been using the tool with some success to tag images and remove styles I don't like: I cut the extracted images, paste them into the source folder, and replace the files with duplicate names. Since the pasted files are then automatically selected, I can easily move them to the trash.

With that said, just being able to move the extracted images would save some steps when handling things in bulk, but I understand if you don't want to risk causing any issues.
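For anyone who wants the "move" behaviour outside the tool, the following is a minimal Python sketch of the idea, not a feature of DatasetHelpers. It assumes each image has a caption file with the same base name and a `.txt` extension containing comma-separated tags; the function names `read_tags` and `move_subset`, the list of image extensions, and the `dry_run` flag are all hypothetical. In keeping with the concern about destructive operations, it defaults to a dry run and never overwrites an existing file.

```python
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def read_tags(caption_path: Path) -> set[str]:
    """Parse a caption file of comma-separated tags into a set."""
    text = caption_path.read_text(encoding="utf-8")
    return {tag.strip() for tag in text.split(",") if tag.strip()}


def move_subset(input_dir: Path, output_dir: Path, tag: str,
                dry_run: bool = True) -> list[Path]:
    """Move images (and their captions) carrying `tag` to `output_dir`.

    Returns the images that were, or in a dry run would be, moved.
    """
    selected = []
    for image in sorted(input_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption = image.with_suffix(".txt")
        if not caption.is_file() or tag not in read_tags(caption):
            continue
        image_target = output_dir / image.name
        caption_target = output_dir / caption.name
        if image_target.exists() or caption_target.exists():
            continue  # never overwrite files already in the output folder
        if not dry_run:
            output_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(image), str(image_target))
            shutil.move(str(caption), str(caption_target))
        selected.append(image)
    return selected
```

Calling `move_subset(Path("input"), Path("monochrome"), "monochrome")` first only lists what would be moved; passing `dry_run=False` afterwards performs the move. Because files are moved rather than deleted, the operation can be undone by moving them back.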