jfilter / split-folders

🗂 Split folders with files (e.g. images) into training, validation and test (dataset) folders
MIT License
414 stars, 72 forks

📜 XML copy support #17

Closed: asmaamirkhan closed this issue 4 years ago

asmaamirkhan commented 4 years ago

Hello 🙋‍♀️,

import pathlib
import shutil
from os import path


def copy_files(files_type, class_dir, output, prog_bar, copy_xml=False):
    """Copies the files from the input folder to the output folder.

    If copy_xml is True, the .xml file with the same base name as each
    image is copied along with it.
    """
    # get the last part of the class directory path (the class name)
    class_name = path.split(class_dir)[1]
    for (files, folder_type) in files_type:
        full_path = path.join(output, folder_type, class_name)

        pathlib.Path(full_path).mkdir(parents=True, exist_ok=True)
        for f in files:
            if prog_bar is not None:
                prog_bar.update()
            extension = path.splitext(f)[1].lower()
            # every entry needs its leading dot to match splitext() output
            if extension in [".jpg", ".png", ".bmp", ".jpeg", ".gif"]:
                shutil.copy2(f, full_path)
                if copy_xml:
                    xml_f = path.splitext(f)[0] + ".xml"
                    # skip images that have no annotation file
                    if path.exists(xml_f):
                        shutil.copy2(xml_f, full_path)

jfilter commented 4 years ago

Hey, it should work with .xml files right now. Was there any kind of error when you tried it?

asmaamirkhan commented 4 years ago

👩‍🔬 My Data Sample

class_1
    |__ 526662229047636.jpg
    |__ 526662229047636.xml
    |__ 542621690732488.jpg
    |__ 542621690732488.xml
    |__ 558581152256050.jpg
    |__ 558581152256050.xml
    |__ 574567401623040.JPG
    |__ 574567401623040.xml
.
.
.
Other classes

✔️ Expected Possible Output

__train
|  |__ class_1
|      |__ 542621690732488.jpg
|      |__ 542621690732488.xml
|      |__ 558581152256050.jpg
|      |__ 558581152256050.xml
|      |__ 574567401623040.JPG
|      |__ 574567401623040.xml
|__val
   |__ class_1
       |__ 526662229047636.jpg
       |__ 526662229047636.xml

📖 According to VOC format

🧪 Current Output

__train
|  |__ class_1
|      |__ 542621690732488.jpg
|      |__ 542621690732488.xml
|      |__ 558581152256050.jpg
|      |__ 574567401623040.JPG
|      |__ 574567401623040.xml
|__val
   |__ class_1
       |__ 526662229047636.jpg
       |__ 526662229047636.xml
       |__ 558581152256050.xml

jfilter commented 4 years ago

Ah, now I understand your problem. It's related to #14. I will provide an update to fix this soon-ish.
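
For anyone who wants to check whether an existing split has the mismatch shown in the "Current Output" above (an image in train while its .xml ended up in val), here is a minimal sketch. It is not part of split-folders; `find_unpaired` and `IMAGE_EXTS` are hypothetical names, and it assumes each annotation shares its base name with exactly one image in the same folder.

```python
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; adjust to your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def find_unpaired(split_root):
    """Return {folder: [stems]} for files that lack their image/.xml partner."""
    root = Path(split_root)
    unpaired = defaultdict(list)
    for folder in sorted(p for p in root.rglob("*") if p.is_dir()):
        files = [f for f in folder.iterdir() if f.is_file()]
        images = {f.stem for f in files if f.suffix.lower() in IMAGE_EXTS}
        xmls = {f.stem for f in files if f.suffix.lower() == ".xml"}
        # Symmetric difference: stems where only one half of the pair is present.
        for stem in sorted(images ^ xmls):
            unpaired[folder.relative_to(root).as_posix()].append(stem)
    return dict(unpaired)
```

Calling `find_unpaired("output")` on a correctly grouped split should return an empty dict; any entry names a folder and the base names whose partner file is missing there.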

jfilter commented 4 years ago

Set group_prefix=2 to split the files in groups of two (an image plus its .xml file), so both always land in the same folder. It should work now. Please re-open this issue if not. Thanks.
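
To illustrate the idea behind splitting in groups, here is a rough sketch that keeps files with the same base name together. It is an assumption-laden illustration, not how split-folders implements group_prefix: `split_grouped` and its parameters are hypothetical, and it groups by file stem rather than by a fixed group length.

```python
import random
from collections import defaultdict
from pathlib import Path


def split_grouped(files, train_ratio=0.8, seed=1337):
    """Split files into (train, val) so files sharing a stem stay together."""
    groups = defaultdict(list)
    for f in files:
        groups[Path(f).stem].append(f)
    # Shuffle whole groups, not individual files, so pairs are never separated.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(len(keys) * train_ratio)
    train = [f for k in keys[:n_train] for f in sorted(groups[k])]
    val = [f for k in keys[n_train:] for f in sorted(groups[k])]
    return train, val
```

With the four image/.xml pairs from the data sample above and the default ratio, three pairs (six files) go to train and one pair (two files) goes to val; which pair ends up in val depends on the seed.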

asmaamirkhan commented 4 years ago

jfilter commented 4 years ago

You're right, fixed in 0.4.1. Use group_prefix from now on.

dnth commented 3 years ago

Somehow group_prefix doesn't work on the CLI but works with the Python module. Can anyone verify this?