NRCan / geo-deep-learning

Deep learning applied to georeferenced datasets
https://geo-deep-learning.readthedocs.io/en/latest/
MIT License

Resolve naming convention for duplicate image filename #516

Closed valhassan closed 7 months ago

valhassan commented 1 year ago

During tiling, an edge case can arise when two images listed in a CSV have identical filenames, which causes errors. There are legitimate reasons why identical image filenames may exist:

  1. Images can be duplicated for augmentation, which creates duplicate filenames.
  2. Images can live in different directories but share the same filename.

With this edge case, preexisting patch files are overwritten, and a 'file not found' error is raised when patch files are moved between the train and validation folders.
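To check whether a CSV is affected, one option is to group the listed image paths by filename stem before tiling. The sketch below is only an illustration: `find_duplicate_stems` and the sample paths are hypothetical and are not part of geo-deep-learning.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_stems(image_paths):
    """Map each filename stem shared by more than one entry to its row indices.

    Hypothetical helper for illustration; not a geo-deep-learning function.
    """
    groups = defaultdict(list)
    for index, path in enumerate(image_paths):
        # Only the stem matters here: directories differ, patch names would not.
        groups[Path(path).stem].append(index)
    return {stem: rows for stem, rows in groups.items() if len(rows) > 1}


# Made-up paths covering both cases: same name in different directories,
# and the same image listed twice.
paths = [
    "site_a/scene.tif",
    "site_b/scene.tif",
    "site_a/other.tif",
    "site_a/scene.tif",
]
duplicates = find_duplicate_stems(paths)
print(duplicates)
```

Any stem reported here would produce colliding patch filenames if patches are named from the image filename alone.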

A simple fix using the image index to create unique patch filenames is implemented here: https://github.com/NRCan/geo-deep-learning/commit/4f92b668c2f7be92b17ec27181947b28560689a2
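The idea behind the fix can be sketched as follows. This is a minimal illustration that assumes patches are named from the image stem plus a row and column offset; the function name `patch_filename`, its parameters, and the naming pattern are hypothetical and do not reproduce the linked commit.

```python
from pathlib import Path


def patch_filename(image_path, image_index, row, col, suffix=".tif"):
    """Build a patch filename that stays unique across same-named images.

    Hypothetical naming scheme: the image's row index in the CSV is used
    as a prefix, so two images with the same stem never collide.
    """
    stem = Path(image_path).stem
    return f"{image_index:04d}_{stem}_{row}_{col}{suffix}"


# Two images with the same filename in different directories (made-up paths).
name_a = patch_filename("site_a/scene.tif", image_index=0, row=0, col=512)
name_b = patch_filename("site_b/scene.tif", image_index=1, row=0, col=512)
print(name_a)
print(name_b)
```

Because the index comes from the image's position in the CSV, it also distinguishes an image that is listed twice for augmentation.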