Open guberti opened 2 years ago
Any update on this a year later? It's not clear how to proceed with unbalanced binary classification via TensorFlow datasets per the official tutorial, if Keras fails to understand there are no sub-directories when the labels are explicitly provided.
Describe the problem.
The docs for `image_dataset_from_directory` say the following about the `directory` argument:

This means that when `labels` is a list/tuple, we should ignore the directory structure (this makes sense, as the directory structure would only be used to generate labels).

Describe the current behavior.
However, this is not what happens. Instead, see the following code snippet from `dataset_utils.py`:

We only ignore the subdirectory structure if `labels is None`, instead of when `labels != 'inferred'`. This means that when `labels` is a list/tuple, we expect a subdirectory structure (when none exists), causing `image_dataset_from_directory` to fail in this case.

Describe the expected behavior.
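The gap between the two conditions can be shown in isolation. The sketch below is not the Keras source; `expects_subdirs_current` and `expects_subdirs_expected` are hypothetical names that only model the decision "should the directory be scanned for class subdirectories?" for the three kinds of `labels` value.

```python
def expects_subdirs_current(labels):
    """Models the condition described in this issue:
    subdirectories are skipped only when `labels is None`."""
    return labels is not None


def expects_subdirs_expected(labels):
    """Models the documented behavior:
    subdirectories are used only when `labels == "inferred"`."""
    # The isinstance guard avoids element-wise comparison for array-like labels.
    return isinstance(labels, str) and labels == "inferred"


# Compare both conditions for inferred labels, no labels, and explicit labels.
for labels in ("inferred", None, [0, 1, 1]):
    print(
        repr(labels),
        "current:", expects_subdirs_current(labels),
        "expected:", expects_subdirs_expected(labels),
    )
```

The two functions agree for `"inferred"` and `None` and disagree only for an explicit list/tuple, which is the case this issue is about.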
We should ignore the subdirectory structure if `labels` is anything other than `inferred` (i.e. make the code match what the documentation says should happen). This should be a one-line change, and I'd be happy to make a PR.

However, the existence of this issue suggests the use case where `labels` is a list/tuple is not unit tested, so it would probably be good to write a test. Would love a suggestion from someone more familiar with the codebase about how best to do this.
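One possible shape for such a test, sketched without TensorFlow so it runs anywhere: create a flat temporary directory (no class subdirectories), pass an explicit label list, and check that labels pair up with files in sorted path order. `index_flat_directory` is a hypothetical stand-in for the indexing step, not a Keras function, and the empty `.png` files are placeholders for real images.

```python
import os
import tempfile


def index_flat_directory(directory, labels, formats=(".png", ".jpg")):
    """Hypothetical helper: pair explicit labels with the files directly
    inside `directory`, ignoring any subdirectory structure."""
    names = sorted(
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
        and name.lower().endswith(formats)
    )
    if len(labels) != len(names):
        raise ValueError(
            f"Expected {len(names)} labels (one per file), got {len(labels)}."
        )
    return [(os.path.join(directory, name), label)
            for name, label in zip(names, labels)]


def run_flat_directory_check():
    """Build a flat directory of placeholder files and index it with explicit labels."""
    with tempfile.TemporaryDirectory() as tmp:
        # Files are created out of order on purpose; labels follow sorted order.
        for name in ("b.png", "a.png", "c.png"):
            open(os.path.join(tmp, name), "wb").close()
        pairs = index_flat_directory(tmp, labels=[0, 1, 1])
        return [(os.path.basename(path), label) for path, label in pairs]


result = run_flat_directory_check()
print(result)
```

In the real test suite the helper would presumably be replaced by a call to `image_dataset_from_directory` on a flat directory of valid image files, asserting that it does not raise and that the yielded labels match the list passed in.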