tensorfreitas / Siamese-Networks-for-One-Shot-Learning

Implementation of Siamese Neural Networks for One-shot Image Recognition

Dataset has same character in different labels (in one language) #22

Closed: Tastror closed this issue 1 year ago

Tastror commented 1 year ago

Characters 07, 08, and 09 in Omniglot Dataset/images_background/Ojibwe_(Canadian_Aboriginal_Syllabics) are the same character (the one that means n in Ojibwe).

character07

image

character08

image

character09

image

tensorfreitas commented 1 year ago

Hi, the Omniglot dataset was directly taken from its original repository:

https://github.com/brendenlake/omniglot

You can also see here that the same labels are present in the original: https://knowyourdata-tfds.withgoogle.com/#dataset=omniglot&tab=RELATIONS&relations=default_segment.omniglot.alphabet.value,default_segment.omniglot.label.value&relations_selected=,Ojibwe_(Canadian_Aboriginal_Syllabics)_6&group_by=default_segment.omniglot.alphabet.value&select=default_segment.omniglot.label.value,default_segment.omniglot.alphabet.value&sort_asc=true

I am not sure whether it is an error in the dataset or whether these letters are just very similar in this alphabet. Thanks for spotting it.

Tastror commented 1 year ago

Thanks for the source. I'll check the original dataset.
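
For anyone who wants to check a suspected duplicate label like this, here is a rough sketch of one possible approach: average the drawings in each character folder and compare the averages with cosine similarity. It is not code from this repository or from the Omniglot repository. The function names, the 0.9 threshold, and the toy 2x2 "images" are assumptions for illustration, and loading the real PNG files into flat pixel lists is left out. Pixel overlap is a crude signal, so a high score only means a pair deserves a manual look.

```python
from itertools import combinations
from math import sqrt


def mean_image(images):
    """Average a list of equally sized flat pixel lists into one flat list."""
    count = len(images)
    return [sum(pixels) / count for pixels in zip(*images)]


def cosine_similarity(a, b):
    """Cosine similarity of two flat pixel lists (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suspicious_pairs(classes, threshold=0.9):
    """Return (name_a, name_b, similarity) for class pairs at or above threshold.

    `classes` maps a class name (e.g. "character07") to a list of flat pixel
    lists, with ink pixels as 1.0 and background as 0.0. The threshold is an
    arbitrary starting point, not a validated value.
    """
    means = {name: mean_image(imgs) for name, imgs in classes.items()}
    pairs = []
    for a, b in combinations(sorted(means), 2):
        sim = cosine_similarity(means[a], means[b])
        if sim >= threshold:
            pairs.append((a, b, sim))
    return pairs


# Toy 2x2 "images" standing in for real drawings (hypothetical data).
toy = {
    "character07": [[1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0]],
    "character08": [[1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 1.0]],
    "character10": [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]],
}
flagged = suspicious_pairs(toy)
for name_a, name_b, sim in flagged:
    print(f"{name_a} vs {name_b}: {sim:.3f}")
```

On the toy data only the character07/character08 pair is flagged. On the real dataset, drawings of the same character vary in position and stroke width, so the threshold would likely need tuning.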