plantnet / PlantNet-300K

[NeurIPS2021] A plant image dataset with high label ambiguity and a long-tailed distribution
https://doi.org/10.5281/zenodo.5645731
BSD 2-Clause "Simplified" License

data annotation question #6

beautifulchoi closed this issue 1 year ago

beautifulchoi commented 2 years ago

Hello! While using the dataset served on your homepage, I ran into a question. The annotation label dictionary is in a JSON file, so I checked it, and some plant species appear more than once with different labels.

So I wonder whether this is an error, or whether they are actually different classes that just share the same species name?

(I found 57 species that overlap. For example, Schefflera_actinophylla appears 3 times with the same species name but different label numbers.)
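A check like the one described above could be sketched as follows. This is only an illustration: it assumes the label dictionary maps a class ID to a species name, and the IDs in the toy data are hypothetical, not taken from the dataset.

```python
from collections import defaultdict


def find_duplicate_names(id_to_name):
    """Group class IDs by species name; keep names that map to more than one ID."""
    ids_by_name = defaultdict(list)
    for class_id, name in id_to_name.items():
        ids_by_name[name].append(class_id)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Toy data with made-up class IDs (assumption: ID -> species name mapping).
# With the real file, the dict would come from json.load() on the label file.
example_labels = {
    "1001": "Schefflera_actinophylla",
    "1002": "Schefflera_actinophylla",
    "1003": "Schefflera_actinophylla",
    "1004": "Lactuca_virosa",
}

duplicates = find_duplicate_names(example_labels)
for name, ids in duplicates.items():
    print(f"{name}: {len(ids)} labels -> {ids}")
```

Counting the keys of the returned dictionary gives the number of species names shared by several labels.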

garcinc commented 2 years ago

Hi !

Did you download the latest version from Zenodo (version 1.1), or the old one (version 1.0)?

beautifulchoi commented 2 years ago

Thanks for the reply :) I think we downloaded the old version (1.0) from Zenodo, so we have to switch to the new dataset. But I still wonder whether the new version is free of this issue (the overlapping-label problem).

garcinc commented 2 years ago

If you download the new JSON file, all classes will have different species names.

Best

beautifulchoi commented 2 years ago

Thanks for the reply again :) I downloaded the new version of the dataset and checked the annotations: there were no overlapping names, so the total number of species was 1081. But when I checked in detail, I found that some species names are the same and differ only as strings, for example by one extra space (e.g. Nephrolepis cordifolia (L.) C. Presl vs. Nephrolepis cordifolia (L.) C.Presl).

This is the list I found (one group per line):

Lactuca virosa L. / Lactuca virosa Habl.
Pelargonium peltatum (L.) L'Hér. / Pelargonium peltatum (L.) Aiton
Pelargonium zonale (L.) L'Hér. / Pelargonium zonale (L.) L'Hér. ex Aiton
Tradescantia zebrina Heynh. ex Bosse / Tradescantia zebrina Bosse / Tradescantia zebrina hort. ex Bosse
Asystasia gangetica (L.) T. Anderson / Asystasia gangetica (L.) T.Anderson
Nymphaea nouchali Burm. f. / Nymphaea nouchali Burm.f.
Nephrolepis cordifolia (L.) C. Presl / Nephrolepis cordifolia (L.) C.Presl
Alliaria petiolata (M. Bieb.) Cavara & Grande / Alliaria petiolata (M.Bieb.) Cavara & Grande
Lavandula canariensis Mill. / Lavandula canariensis (L.) Mill.
Pancratium maritimum L. / Pancratium maritimum L. L.
Cirsium palustre (L.) Scop. / Cirsium palustre (L.) Coss. ex Scop.
Freesia refracta (Jacq.) Klatt / Freesia refracta (Jacq.) Eckl. ex Klatt
Tradescantia pallida (Rose) D.R. Hunt / Tradescantia pallida (Rose) D.R.Hunt
Schefflera actinophylla (Endl.) Harms / Schefflera actinophylla Harms / Schefflera actinophylla (F.Muell.) Harms
Melilotus officinalis (L.) Pall. / Melilotus officinalis (L.) Lam.
Duchesnea indica (Jacks.) Focke / Duchesnea indica (Andrews) Teschem.
Guizotia abyssinica (L. f.) Cass. / Guizotia abyssinica (L.f.) Cass.
Calendula arvensis (Vaill.) L. / Calendula arvensis L. / Calendula arvensis M.Bieb.
Gynura aurantiaca (Blume) DC. / Gynura aurantiaca (Blume) Sch.Bip. ex DC.
Schkuhria pinnata (Lam.) Thell. / Schkuhria pinnata (Lam.) Kuntze ex Thell.
Acalypha hispida Burm. f. / Acalypha hispida Burm.f.
Mussaenda philippica A. Rich. / Mussaenda philippica A.Rich.
Vanilla planifolia Andrews / Vanilla planifolia Jacks. / Vanilla planifolia Jacks. ex Andrews
Acacia saligna (Labill.) H.L.Wendl. / Acacia saligna (Labill.) Wendl.
Dryopteris carthusiana (Vill.) H.P.Fuchs / Dryopteris carthusiana (Vill.) H.P. Fuchs
Dryopteris cristata (L.) A.Gray / Dryopteris cristata (L.) A. Gray
Hebe andersonii (Lindl. & Paxton) Cockayne / Hebe andersonii (Lindl. & J. Paxton) Cockayne
Hypericum x inodorum Mill. / Hypericum x hidcoteense Hilling ex Geerinck
Kniphofia uvaria (L.) Hook. / Kniphofia uvaria (L.) Oken
Lactuca alpina (L.) Benth. & Hook.f. / Lactuca alpina (L.) A.Gray
Lamium maculatum (L.) L. / Lamium maculatum L.
Limnanthes douglasii R.Br. / Limnanthes douglasii R. Br.
Pelargonium x hybridum (L.) Aiton / Pelargonium x asperum Ehrh. ex Willd. / Pelargonium x hortorum L.H. Bailey
Sasa palmata (Burb.) Camus / Sasa palmata (Burb.) E.G.Camus
Pyracantha coccinea M.Roem. / Pyracantha coccinea M. Roem.
Sedum kamtschaticum Fisch. & C.A.Mey. / Sedum kamtschaticum Fisch.
Metasequoia glyptostroboides Hu & W.C.Cheng / Metasequoia glyptostroboides Hu & W.C. Cheng
Cymbalaria muralis P.Gaertn., B.Mey. & Scherb. / Cymbalaria muralis P.Gaertn.
Fragaria virginiana Duchesne / Fragaria virginiana Mill.
Sedum palmeri S.Watson / Sedum palmeri S. Watson
Dryopteris erythrosora (Eaton) Kuntze / Dryopteris erythrosora (D.C. Eaton) Kuntze
Lupinus nootkatensis Donn ex Sims / Lupinus nootkatensis Sims
Barbarea vulgaris R.Br. / Barbarea vulgaris W.T. Aiton
Peperomia obtusifolia (L.) A.Dietr. / Peperomia obtusifolia (L.) A. Dietr.
Alocasia cucullata (Lour.) G.Don / Alocasia cucullata (Lour.) G. Don
Alocasia macrorrhizos (L.) G.Don / Alocasia macrorrhizos (L.) G. Don
Fragaria x ananassa Duchesne ex Rozier / Fragaria x ananassa (Duchesne ex Weston) Duchesne ex Rozier
Acacia auriculiformis A.Cunn. ex Benth. / Acacia auriculiformis Benth.
Acacia podalyriifolia A.Cunn. ex G.Don / Acacia podalyriifolia G.Don
Peperomia argyreia (Miq.) E.Morren / Peperomia argyreia (Hook.f.) E.Morren
Zamia furfuracea L.f. / Zamia furfuracea L.f. ex Aiton
Fittonia albivenis (Lindl. ex Veitch) R.K. Brummitt / Fittonia albivenis (Lindl. ex Veitch) Brummitt
Selenicereus anthonyanus (Alexander) D.Hunt / Selenicereus anthonyanus (Alexander) D.R. Hunt
Liriope muscari (Decne.) L.H.Bailey / Liriope muscari (Decne.) L.H. Bailey
Mazus pumilus (Burm.f.) Steenis / Mazus pumilus (Burm. f.) Steenis
Melampodium divaricatum (Rich.) DC. / Melampodium divaricatum (Rich. ex Rich.) DC.
Tagetes lemmonii A. Gray / Tagetes lemmonii A.Gray
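Pairs in this list that differ only in the spacing of author abbreviations (such as "C. Presl" vs. "C.Presl") could be detected roughly as sketched below. The normalization rule is an assumption made for illustration: it only builds a comparison key and is not a botanical naming convention. Pairs whose author citations genuinely differ (such as "L." vs. "Habl.") are deliberately not matched.

```python
import re
from collections import defaultdict


def normalize_spacing(name):
    """Build a comparison key: collapse runs of whitespace, then drop
    whitespace that follows a period and precedes a letter or digit."""
    collapsed = re.sub(r"\s+", " ", name.strip())
    return re.sub(r"\.\s+(?=\w)", ".", collapsed)


def whitespace_variants(names):
    """Return groups of distinct names that share the same normalized key."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_spacing(name)].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# A few names taken from the list above.
names = [
    "Nephrolepis cordifolia (L.) C. Presl",
    "Nephrolepis cordifolia (L.) C.Presl",
    "Lactuca virosa L.",
    "Lactuca virosa Habl.",
]

variants = whitespace_variants(names)
for group in variants:
    print(" / ".join(group))
```

Only the Nephrolepis pair is reported here; the two Lactuca names have different authors and would need a taxonomic check rather than a string comparison.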

alexisjoly commented 2 years ago

Hello,

the list of species managed in Pl@ntNet is an aggregation of official taxonomic checklists from several countries/regions of the world (using the "International Code of Nomenclature for algae, fungi, and plants"). As we are not the managers of these checklists, it is not our role to correct synonymy problems between these different taxonomies. Even if some duplicates seem very likely, we have no guarantee that all the tuples in your list really are the same species. In the Pl@ntNet application, we therefore treat them as different classes. A post-filtering step by checklist then makes it possible to return only the correct species name.
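To make the idea of post-filtering by checklist concrete, here is a minimal sketch. It is not the Pl@ntNet implementation: the function name, the scores, and the checklist contents are all hypothetical, and it simply assumes that a checklist is a set of accepted names for one region.

```python
def filter_by_checklist(predictions, checklist):
    """Keep only predictions whose name is in the checklist, best score first.

    predictions: list of (species_name, score) pairs (hypothetical format)
    checklist:   set of names accepted by one regional checklist (assumption)
    """
    ranked = sorted(predictions, key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if name in checklist]


# Made-up scores for two near-duplicate classes and one unrelated class.
predictions = [
    ("Nephrolepis cordifolia (L.) C. Presl", 0.48),
    ("Nephrolepis cordifolia (L.) C.Presl", 0.41),
    ("Lactuca virosa L.", 0.11),
]

# Hypothetical checklist that uses only one of the two spellings.
regional_checklist = {"Nephrolepis cordifolia (L.) C.Presl", "Lactuca virosa L."}

filtered = filter_by_checklist(predictions, regional_checklist)
print(filtered)
```

In this toy case the classifier keeps both spellings as separate classes, and the filter only decides which of them is shown for the chosen checklist.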

In Pl@ntNet-300K, we wanted to stay as close as possible to the real-world problem. We consequently decided to treat the presence of these (near-)duplicate classes as part of the problem. The ML algorithm should learn the ambiguity between those classes by itself.

Kind regards, Alexis Joly