AMI-system / gbif_download_standalone

A standalone repo to download images from the GBIF database according to a species list.
MIT License

Download code handover with Katriona #20

Closed LevanBokeria closed 1 year ago

LevanBokeria commented 1 year ago

Broad summary:

The codebase is an adaptation of Rolnick Lab's Species Trainer repo.

Brief summary of changes:

Known issues:

Costa Rica list:

Some species might have unclear taxonomy on GBIF. For example, "Nepheloleuca politia" and "Nepheloleuca illiturata" seem to be two different species, with different "scientificName" values. However, in the occurrence dataframes, the "species" field for both of these is "Nepheloleuca politia".

Importantly, our download code (based on the Rolnick Lab code) uses the "species" field to save images in the appropriate folder hierarchy. Because of this, images from these two distinct species end up in the same folder, and the meta_data.json file no longer reflects reality.
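A minimal sketch of how such collisions could be detected before downloading, assuming each occurrence record is tagged with the checklist name it was queried under. The `queried_name` key and the `find_folder_collisions` helper are hypothetical and not part of the current codebase; only the "species" field comes from the GBIF occurrence data described above.

```python
from collections import defaultdict


def find_folder_collisions(records):
    """Return GBIF `species` values that more than one checklist name resolves to.

    `records` is an iterable of dicts with a hypothetical `queried_name` key
    (the name from our species list) and the GBIF `species` field.
    """
    queried_by_species = defaultdict(set)
    for rec in records:
        species = rec.get("species")
        if species:  # skip records with no species-level identification
            queried_by_species[species].add(rec["queried_name"])
    # Keep only species folders that would receive images from several checklist names
    return {
        species: sorted(names)
        for species, names in queried_by_species.items()
        if len(names) > 1
    }


# Toy records mirroring the example above
records = [
    {"queried_name": "Nepheloleuca politia", "species": "Nepheloleuca politia"},
    {"queried_name": "Nepheloleuca illiturata", "species": "Nepheloleuca politia"},
    {"queried_name": "Pyropteron chrysidiformis", "species": "Pyropteron chrysidiformis"},
]
collisions = find_folder_collisions(records)
print(collisions)
```

Any species listed in `collisions` would need a manual check before the images are saved, since the folder name alone cannot tell the checklist entries apart.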

Solutions:

GBIF updates:

I discovered that for the Sesiidae species Pyropteron chrysidiformis, my old DwC-A file from August 2023 contains working URL links to images, but the updated DwC-A file from October 2023 has no working URLs. So was the data amended or deleted? How will this impact our database? How should we update metadata.json to reflect this, or perhaps we should not reflect it at all?
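One way to start answering this might be to diff the image listings of the two snapshots. The sketch below assumes each archive's multimedia table is tab-separated with `gbifID` and `identifier` (image URL) columns; the `image_urls` and `diff_snapshots` helpers are hypothetical, and the in-memory tables stand in for the real files.

```python
import csv
import io


def image_urls(multimedia_tsv):
    """Collect (gbifID, image URL) pairs from a tab-separated multimedia table."""
    reader = csv.DictReader(multimedia_tsv, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {
        (row["gbifID"], row["identifier"])
        for row in reader
        if row.get("identifier")  # ignore rows without a URL
    }


def diff_snapshots(old_tsv, new_tsv):
    """Report which image entries disappeared or appeared between two snapshots."""
    old, new = image_urls(old_tsv), image_urls(new_tsv)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


# Toy data standing in for the August and October multimedia tables
old_snapshot = io.StringIO(
    "gbifID\tidentifier\n"
    "1\thttps://example.org/a.jpg\n"
    "2\thttps://example.org/b.jpg\n"
)
new_snapshot = io.StringIO(
    "gbifID\tidentifier\n"
    "2\thttps://example.org/b.jpg\n"
)
changes = diff_snapshots(old_snapshot, new_snapshot)
print(changes)
```

This only shows whether entries were dropped from or added to the listing. It does not check whether a URL that is still listed actually resolves, which would need a separate request per link.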

Improvements: