Public-Tree-Map / public-tree-map-data-pipeline

Scripts to process open datasets for Public Tree Map (companion repository: https://github.com/Public-Tree-Map/public-tree-map). Work in progress:
https://publictreemap.org/
MIT License

Add species synonyms #125

Closed emillipede closed 4 years ago

emillipede commented 4 years ago

for new changes, please submit all pull requests to the test-circleci branch

addresses #95

Motivation and context

SM's current tree inventory dataset uses some outdated species names. SM's arborist recognizes Cal Poly's selectree as the authoritative list of current plant botanical names. many cities in LA county also use a variety of names to refer to a common set of tree species. we want publictreemap to display the most accurate, up-to-date species names.

What I did

to update our csv, I've used selectree as my reference. where selectree documents a different name, I've used the selectree species name. in all of those cases, I retain the SM species name in the field sm_botanical_name. where selectree indicates additional species synonyms, I've documented them in the field botanical_synonyms, separated by semicolons.
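as a rough illustration of the field layout described above, here is a minimal sketch of how a reader of the csv might split botanical_synonyms into a list. the sample rows use made-up placeholder names, not real species, and the helper names (split_synonyms, read_species) are hypothetical. it also assumes sm_botanical_name is left empty when the SM name already matches selectree, which this PR does not confirm.

```python
import csv
import io

# Hypothetical sample data: the species names are placeholders, not real taxa.
# Column names follow the fields described in this PR.
SAMPLE_CSV = """botanical_name,sm_botanical_name,botanical_synonyms
Examplea nova,Examplea vetus,Examplea antiqua; Examplea prior
Samplea communis,,
"""


def split_synonyms(field):
    """Split a semicolon-separated synonyms field into a list of clean names."""
    return [name.strip() for name in field.split(";") if name.strip()]


def read_species(csv_text):
    """Read species rows and normalize the two name-history fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "botanical_name": row["botanical_name"].strip(),
            # Assumption: an empty value means the SM name was not changed.
            "sm_botanical_name": row["sm_botanical_name"].strip() or None,
            "botanical_synonyms": split_synonyms(row["botanical_synonyms"]),
        })
    return rows


species = read_species(SAMPLE_CSV)
```

stripping whitespace around each synonym means "a; b" and "a;b" are read the same way, so hand edits to the csv are less likely to create near-duplicate names.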

the SM dataset also separates members of a single recognized species into two species with unique species IDs. the city dataset includes trees identified as:

in these two cases, I'm not sure how to resolve the issue. for now, I've left the four rows intact and listed the selectree-specified species name in the botanical_name field. I don't think this approach is a long-term solution. I'd like to incorporate a synonym check into the parser for data from other LA county cities (see #114 and #123).
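one possible shape for that synonym check, sketched under assumptions: the rows below are made-up placeholders rather than real species, the function names (build_synonym_index, resolve_name) are hypothetical, and matching is a simple case-insensitive exact lookup. this is not the parser's actual code.

```python
# Hypothetical rows shaped like the csv fields described in this PR.
# The names are placeholders, not real taxa.
SPECIES_ROWS = [
    {
        "botanical_name": "Examplea nova",
        "sm_botanical_name": "Examplea vetus",
        "botanical_synonyms": ["Examplea antiqua", "Examplea prior"],
    },
    {
        "botanical_name": "Samplea communis",
        "sm_botanical_name": None,
        "botanical_synonyms": [],
    },
]


def normalize(name):
    """Lowercase and collapse whitespace so minor formatting differences match."""
    return " ".join(name.split()).lower()


def build_synonym_index(rows):
    """Map every known name (current, SM, synonym) to the current botanical name."""
    index = {}
    for row in rows:
        canonical = row["botanical_name"]
        candidates = [canonical, row.get("sm_botanical_name")]
        candidates.extend(row.get("botanical_synonyms", []))
        for name in candidates:
            if name:
                # setdefault keeps the first mapping if two rows claim the same name.
                index.setdefault(normalize(name), canonical)
    return index


def resolve_name(name, index):
    """Return the current botanical name for any known name, or None if unknown."""
    return index.get(normalize(name))


index = build_synonym_index(SPECIES_ROWS)
resolved = resolve_name("  examplea  VETUS ", index)
unknown = resolve_name("Not in the list", index)
```

returning None for unknown names leaves it to the caller to decide whether to keep the city's original name or flag the row for review. because several old names can point at one current name, a lookup like this would also map two separately listed SM species onto a single selectree name, though it does not decide what to do with their separate species IDs.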

emillipede commented 4 years ago

@cajaks2 recommended I: