gaurav / bettertaxonomy

A script for matching against multiple taxonomic sources

Write as many fields into the internal database as possible #13

Open gaurav opened 9 years ago

gaurav commented 9 years ago

Not sure how this would work: do we add every row, or do we summarize different values with one row per species name? In either case, this would help figure out whether a record refers to, say, a bird vs a beetle.

tucotuco commented 9 years ago

The way I do this now is:

If I do not have the scientificName (Genus value in my case right now) in the VertNet Classification, I add a record with that scientificName (Genus) as the key. I populate the record with every Darwin Core rank that I am given in the original, and the record is set to unchecked by default. At the classification resolution step, I look for all records that are unchecked and standardize all of the ranks, even if they differ from the original. Having the original values there gives me the context I need to pick the correct standard name to supply.
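
For illustration only, the workflow described above might look roughly like the sketch below. It is not the actual VertNet or bettertaxonomy code. The table name `classification`, the helper names (`open_db`, `add_if_missing`, `unchecked`, `standardize`), and the particular set of Darwin Core ranks stored are all assumptions made for the example. It uses an in-memory SQLite database from the Python standard library.

```python
import sqlite3

# Darwin Core rank terms to store; which ranks to keep is an assumption.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus"]


def open_db():
    """Create a hypothetical in-memory classification table keyed by name."""
    conn = sqlite3.connect(":memory:")
    # Quote column names because "order" is a reserved word in SQL.
    cols = ", ".join(f'"{rank}" TEXT' for rank in RANKS)
    conn.execute(
        f"CREATE TABLE classification ("
        f"name TEXT PRIMARY KEY, {cols}, checked INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_if_missing(conn, name, row):
    """Add an unchecked record for `name` unless that key already exists.

    `row` maps rank names to the values given in the original record.
    Returns True if a new record was inserted.
    """
    placeholders = ", ".join("?" for _ in range(len(RANKS) + 1))
    cur = conn.execute(
        f"INSERT OR IGNORE INTO classification VALUES ({placeholders}, 0)",
        [name] + [row.get(rank) for rank in RANKS],
    )
    return cur.rowcount == 1


def unchecked(conn):
    """Return the keys of all records still waiting for resolution."""
    query = "SELECT name FROM classification WHERE checked = 0 ORDER BY name"
    return [r[0] for r in conn.execute(query)]


def standardize(conn, name, standard_ranks):
    """Overwrite every rank with the standardized value and mark as checked."""
    assignments = ", ".join(f'"{rank}" = ?' for rank in RANKS)
    conn.execute(
        f"UPDATE classification SET {assignments}, checked = 1 WHERE name = ?",
        [standard_ranks.get(rank) for rank in RANKS] + [name],
    )


conn = open_db()
# First sighting of the name: stored with whatever ranks the source gave.
add_if_missing(conn, "Ctenomys", {"class": "Mammalia", "order": "Rodentia"})
# Resolution step: every unchecked record gets a full, standardized set.
for key in unchecked(conn):
    standardize(conn, key, {
        "kingdom": "Animalia", "phylum": "Chordata", "class": "Mammalia",
        "order": "Rodentia", "family": "Ctenomyidae", "genus": "Ctenomys",
    })
```

Note that this sketch overwrites the original values during standardization. If the originals need to survive past the resolution step, one option would be to keep them in a separate set of columns.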