hurlbertlab / dietdatabase

Creative Commons Zero v1.0 Universal

cleaning: specific questions about proper text fields #42

Closed: pwinner1 closed this issue 7 years ago

pwinner1 commented 7 years ago

Location_Specific

-Names/values containing parentheses do not seem to respond to the gsub function

-Should an adjective + state be placed in Location_Region or Location_Specific? Examples: 'Central Illinois', 'Northwestern North Dakota'
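On the parentheses item above, a likely cause (an assumption, since the thread does not show the actual call) is that R's gsub treats its pattern as a regular expression by default, so `(` and `)` act as grouping characters rather than literal text. The sketch below reproduces that behaviour with Python's `re` module purely as an illustration. The sample value and the helper `strip_literal` are hypothetical. In R, the analogous fixes would be passing `fixed = TRUE` to gsub or escaping the parentheses as `\\(` and `\\)`.

```python
import re


def strip_literal(text, literal):
    """Remove every literal occurrence of `literal` from `text`."""
    # re.escape makes "(" and ")" match themselves instead of forming a group
    return re.sub(re.escape(literal), "", text)


value = "Mt. Desert Island (Maine)"  # hypothetical Location_Specific entry

# Unescaped, "(Maine)" is a group that matches "Maine", so the pattern
# " (Maine)" searches for " Maine" and finds nothing to replace.
unchanged = re.sub(" (Maine)", "", value)

# Escaped, the parentheses are matched as ordinary characters.
cleaned = strip_literal(value, " (Maine)")
```

Here `unchanged` is still the original string, while `cleaned` has the parenthetical removed.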

Habitat_Type

-Unclear about these Habitat_Type entries: 'Edge of forests', 'open groves', 'orchard', 'mountains', 'canyon', 'Tropical', 'valleys', 'lowland', 'upland', 'inland', 'tundra'

Observation_Season

-How should studies that encompass all seasons be classified? 'All', 'All year', and 'Year Round' are some of the current entries. Or should all four seasons just be listed?
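Whichever label is chosen, the existing variants could be collapsed with a small lookup. This is only a sketch. The canonical label `"all"` and the names `ALL_SEASONS_LABEL`, `ALL_SEASON_VARIANTS`, and `normalize_season` are hypothetical, since the thread does not settle on a spelling.

```python
# Hypothetical canonical label; swap in whatever the project decides on.
ALL_SEASONS_LABEL = "all"

# Variants reported in the issue, compared case-insensitively.
ALL_SEASON_VARIANTS = {"all", "all year", "year round"}


def normalize_season(raw):
    """Map any 'whole year' variant to one label; leave other entries as-is."""
    entry = raw.strip()
    if entry.lower() in ALL_SEASON_VARIANTS:
        return ALL_SEASONS_LABEL
    return entry


seasons = [normalize_season(s) for s in ["All", "All year", "Year Round", "summer"]]
```

Entries outside the variant set, such as `"summer"`, pass through with only surrounding whitespace trimmed.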

Prey_Part

-Should we remove "etc" from the text field?

Study_Type

-The terms page says "Options include: emetic, fecal examination, stomach contents, behavioral observation, nest observation." How standardized are these supposed to be? Are these the only options? If so, what are:
  -studies dealing with pellet contents/analysis?
  -nest debris?
  -gizzard, esophagi, gullet, or crop contents?
  -remains analysis?

-If a study only says 'Observation', should I go look at the study to see if it was nest or behavioral observation?
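If the five options quoted from the terms page are meant as a closed vocabulary, entries that fall outside it can be listed for review. A minimal sketch, assuming case-insensitive matching; the function name `unrecognized_study_types` and the sample entries are hypothetical.

```python
# The five options quoted from the terms page.
ALLOWED_STUDY_TYPES = {
    "emetic",
    "fecal examination",
    "stomach contents",
    "behavioral observation",
    "nest observation",
}


def unrecognized_study_types(values):
    """Return the distinct entries that are not among the documented options."""
    unknown = {
        v.strip() for v in values
        if v.strip().lower() not in ALLOWED_STUDY_TYPES
    }
    return sorted(unknown)


# Hypothetical Study_Type column
entries = ["Emetic", "pellet contents", "Observation", "stomach contents"]
to_review = unrecognized_study_types(entries)
```

A bare 'Observation' is flagged because it matches neither 'behavioral observation' nor 'nest observation', which is the ambiguity raised in the item above.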

jhpoelen commented 7 years ago

(Warning: outsider "for what it is worth" perspective from http://globalbioticinteractions.org.) In addition to using human-readable locales (e.g. Alaska), did you ever consider adding controlled terms from http://geonames.org (e.g. http://www.geonames.org/5879092)? If these are added, the interaction records can be linked more easily to other datasets (think: Wikipedia) and ... you don't have to reinvent your own gazetteer.

jhpoelen commented 7 years ago

Perhaps the GEOLocate project (http://www.museum.tulane.edu/geolocate/) will provide some inspiration.

ahhurlbert commented 7 years ago

Original issues all dealt with; pushing controlled GeoNames terms to a later date.