hurlbertlab / dietdatabase

Creative Commons Zero v1.0 Universal

cleaning: specific questions about proper text fields #42

Closed pwinner1 closed 7 years ago

pwinner1 commented 7 years ago


-Names/values containing parentheses do not seem to respond to the gsub function

-Should adjective + state be placed in Location_Region or Location_Specific? Example: 'Central Illinois', 'Northwestern North Dakota'


-Unclear about Habitat_type: 'Edge of forests', 'open groves', 'orchard', 'mountains', 'canyon', 'Tropical', 'valleys', 'lowland', 'upland', 'inland', 'tundra'


-What should studies that encompass all seasons be classified as? 'All' or 'All year' or 'Year Round' are some entries currently, or should all seasons just be listed?


-Should we get rid of "etc" from the text field?


-The terms page says "Options include: emetic, fecal examination, stomach contents, behavioral observation, nest observation." How standardized are these supposed to be? Are these the only options? If so, what are:
  -studies dealing with pellet contents/analysis?
  -nest debris?
  -gizzard, esophagi, gullet, or crop contents?
  -remains analysis?

-If a study only says 'Observation', should I go look at the study to see if it was nest or behavioral observation?
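
On the first question above: one likely explanation (the thread does not confirm it) is that the pattern is being read as a regular expression, where parentheses mark a group and do not match literal "(" and ")". R's gsub() treats its pattern as a regular expression by default, and passing fixed = TRUE or escaping the parentheses gives literal matching. The sketch below shows the same behavior in Python as an illustration; the example value is hypothetical and not taken from the database.

```python
import re

# Hypothetical example value; not taken from the database.
value = "Smith (1990)"

# Unescaped, "(" and ")" form a regex group, so this pattern looks for the
# text "Smith 1990" (no parentheses). It is not found, so nothing changes.
unescaped = re.sub("Smith (1990)", "Smith 1990", value)

# re.escape() backslash-escapes the metacharacters so they match literally.
escaped = re.sub(re.escape("Smith (1990)"), "Smith 1990", value)

# Plain string replacement avoids regex interpretation altogether.
literal = value.replace("Smith (1990)", "Smith 1990")

print(unescaped)
print(escaped)
print(literal)
```

If the values being cleaned are always literal strings, the non-regex form is the simpler choice, since it does not require remembering which characters are special.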

jhpoelen commented 7 years ago

(Warning: outsider "for what it is worth" perspective.) In addition to using human-readable locales (e.g. Alaska), did you ever consider adding controlled place-name terms? If this is added, the interaction records can be more easily linked to other datasets (think: Wikipedia) and ... you don't have to re-invent your own gazetteer.

jhpoelen commented 7 years ago

Perhaps the geolocate project will provide some inspiration.

ahhurlbert commented 7 years ago

original issues all dealt with; pushing controlled geoname terms to a later date