Open cmungall opened 5 years ago
IEDB uses GAZ to build a list of countries that we care about and organize them under a shallow hierarchy. The code is here: https://github.com/IEDB/GAZ
We're using GAZ in a few ontology projects as a standard vocabulary for 1st-order, 2nd-order, etc. government units: countries, provinces, states, territories, regions and municipalities. In our Genomic Epidemiology Ontology (GenEpiO) it provides pick-lists for reporting outbreak cases and describes patient nationalities, country of birth, travel patterns, and eventually food traceability. In FoodOn, we have reused a 'hasCountryOfOrigin' annotation pointing to country, but would look to GAZ to describe region of origin as well.
We developed a browser-based lookup function to navigate 'located in' hierarchies, but have discovered that the connection between 2nd/3rd order govt and municipalities is pretty spotty. Here's an example of a control that provides both 2nd and 3rd order branches separately. It comes with a lookup function so that a given application ontology doesn't have to list all of GAZ. Instead a user can select an app-provided GAZ entry, then press "lookup choices" to get further sub-class or 'located in' related items.
I should also mention that GAZ is being proposed (with geonames as an alternative) for locations related to Biosample collection in an ISO TC 34/SC 9 working group 25 repository for draft ontology-driven specifications. These are referenced in the draft working document ISO/TC 34/SC 9 N 000, "Microbiology of the Food Chain — Whole Genome Sequencing, Typing and Genomic Characterization of Foodborne Bacteria", visible at http://genepio.org/geem/form.html#GENEPIO:0002083
Pressing "lookup choices" fetches subordinate choices from OLS.
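For anyone wanting to reproduce this kind of lookup outside the browser, here is a minimal Python sketch of fetching the children of a GAZ term from OLS. It is not the code behind the form above. The base URL, the `ontologies/gaz/terms/.../children` path and the `_embedded.terms` response shape are assumptions based on the public OLS REST API, and the function names are made up for illustration. Whether 'located in' links come back from `children` or only from the `hierarchicalChildren` endpoint should be checked against the live service.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed OLS base URL; adjust if the service has moved.
OLS_BASE = "https://www.ebi.ac.uk/ols4/api"


def gaz_iri(curie):
    """Turn a CURIE such as 'GAZ:00008076' into an OBO PURL."""
    prefix, local_id = curie.split(":", 1)
    return f"http://purl.obolibrary.org/obo/{prefix}_{local_id}"


def children_url(curie, relation="children", ontology="gaz"):
    """Build the OLS URL listing terms directly below the given term.

    OLS expects the term IRI to be URL-encoded twice in the path.
    'relation' can be 'children' or 'hierarchicalChildren'.
    """
    encoded = quote(quote(gaz_iri(curie), safe=""), safe="")
    return f"{OLS_BASE}/ontologies/{ontology}/terms/{encoded}/{relation}"


def parse_children(payload):
    """Extract (obo_id, label) pairs from an OLS term-list response."""
    terms = payload.get("_embedded", {}).get("terms", [])
    return [(term.get("obo_id"), term.get("label")) for term in terms]


def lookup_choices(curie, relation="children"):
    """Fetch the first page of subordinate choices for a term (needs network)."""
    with urlopen(children_url(curie, relation)) as response:
        return parse_children(json.load(response))
```

Calling `lookup_choices("GAZ:00008076")` would return the first page of child terms, if OLS still serves GAZ at that address. Large branches are paginated, so a real client would also need to follow the paging links in the response.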
The final selection might not be contained in the local app ontology:
Having GAZ be refreshed in tandem with wikidata or geonames edits would be wonderful.
A quick peek at municipality list:
Action Item: Chris Mungall will send email to the OBO Foundry.
Action Item: Lynn will examine BioSample usage of GAZ.
@lschriml any update on details on whether/how GAZ is used in BioSample (EBI or NCBI)?
Yes, GAZ has been used in BioSample since it was created. GAZ is a core term in the MIxS standard (country field, geographic location). It is also used in QIIME, Qiita, MG-RAST, etc., across GSC-associated projects.
Working with @turbomam and @wdduncan, we now have a normalized, tidied version of the INSDC sample database, so we can do some analysis of usage of GAZ with mixs:geo_loc_name:
77159 distinct geo_loc_name values
GAZ IDs appear in only 8 distinct geo_loc_name values in the whole database (the first column is the number of records for each value):
$ grep -E 'GAZ:[0-9]' target/distinct-geo_loc_name.tsv
100 GAZ:00116363
53 GAZ:00116380
42 Azerbaijan:Caspian Sea (GAZ:00008076)
7 GAZ:00322747
4 GAZ:00315575
4 GAZ:00313293
4 GAZ:00322749
4 GAZ:00322744
The vast majority of usages are simple strings, so it's not clear that GAZ is being meaningfully used or that it adds value beyond other gazetteers.
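The same check can be sketched in Python for anyone who wants to separate values carrying a real GAZ ID from values that merely mention "GAZ". This is an illustration only. It assumes each line is a record count followed by whitespace and the geo_loc_name value, as in the listings in this comment; the actual layout of `distinct-geo_loc_name.tsv` may differ, and the function names are hypothetical.

```python
import re

# A GAZ CURIE: the prefix followed by at least one digit.
GAZ_ID = re.compile(r"GAZ:\d+")


def parse_count_line(line):
    """Split '42 Azerbaijan:Caspian Sea (...)' into (42, 'Azerbaijan:Caspian Sea (...)')."""
    count, value = line.strip().split(None, 1)
    return int(count), value


def gaz_usage(lines):
    """Return (count, value, ids) for every value that contains a GAZ ID."""
    hits = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        count, value = parse_count_line(line)
        ids = GAZ_ID.findall(value)
        if ids:
            hits.append((count, value, ids))
    return hits


# A few lines copied from the listings in this comment.
sample = [
    "100 GAZ:00116363",
    "42 Azerbaijan:Caspian Sea (GAZ:00008076)",
    "32928 USA: GAZ",
    "513500 USA",
]
for count, value, ids in gaz_usage(sample):
    print(count, value, ids)
```

On this sample, "USA: GAZ" and "USA" are skipped because neither contains an ID, which matches what the stricter grep pattern does.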
Top uses:
513500 USA
214152 missing
163139 not applicable
123076 United Kingdom
119337 China
74845 United Kingdom: United Kingdom
53333 Germany
47282 not collected
45201 Canada
44491 Australia
38338 Denmark
34607 NA
32928 USA: GAZ
31881 Netherlands
31818 Spain
31669 France
29070 Japan
28866 Sweden
27420 Finland
25698 Italy
23134 USA: California
22048 Switzerland
18143 USA:New York
18106 Brazil
16387 USA:CA:San Diego
16376 China: Beijing
15780 India
15663 USA:CA
15063 Pacific Ocean
14737 South Africa
14626 China:Beijing
12208 USA:Boston
12122 Norway
12008 Israel
11742 Chile
11725 South Korea
11333 Denmark: Copenhagen
11124 USA: Michigan
11086 Kenya
10982 Mexico
10789 USA: Oregon
10560 Malawi
10506 Singapore
10337 China:Hangzhou
10336 USA: Massachusetts
9102 USA: New York
9051 USA:MD
8808 China:Shanghai
8684 USA: Minnesota
8663 Austria
8602 Bangladesh
8532 USA:TX
8413 Russia
8396 Ireland
8396 Atlantic Ocean
8188 USA:NY
8097 Uganda
7971 New Zealand
7832 USA:GA
7765 China: Shanghai
7657 USA: Texas
7410 Belgium
7334 USA:NC
6996 Czech Republic
6934 USA:MN
6678 Not applicable
6638 Hong Kong
6538 United Kingdom: Oxford
6506 Missing
6503 Thailand
6336 USA: North Carolina
6121 China:Nanjing
6075 USA: Florida
6018 USA:PA
5961 Netherlands: western part
5807 N/A
5799 USA:CO:Boulder
5740 USA:WA
5670 USA:California
5550 Poland
5458 not provided
5374 Tanzania
5295 Australia: NSW
5271 Baltic Sea
5261 Portugal
5179 Canada: British Columbia
5160 USA:Michigan
5142 USA:IA
5055 United Kingdom: London
5005 Peru
4975 Taiwan
4906 Canada: Quebec
4836 Australia: Brisbane
4797 Canada: Saskatoon
4528 USA:South Fork Eel River, CA
4500 USA: Boston
4481 China: Hangzhou
4437 Canada: Ontario
4415 USA: Ohio
4397 USA:WI
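Many of the values above differ only in spacing or capitalisation ("USA: California" vs "USA:California", "missing" vs "Missing"). Below is a rough Python sketch of how such variants could be merged before counting. The set of "missing" markers and the function names are my own assumptions for illustration, not part of the INSDC pipeline.

```python
from collections import Counter

# Assumed set of placeholder strings meaning "no location given".
MISSING = {"missing", "not applicable", "not collected", "not provided", "na", "n/a"}


def normalize(value):
    """Collapse whitespace around ':' separators and unify missing-value markers."""
    parts = [part.strip() for part in value.split(":")]
    cleaned = ": ".join(part for part in parts if part)
    if cleaned.lower() in MISSING:
        return "missing"
    return cleaned


def merge_counts(rows):
    """Sum record counts over normalized values; rows are (count, value) pairs."""
    totals = Counter()
    for count, value in rows:
        totals[normalize(value)] += count
    return totals


# Rows copied from the list above.
rows = [
    (23134, "USA: California"),
    (5670, "USA:California"),
    (15663, "USA:CA"),
    (214152, "missing"),
    (6506, "Missing"),
    (34607, "NA"),
]
totals = merge_counts(rows)
print(totals["USA: California"])  # 28804
print(totals["missing"])          # 255265
```

String cleanup like this cannot tell that "USA:CA" and "USA: California" are the same place. That step needs a gazetteer lookup, which is where GAZ or an alternative would come in.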
Anyone using GAZ, please add comments to this ticket!
Also include whether you use the obo or owl file, if you download or use an API, etc. Note any specific things you would like to see at a general level (or link to a ticket)