Open cmungall opened 7 years ago
OK, this is odd:
SAMPLE_ACC: ANTARCTICAAQUATIC_SMPL_SITE1
SAMPLE_DESCRIPTION:
DESCRIPTION:
SITE_DESCRIPTION: saltwater manmade
REGION: Arizona
HABITAT_NAME: saline water
biome_label: anthropogenic terrestrial biome
biome_id: ENVO:01000219
environmental_material_label: saline water
environmental_material_id: ENVO:00002010
environmental_feature_label: anthropogenic abiotic mesoscopic feature
environmental_feature_id: ENVO:00003075
Which I think corresponds to https://www.imicrobe.us/sample/view/44
Totally confused... is this sample in Arizona, Australia, or Antarctica? Is it a manmade lake? Made by Australians in Antarctica...?
@cmungall thank you for pointing this out! The developer who handles this aspect of the site is out until Monday. We'll sit down to look at it then.
Hi Chris,
Thanks Chris. Ken will check into this on Monday. Appreciate you pointing this out!
Bonnie
Hi @bhurwitz33 - any luck tracking down what's going on here?
Ken,
Any ideas? Bonnie
First off, I can honestly say that I have no recollection of how the ontology terms were created in the imicrobe db. I must not have used this file as the only source, because a check of all the records in the file listed at the beginning shows that 451 samples out of the total 2813 have a discrepancy between what's in the db and what's in the file. I could easily make the db reflect just what is in the file, but I would like to get some confirmation that this is the right move. Maybe specifically Bonnie and Ramona could verify that I should do this?
The first example is "CAM_SMPL_002668" which is this record:
https://www.imicrobe.us/sample/view/1929
And in that instance, all the db ontology ids match exactly with those in the file:
1658: CAM_SMPL_002668 (1929): {
biome_id => "ENVO:01000030",
biome_label => "marine hydrothermal vent biome",
description => "",
environmental_feature_id => "ENVO:01000122",
environmental_feature_label => "marine hydrothermal vent",
environmental_material_id => "ENVO:00002006",
environmental_material_label => "water",
habitat_name => "hydrothermal vent",
region => "Eastern Pacific Ocean",
sample_acc => "CAM_SMPL_002668",
sample_description => "ALVINELLA - Alvinella Pompejana Epibionts",
site_description => "Hydrothermal Vent",
}
DB = ENVO:00002006, ENVO:01000030, ENVO:01000122
File = ENVO:00002006, ENVO:01000030, ENVO:01000122
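For anyone who wants to repeat this kind of check, here is a minimal sketch of a file-versus-db comparison. It is not the script that produced the numbers above: the `find_discrepancies` helper and the `db_records` mapping are hypothetical stand-ins for a real database query, and only the column names are taken from the CSV.

```python
import csv
import io

# Columns in the mapping file that hold ENVO term ids (names as they appear in the CSV).
ID_COLUMNS = ("biome_id", "environmental_material_id", "environmental_feature_id")


def envo_ids_from_row(row):
    """Collect the non-empty ENVO ids of one CSV row as a set."""
    values = ((row.get(col) or "").strip() for col in ID_COLUMNS)
    return {v for v in values if v}


def find_discrepancies(csv_file, db_records):
    """Return {sample_acc: (file_ids, db_ids)} for samples whose id sets differ.

    db_records is a hypothetical stand-in for a database query result:
    a dict mapping sample accession -> set of ENVO ids stored in the db.
    """
    mismatches = {}
    for row in csv.DictReader(csv_file):
        acc = row["SAMPLE_ACC"]
        file_ids = envo_ids_from_row(row)
        db_ids = db_records.get(acc, set())
        if file_ids != db_ids:
            mismatches[acc] = (file_ids, db_ids)
    return mismatches


# The CAM_SMPL_002668 record shown above, reduced to the id columns.
sample_csv = io.StringIO(
    "SAMPLE_ACC,biome_id,environmental_material_id,environmental_feature_id\n"
    "CAM_SMPL_002668,ENVO:01000030,ENVO:00002006,ENVO:01000122\n"
)
db = {"CAM_SMPL_002668": {"ENVO:00002006", "ENVO:01000030", "ENVO:01000122"}}
print(find_discrepancies(sample_csv, db))  # prints {} because the ids match
```

Comparing sets rather than ordered lists means the order in which the three ids are stored does not matter, which matches the "DB = ... / File = ..." comparison above.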
As for the ANTARCTICAAQUATIC_SMPL_SITE1 sample (https://www.imicrobe.us/sample/view/44), I don't know why the file says "Arizona" when it's clearly in the Antarctic. The fact that the "Region: Arizona" didn't make it into the imicrobe db (and hence I can't find it in the web display) is a weird bonus-bug of some sort? Like, it's a good thing I missed importing that? If I "grep Arizona CameraMetadata_ENVO_working_copy.csv" then I get 23 hits, one for "ALVINELLA_SMPL_20041130" and the rest for "ANTARCTICAAQUATIC_SMPL_SITE*." How very weird.
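A column-aware version of that grep might look like the sketch below. Note the difference: `grep` matches "Arizona" anywhere on the line, while this only checks the `REGION` column. The `samples_in_region` helper is hypothetical, and the two inline rows are just values quoted earlier in this thread, not the full file.

```python
import csv
import io


def samples_in_region(csv_file, region):
    """Return the SAMPLE_ACC of every row whose REGION column equals `region`."""
    return [
        row["SAMPLE_ACC"]
        for row in csv.DictReader(csv_file)
        if (row.get("REGION") or "").strip() == region
    ]


# Two rows using values quoted in this thread (reduced to two columns).
sample_csv = io.StringIO(
    "SAMPLE_ACC,REGION\n"
    "ANTARCTICAAQUATIC_SMPL_SITE1,Arizona\n"
    "CAM_SMPL_002668,Eastern Pacific Ocean\n"
)
print(samples_in_region(sample_csv, "Arizona"))  # ['ANTARCTICAAQUATIC_SMPL_SITE1']
```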
Here is a basic overview:
Ontology development for CAMERA metadata. As part of the iMicrobe project, we developed a new ontology called the Microbial Environments described using OWL (MEOWL) ontology. The first step toward ontologizing the CAMERA data was to clean up and organize existing data. To do this, we mapped all CAMERA metadata labels to the Minimum Information about any (x) Sequence (MIxS) vocabulary, which both standardized and reduced the number of terms. To go from a controlled vocabulary to an ontology, we categorized existing terms into a hierarchy based on classes such as environmental parameter, chemical parameter, location, and habitat. Where possible, classes were mapped to the existing BCO-DMO vocabulary to obtain textual definitions. The ME-OWL ontology is currently available at XXX and is used on the iMicrobe data site to streamline the metadata search interface (http://data.imicrobe.us/sample/search). Specifically, users can combine multiple search parameters (e.g., salinity greater/less than/between two values, Longhurst province including several regions, sample depth) to find samples, view them on a map, and download associated files. As such, datasets are discoverable and available for re-use.
Ramona – can you fill in where the github repo is for ME-OWL?
Bonnie
Hi all. The MEOWL repo is at https://github.com/hurwitzlab/meowl. I am looking into the discrepancies now. I have to give a talk on MEOWL next week, so I will be diving into it a bit this week.
I found the problem. On commit 2e672bcec7c89dd150f55b22c3afd14749bc181f, column A stayed the same while the rest of the file was sorted on column B. I'm fixing it now. Thank you for noticing this, @cmungall !
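To illustrate how this kind of sort produces mismatched rows, here is a small sketch with toy data (the sample names and values are made up, and both function names are hypothetical). It contrasts sorting everything except the first column with sorting whole rows.

```python
def sort_all_but_first_column(rows, key_col=1):
    """Reproduce the bug: column 0 stays in place while the remaining
    columns are reordered by sorting on key_col."""
    first = [r[0] for r in rows]
    # r[1:] drops column 0, so key_col shifts down by one inside the slice.
    rest = sorted((r[1:] for r in rows), key=lambda r: r[key_col - 1])
    return [[a, *b] for a, b in zip(first, rest)]


def sort_whole_rows(rows, key_col=1):
    """Correct behaviour: every row moves as a unit."""
    return sorted(rows, key=lambda r: r[key_col])


# Toy rows: [sample id, description, region]
rows = [
    ["S1", "zeta", "R1"],
    ["S2", "alpha", "R2"],
    ["S3", "mu", "R3"],
]

print(sort_all_but_first_column(rows))
# [['S1', 'alpha', 'R2'], ['S2', 'mu', 'R3'], ['S3', 'zeta', 'R1']]  <- ids no longer match their data
print(sort_whole_rows(rows))
# [['S2', 'alpha', 'R2'], ['S3', 'mu', 'R3'], ['S1', 'zeta', 'R1']]  <- ids travel with their rows
```

In the buggy output every sample id is paired with another sample's description and region, which is the same pattern as an Antarctic sample accession sitting next to "Arizona".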
Thanks Ramona! -Bonnie
Thanks @ramonawalls!
Have these changes propagated to the site? For example, https://www.imicrobe.us/#/samples/44 still says Australia.
I note that the new site doesn't have an ontology tab anymore, which is a shame.
It looks like the files are richer. For example, this row:
SAMPLE_ACC: ALVINELLA_SMPL_20041130
SAMPLE_DESCRIPTION: ALVINELLA - Alvinella Pompejana Epibionts
DESCRIPTION:
SITE_DESCRIPTION: Hydrothermal Vent
REGION: Eastern Pacific Ocean
HABITAT_NAME: hydrothermal vent
biome_label: marine hydrothermal vent biome
biome_id: ENVO:01000030
environmental_material_label: water
environmental_material_id: ENVO:00002006
environmental_feature_label: marine hydrothermal vent
environmental_feature_id: ENVO:01000122
corresponds to https://www.imicrobe.us/#/samples/44
which is missing most of the above, and says the biome is "Polar Biome (ENVO_01000339)". I think both the csv and the site are correct (it's both a polar and a marine hydrothermal vent biome), but the discrepancy is still puzzling.
I'm seeing inconsistencies between https://raw.githubusercontent.com/hurwitzlab/imicrobe-lib/master/docs/mapping_files/CameraMetadata_ENVO_working_copy.csv and the site. E.g.
this makes sense, however what is presumably the same record:
https://www.imicrobe.us/sample/view/3
has some odd annotations:
There are some annotations on the site that are not in the file, e.g.
https://www.imicrobe.us/sample/view/1
I think the file is correct but there is a bug in the site, not sure how to report this.