Closed gregcaporaso closed 7 years ago
How's something like this? The different colors represent different environments: Red = Animal-Associated, Green = Sea Water, Blue = Fresh Water, Orange = Soil, Black = Other.
This is cool Dan.
Would it be possible to add something like a mouseover to see what projects/samples contribute to what bars in the bar chart?
Also, it's a bit of a problem that some of the bars are obscuring other bars. Would having the different environments side-by-side work better?
Mouseovers would be cool - I'd need to rework my script quite a bit though to track projects/samples.
The bars are stacked, so there's no need to worry about visual obstruction :)
I should also mention that although the combined height of each bar is plotted on a log scale, the color-by-color breakdown is simple percentage.
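A minimal sketch of that split, for anyone trying to read the figure: the total bar height is log-scaled, while each color takes a plain linear share of that height. The function name and the `log10(1 + total)` transform are assumptions for illustration; the thread does not say which transform the original script uses.

```python
import math


def stacked_segments(counts):
    """Return per-environment segment heights for one histogram bar.

    counts maps an environment name to the number of samples in this bin.
    ASSUMPTION: total bar height is log10(1 + total); the original script
    may use a different log transform or baseline.
    """
    total = sum(counts.values())
    if total == 0:
        return {env: 0.0 for env in counts}
    bar_height = math.log10(1 + total)
    # Each color gets a linear (percentage) share of the log-scaled height.
    return {env: bar_height * n / total for env, n in counts.items()}


segments = stacked_segments({"Soil": 75, "Sea Water": 24})
```

With these made-up counts, Soil fills about 76% of the bar because it holds 75 of the 99 samples. A segment's height is therefore not itself a log-scaled count, which is worth keeping in mind when comparing colors across bars of different heights.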
OK, that makes sense, thanks! What do others think? Is mouseover to show project/sample ids in Dan's plot worth the development effort on this part?
Also, Dan, could you modify so the figure titles are not abbreviated (e.g., 'tot org carb' becomes 'Total Organic Carbon')?
I've updated the figure to display full title descriptions as you suggested - you may need to click on the above image and hit refresh to see the changes. Do you happen to know what the units of measurement are for the values in EMP data, e.g. um vs nm?
I've also put together this graphic showing the geographical distribution of EMP samples:
Cool, this looks better. If @gilbertjack agrees that this is what he's looking for I think we can close the issue.
As for the geographic distribution, this overlaps with issue #2 so @dansmith01 and @douginator2000 should connect about this.
I am very happy with this.
Hey Dan,
Can you remove the gradient bars between the major divisions, i.e. just keep the gradient bars for 10, 100, 1000, 10,000, etc.?
Cheers
Jack
Sure thing! I've updated the figure. You can also view it at this link: http://img.dnasmith.com/histograms_static1.png
I left the tick marks though, to help cue the viewer that it's on a log scale. If you'd like them removed as well, just let me know.
I'm thinking it'd be cool to add a fourth color for human-associated samples. What do you think?
sure sounds good, and yes i am fine with the marks
Ok, updated. And some of them now have a log scale on the x-axis.
@dansmith01, this looks much better! Thanks!
Two last issues:
Once this is done I think we're ready to commit these files to the data repository and close this issue.
Here's a PDF of the histograms: http://img.dnasmith.com/histograms.pdf I'll put together a legend shortly.
@dansmith01 - what was the source of the data in this plot? I'm realizing that we still don't have the full mapping file together, so I'm just wondering if this is comprehensive.
@dansmith01 - just wanted to check on the source of this data. We'll need the plot generated from the latest mapping file (see issue #24) which we're hoping will be ready tonight. Sorry, I hope that's not too much extra effort!
@gregcaporaso - I'm using the metadata files downloaded from the EMP GESD.
OK, there is going to be a new "official" metadata file coming soon (issue
be the same format as the ones you're downloading.
Greg
Yep - I think that'll be simple enough.
I've got the legend for this figure ready to go: http://img.dnasmith.com/histograms-legend.pdf
Perfect, thanks!
@gregcaporaso - The official metadata file has 6,541 samples, whereas the one I compiled from GESD has 14,176 samples. For example, the GESD dataset named "sample_template_2012-06-14 13_54_16.486850" is missing. Do you know why so many samples were excluded from the official compilation, and would you still like me to regenerate the above histograms using the reduced dataset?
The official metadata file contains only the samples that were sequenced and then subsequently processed and loaded into the QIIME-DB. The EMP portal contains all samples including those samples which haven't been sequenced yet, which is why there is a large discrepancy in the numbers.
For this analysis I think we want to go with what has been sequenced already as that's what we're including for the other analyses. @gilbertjack and @rob-knight, do you agree?
Yes
yes, definitely
@dansmith01, let me know if you need anything else to get this done.
@gregcaporaso, could you take a quick look at the master mapping file (issue #24)? The last 104 lines don't seem to mesh with the columns in the lines above.
@dansmith01 are you sure your Dropbox is syncing? That was an issue with an older version, but I fixed that a couple of days ago. The version here:
https://github.com/EarthMicrobiomeProject/isme14/blob/master/master_mapping_file.txt.gz?raw=true
also has the fix.
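One way to check whether a given copy of the mapping file still has trailing lines that don't line up with the header is to count fields per row. This is only a sketch: the function name is hypothetical, and it assumes the file is tab-delimited with the header on the first line.

```python
import gzip


def find_ragged_rows(lines, delimiter="\t"):
    """Return (line_number, field_count) for rows whose width differs from the header.

    ASSUMPTION: the first line is the header and fields are tab-separated.
    """
    expected = None
    ragged = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split(delimiter)
        if expected is None:
            expected = len(fields)  # header defines the expected width
            continue
        if len(fields) != expected:
            ragged.append((lineno, len(fields)))
    return ragged


def check_gzipped_file(path):
    """Run the check on a gzipped text file such as master_mapping_file.txt.gz."""
    with gzip.open(path, "rt") as handle:
        return find_ragged_rows(handle)
```

Calling `check_gzipped_file("master_mapping_file.txt.gz")` on a local download should return an empty list if every row matches the header width.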
Not sure about dropbox, but that link works great! Thanks
It might be an issue with Dropbox. Greg and I ran into the same issue a couple of days ago, where my shared Dropbox folder wasn't updating.
Here are the PowerPoint Slides: https://www.dropbox.com/s/891ggfz00b1ddgh/Coverage%20of%20Environmental%20Parameters.pptx
Hello all,
I made a mock-up demonstration along similar lines to this thread for my first meeting with Dr. Jansson. You can see it here: https://www.dropbox.com/s/wd68fpdlakw83c2/AThomas_EMP_DemoAnalysis.pptx
Jack Gilbert liked the maps and wanted to use them. However, I am unsure about the quality of the data I used. I threw this together by copying the Lat/Long coordinates out of the map here: http://www.microbio.me/emp/
I'd like to make myself useful, so let me know what you think. I have some questions and concerns about the metadata I added to https://github.com/EarthMicrobiomeProject/isme14/issues/17#issuecomment-17695847
Thanks, Alex
@rob-knight suggested redoing graphs of environmental parameters for EMP 20k analysis.
From Jack:
I would also like to have a visual representation of the environmental gradients we have for each ecosystem. I.e., I can imagine a figure like the attached (sorry, I'm in my hotel room), where we represent on a scale of 0-100 the coverage of the gradients we have already surveyed. 0 would be the lowest possible (sensible) limit for that variable and 100 the highest. So for temp we would go from -56C to +120C, and for pH from 1 to 14, or something like that. I could have someone start creating this if everyone agrees it is a good idea.
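A rough sketch of the 0-100 rescaling described above, using the temperature and pH limits from Jack's note. The function names and the `LIMITS` table are hypothetical, and dropping values outside the limits is an assumption rather than anything agreed in the thread.

```python
# Sensible lower/upper limits per variable, taken from the note above.
LIMITS = {"temperature_C": (-56.0, 120.0), "pH": (1.0, 14.0)}


def to_percent(value, lower, upper):
    """Map a measurement onto 0-100, where 0 is the lower limit and 100 the upper."""
    if upper <= lower:
        raise ValueError("upper limit must exceed lower limit")
    return 100.0 * (value - lower) / (upper - lower)


def coverage(values, lower, upper):
    """Return (low_pct, high_pct) spanned by the observed values, or None if empty.

    ASSUMPTION: values outside [lower, upper] are treated as errors and dropped.
    """
    in_range = [v for v in values if lower <= v <= upper]
    if not in_range:
        return None
    return (to_percent(min(in_range), lower, upper),
            to_percent(max(in_range), lower, upper))


low, high = LIMITS["temperature_C"]
midpoint = to_percent(32.0, low, high)  # 32C sits halfway between -56C and +120C
```

Reporting only the minimum and maximum hides gaps inside the surveyed range; binning the rescaled values into, say, 10-point intervals would show which parts of each gradient are actually sampled.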