Closed: ctb closed this issue 8 years ago
@biobenkj says:
In terms of what I would like to teach, I was thinking of doing everything on Amazon EC2 (unless Davis has free HPC access?) and following this sort of format:
- Basic command line operations (traversing file hierarchies, changing file names, removing things, etc.)
- Getting onto an EC2 machine
- Quality control of short reads
- Reference-based bacterial RNA-seq analysis (shameless plugging of my workflow ;) and others), with an aside on paying attention to potential batch effects in data
- Variant calling
- Command line BLAST?
- Some bioinformatics one-liners from Stephen Turner: http://www.gettinggeneticsdone.com/2013/10/useful-linux-oneliners-for-bioinformatics.html
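As a rough illustration of what the "quality control of short reads" item could cover, here is a minimal Python sketch that counts reads and flags those with a low mean base quality. It is not taken from any workflow mentioned in this thread. The function names, the `min_mean_q=20` cutoff, and the two example reads are all made up for illustration, and the code assumes plain four-line FASTQ records with Phred+33 quality encoding.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream.

    Assumes no wrapped sequence lines; stops at end of input or first blank header.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def mean_quality(qual, offset=33):
    """Mean Phred score of one read (Phred+33 encoding is an assumption)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def summarize(handle, min_mean_q=20):
    """Count reads, bases, and reads whose mean quality meets the cutoff."""
    total = passed = bases = 0
    for _name, seq, qual in read_fastq(handle):
        total += 1
        bases += len(seq)
        if mean_quality(qual) >= min_mean_q:
            passed += 1
    return {"reads": total, "bases": bases, "passed": passed}


# Two made-up reads: one high quality ('I' = Q40), one low quality ('#' = Q2).
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGTAC\n+\n######\n"
stats = summarize(io.StringIO(EXAMPLE))
print(stats)
```

In a workshop setting a dedicated QC tool would do this job; the sketch is only meant to show what a per-read quality summary measures.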
Data Carpentry has a set of genomics lessons, still a bit in development, that would be relevant: https://github.com/datacarpentry/?utf8=✓&query=genomics
Maybe the shell-genomics and cloud-computing-genomics lessons in particular.
It seems like the focus here is on working with microbial genomes rather than community data? Microbial genomics is still a little broad, though. It might be worth deciding what we want people to know by the end of the workshop.
Maybe the theme could be 'so you just sequenced a microbial genome'? Then you could do:
I think we could do either:
or:
day 1: quality control of reads & genome assembly & annotation w/prokka. day 2: RNAseq analysis & differential expression followthrough
Preferences? I feel like the latter is more useful in this day and age but would appreciate comments and alt thoughts :)
I can put together QC, assembly, variant calling, and annotation, but would appreciate help with RNAseq and diff expr follow through. Do you have a data set or three already handy, @biobenkj?
Absolutely I have a few data sets we can use. I prefer the latter and can aid with the RNAseq and DGE follow through.
Great! I'll put together a page.
@biobenkj @tracykteal could you do a quick once-over? http://dib-training.readthedocs.org/en/pub/2015-09-24-microbes.html - note, can edit at https://github.com/dib-lab/dib-training/blob/pub/2015-09-24-microbes.rst
:) The page looks great. Made a couple typo corrections. I will make the rst files for the online tutorial and look for comments.
In terms of RNAseq analysis and DGE I was thinking something like:
Day 2:
- Setting up your project and getting your data
- QC and trimming
- Picking a workflow (or workflows)
- Considerations for reference-based and non-reference-based RNAseq analysis
- Using the assembly from the previous day?
- Generating a GTF annotation file
- Improving the reference with RNAseq
- Considerations for confounding variables (e.g. batch effects) and how to look for them (more tools!)
- Useful visualization for results (scatter plots, degust! - http://vicbioinformatics.com/degust/, etc.)
Thoughts?
OK - are you familiar with reST/Sphinx? We can use Markdown too if you prefer.
Here's how I've been doing things:
2015-may-nonmodel.readthedocs.org https://github.com/ngs-docs/2015-may-nonmodel
and I can give you repo access to ngs-docs, or whatever.
Note tutorial on ReadTheDocs: https://github.com/ngs-docs/angus/blob/2015/week3/CTB_github_editing.rst
...but I'm happy to set all of that up, just produce the docs :) :).
+1 sounds good.
--titus
Thanks, page looks good, and I like the focus on genome assembly and annotation with transcriptomics after that.
It seems like things are pretty well mapped out. Are there any components that would be helpful to have me teach, or @biobenkj do you have things mapped out already? What's the plan for the differential expression analysis component? DESeq for some R, or too much?
@tracykteal I don't have things explicitly mapped out yet, but I really liked the comparison between common DGE methods (DESeq2, edgeR, and limma/voom) from Meeta (https://github.com/ngs-docs/msu_ngs2015/blob/master/hands-on.Rmd) used in NGS '15 alumni week. In my own work I primarily use edgeR over DESeq2, as I find that getting things into the ExpressionSet object can be a pain, correcting for confounding variables can be more straightforward, and I prefer the underlying assumptions it uses for DGE.
For the DGE analysis section I was thinking of explaining some of the fundamental considerations for differential expression:
- Replicate number
- Negative binomial distribution
- Confounding variables (e.g. batch effects)
- Whether you can even do DGE (e.g. sample grouping with MDS plot)
- ...other things?
Anything in particular you would like to teach? What workflow do you use when analyzing bacterial RNAseq data? Are there visualization tools that you really like? Incorporating some of the Data Carpentry R and even command line lessons might be useful, though the workshop is 9 a.m. to 3 p.m. with a lunch break.
Thoughts?
+1 for your schedule, Ben. I don't think there's time to do more than mention R as something people might like to learn; we will have plenty of workshops for people who want to learn more.
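To make the "replicate number" and "negative binomial distribution" items in the list of DGE considerations above a bit more concrete, here is a small Python sketch of the mean-variance relationship usually assumed for count data, var = mu + alpha * mu^2, with a naive method-of-moments estimate of the dispersion alpha for one gene. The function name and the replicate counts are invented for illustration. edgeR and DESeq2 do not estimate dispersion this way; they share information across genes, so treat this purely as a teaching aid.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Naive per-gene dispersion estimate under var = mu + alpha * mu**2.

    Needs at least two replicates, which is one reason replicate number matters.
    Negative estimates (variance below the Poisson level) are clipped to 0.
    """
    if len(counts) < 2:
        raise ValueError("need at least two replicates to estimate variance")
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance, n - 1 denominator
    return max(0.0, (var - mu) / mu ** 2)


# Made-up replicate counts for two hypothetical genes with the same mean.
stable_gene = [100, 100, 100, 100]  # no spread between replicates
noisy_gene = [50, 100, 150]         # large spread between replicates

alpha_stable = moment_dispersion(stable_gene)
alpha_noisy = moment_dispersion(noisy_gene)
print(alpha_stable, alpha_noisy)
```

With only two or three replicates this per-gene estimate is very unstable, which is the practical argument for both more replicates and the shrinkage approaches used by the real tools.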
@ctb I am reasonably familiar with reST/Sphinx and will have a look at how you've been doing things for the May 2015 non-model-organism workshop.
I will follow(ish) the tutorial to generate the docs.
Any issue with doing all the RNA-seq analysis in a Jupyter notebook on EC2 for the workshop?
Our previous experience has been that mixing shell and Python confuses everyone.
(so, short answer, I don't think it'll work well. We've had better luck running shell commands in the shell, and graphing/plotting in ipynb.)
Reply to this email directly or view it on GitHub:
https://github.com/dib-lab/dib-training/issues/5#issuecomment-141705691
C. Titus Brown, ctbrown@ucdavis.edu
Alright. Sounds good.
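One way to picture the split suggested above (shell commands in the shell, graphing/plotting in the notebook) is that the shell steps end by writing a plain count table, and the notebook starts by reading it. The sketch below shows only the notebook side, under loud assumptions: the tab-separated layout, the gene and sample names, and the helper functions are hypothetical, and an in-memory string stands in for whatever file the shell steps would actually produce.

```python
import csv
import io
import math


def load_counts(handle):
    """Read a tab-separated table: header row of sample names, then gene rows."""
    reader = csv.reader(handle, delimiter="\t")
    samples = next(reader)[1:]
    counts = {row[0]: [int(x) for x in row[1:]] for row in reader if row}
    return samples, counts


def library_sizes(counts, n_samples):
    """Total counts per sample (column sums)."""
    return [sum(row[i] for row in counts.values()) for i in range(n_samples)]


def log2_cpm(counts, sizes):
    """log2(counts per million + 1) per gene and sample; assumes non-zero sizes."""
    return {
        gene: [math.log2(c * 1e6 / size + 1) for c, size in zip(row, sizes)]
        for gene, row in counts.items()
    }


# Made-up table standing in for the output of the shell-side counting step.
TABLE = "gene\tctrl\ttreated\ngeneA\t10\t20\ngeneB\t90\t80\n"

samples, counts = load_counts(io.StringIO(TABLE))
sizes = library_sizes(counts, len(samples))
values = log2_cpm(counts, sizes)
for gene, row in values.items():
    print(gene, [round(v, 2) for v in row])
```

The resulting per-sample values are the kind of thing that would then be passed to a plotting call in the notebook, keeping the shell and Python steps in separate places.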
Coordinate with @tracykteal and @biobenkj.