kbase / project_guides

This repo contains documents and guides that describe project principles, how-to docs, etc.
MIT License

Sramakri #45

Closed srividya22 closed 9 years ago

srividya22 commented 9 years ago

Added RNASeq Requirements document

mlhenderson commented 9 years ago

Clearly we need to spec out some data types with you. Obviously you need an update to the genome type, which is basically Sam's PR. We could get you set up for bringing in the RNA-Seq data first, and then once the Genome is settled you could work with that object. I would prefer not to optimize data storage for a particular language, but rather have a more general solution that works for both visualization and processing, so we just need to define some general types that store data in a way that is straightforward to visualize. Storing precomputed SVG might be an option; it depends on how much data we're looking at.

aparkin commented 9 years ago

Also, just want to make sure the pipeline is organism-agnostic. Will it work for fungi, bacteria, archaea?

srividya22 commented 9 years ago

The tools mentioned in this document are the most popular ones for eukaryotic RNASeq experiments. The first step is to align the RNASeq reads using TopHat (a splice junction mapper). Alternatively, for prokaryotes, we can offer the choice of mapping the reads with the Bowtie mapper. The rest of the pipeline can be the same for both prokaryotes and eukaryotes.
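
As a rough sketch of that branching (not taken from the requirements document; the function names, group labels, and step labels are all hypothetical), only the alignment step depends on the organism. Fungi are eukaryotes and go to the spliced aligner, while bacteria and archaea go to Bowtie:

```python
from typing import List

# Hypothetical group labels, for illustration only.
EUKARYOTES = {"fungi", "plant", "animal"}
PROKARYOTES = {"bacteria", "archaea"}


def choose_aligner(organism_group: str) -> str:
    """Pick the read aligner for an organism group.

    Eukaryotic transcripts are spliced, so they need a splice junction
    mapper (TopHat); prokaryotic reads can be mapped directly (Bowtie).
    """
    group = organism_group.strip().lower()
    if group in EUKARYOTES:
        return "tophat"
    if group in PROKARYOTES:
        return "bowtie"
    raise ValueError(f"unknown organism group: {organism_group!r}")


def plan_pipeline(organism_group: str) -> List[str]:
    """Return the ordered step labels; only the first step varies."""
    return [
        f"align:{choose_aligner(organism_group)}",
        "downstream:shared",  # placeholder for the steps common to both cases
    ]
```

Calling `plan_pipeline("archaea")` and `plan_pipeline("fungi")` would differ only in the alignment entry, which is the point of keeping the rest of the pipeline shared.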

aparkin commented 9 years ago

It would be great to map that out. It's a big win if we get something more flexible in there.

fperez commented 9 years ago

This one will be good to discuss at the SF meeting... It seems a bit too domain-specific to me to be a project-wide guide. Perhaps requirements documents like this, which are specific to certain domains or tools, should start going into their own repositories...

aparkin commented 9 years ago

Yes... true. There should be a gene expression analysis repository with the RNA-SEQ pipeline as a subpart of that (maybe), or maybe just an RNA-SEQ repository, and this would be part of the design documentation.

fperez commented 9 years ago

I'm inclined to close both this and #45. That would mean that both of these documents should become design docs in their respective technical repos, since this repo should really be about things that set fairly broad, project-wide policies and practices. It's obviously a fuzzy line, which is why it's worth discussing, but this one seems a bit too specific.

Let's close it here, and you can move the doc into the proper tools.