kingsfordgroup / sailfish

Rapid Mapping-based Isoform Quantification from RNA-Seq Reads
http://www.cs.cmu.edu/~ckingsf/software/sailfish
GNU General Public License v3.0

Not getting multiple columns with multiple fastq files #96

Closed upendrak closed 8 years ago

upendrak commented 8 years ago

Hi, I am using the latest version of sailfish (0.9.2). When I run sailfish quant with multiple single-end files, I expected to get one NumReads column per fastq file, but I get only a single NumReads column. Here is what I am running:

/sailfish/bin/sailfish index -t test.fa -o index

/sailfish/bin/sailfish quant -i index -l "IU" -r test_1.fq test_2.fq -o test_out

head test_out/quant.sf

Name       Length  EffectiveLength  TPM  NumReads
Bra000001  1884    1683.61          0    0
Bra000002  1446    1245.61          0    0
Bra000003  1458    1257.61          0    0
Bra000004  898     697.611          0    0
Bra000005  1625    1424.61          0    0
Bra000006  483     282.671          0    0
Bra000007  943     742.611          0    0
Bra000008  1407    1206.61          0    0
Bra000009  1081    880.611          0    0

Am I doing something wrong here?

rob-p commented 8 years ago

Hi,

When you pass multiple files to sailfish, it interprets them as multiple sets of reads from the same sample (essentially concatenating them). If you want to treat subsets of reads as separate samples (i.e., you want counts and TPMs for each), you must run sailfish separately for each one.

Two small notes. "IU" is for paired-end libraries. If you want to treat these as single-end, you can just use "U". Also, 0.10.1 is the newest release, in case you want to upgrade (available under releases above).
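For illustration only, running sailfish once per sample could be sketched roughly as below. This is not part of sailfish: the function names `build_quant_commands` and `run_all`, the `dry_run` switch, and the rule for naming output directories after the fastq file are all assumptions made for this example. It reuses only the flags shown earlier in this thread (`-i`, `-l`, `-r`, `-o`) and assumes single-end reads.

```python
import shlex
import subprocess
from pathlib import Path


def build_quant_commands(index_dir, fastq_files, out_root,
                         sailfish="sailfish", lib_type="U"):
    """Return one `sailfish quant` argument list per single-end fastq file.

    Hypothetical helper: each file is treated as its own sample and gets
    its own output directory under `out_root`.
    """
    commands = []
    for fastq in fastq_files:
        # Name the output directory after the file, e.g. test_1.fq -> test_1
        sample = Path(fastq).name.split(".")[0]
        out_dir = Path(out_root) / sample
        commands.append([
            sailfish, "quant",
            "-i", str(index_dir),
            "-l", lib_type,
            "-r", str(fastq),
            "-o", str(out_dir),
        ])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing sailfish run
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmds = build_quant_commands("index", ["test_1.fq", "test_2.fq"], "test_out")
    run_all(cmds, dry_run=True)
```

With `dry_run=True` the script only prints the commands, so the per-sample layout can be checked before anything is executed.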

Best, Rob

upendrak commented 8 years ago

Thanks, Rob, for answering my question. Running sailfish separately on multiple samples is fine, but I was wondering if there are any plans to combine them all into a single output in future releases. For now I will write a wrapper script.

I was initially testing Sailfish with paired-end reads, so I forgot to change the lib-type from "IU" to "U" for SE reads. Thanks for pointing that out.

I will test 0.10.1 pretty soon.
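As a rough idea of what such a wrapper might do after the per-sample runs, here is a minimal sketch that merges the NumReads column from several quant.sf files into one table. The function names (`read_quant`, `merge_quants`, `write_table`) and the `NumReads_<sample>` column naming are made up for this example. It assumes quant.sf is tab-separated with the header shown above (Name, Length, EffectiveLength, TPM, NumReads) and that every sample was quantified against the same index.

```python
import csv
from pathlib import Path


def read_quant(path, column="NumReads"):
    """Map transcript name -> value of `column` for one quant.sf file."""
    values = {}
    with open(path, newline="") as handle:
        # Assumption: tab-separated file whose first line is the header
        for row in csv.DictReader(handle, delimiter="\t"):
            values[row["Name"]] = row[column]
    return values


def merge_quants(sample_dirs, column="NumReads"):
    """Return (header, rows) with one `column` per sample directory."""
    samples = [Path(d).name for d in sample_dirs]
    per_sample = [read_quant(Path(d) / "quant.sf", column) for d in sample_dirs]
    # Keep the transcript order of the first sample; fill gaps with "NA"
    names = list(per_sample[0])
    header = ["Name"] + [f"{column}_{s}" for s in samples]
    rows = [[name] + [q.get(name, "NA") for q in per_sample] for name in names]
    return header, rows


def write_table(header, rows, out_path):
    """Write the merged table as a tab-separated file."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
```

Passing `column="TPM"` to `merge_quants` would build the equivalent TPM table. Values are kept as strings so the numbers are written back exactly as sailfish reported them.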

Thanks, Upendra

rob-p commented 8 years ago

Hi Upendra,

This is not currently a feature we've been considering, since the added utility seems marginal. Specifically, many people have individual samples that consist of multiple fastq files, and so we'd have to invent some additional way of specifying multiple libraries (potentially with multiple output directories?). The only real benefit that I can see to this approach (as opposed to just providing a simple wrapper script as part of the sailfish distribution) is that if we processed multiple samples in one run, we could avoid having to load the index into memory more than once, which could provide a little speed boost. However, this might also be achieved by allowing memory mapping of the index. If there's substantial desire for this feature, we may consider adding it.

--Rob

upendrak commented 8 years ago

Hi Rob,

I already have a working wrapper script that accepts multiple fastq files (both single and paired) and generates a corresponding output directory for each. I haven't benchmarked it yet against running on individual files, but I am very happy with how it works.

Also, I am a Science Informatician at CyVerse, and we have a couple of people interested in using Sailfish, so I have installed this tool in our GUI infrastructure (Discovery Environment), and here is the wiki for it. I have already tested it and made sure that it works as expected. So if you want to post this in your Google group for Sailfish and let other people know about this resource, that would be awesome.

Let me know if you have any questions.

Thanks, Upendra

rob-p commented 8 years ago

Hi Upendra,

Are you able to share the underlying script itself? If so, it may be of broad interest to the community of people using sailfish. In that case, I'd be happy to incorporate a stand-alone version into the repository and add some usage instructions to the documentation. Let me know if you can share it.

Best, Rob

upendrak commented 8 years ago

Hi Rob,

Here is the github repo that contains the Dockerfile along with the wrapper script for Sailfish-0.9.2. Please also let the users know about the Sailfish_align_quan-0.9.2 app (that does both index and quant in the same app). Some screen shots are attached for better understanding...

Let me know if you have any questions.

Thanks, Upendra

Upendra Kumar Devisetty, Ph.D. Science Analyst, CyVerse Bio5 Institute, University of Arizona, Tucson

Phone: (530)-601-3850 Email: upendra@cyverse.org SkypeID: upendra_35

upendrak commented 8 years ago

Sorry, I forgot to include the link. Here it is: https://github.com/iPlantCollaborativeOpenSource/docker-builds/tree/master/sailfish/0.9.2
