COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Salmon Quantmerge input files #179

Open uros-sipetic opened 6 years ago

uros-sipetic commented 6 years ago

Hey, running Salmon Quantmerge requires a list of directories that each contain a file named 'quant.sf'. However, I (and maybe other people as well) usually rename this file to add the sample ID as a prefix. Would it make sense to accept any file whose name ends with 'quant.sf', rather than requiring it to be named exactly 'quant.sf'?

EDIT: Also, this command currently works only on 'quant.sf' files. Another idea would be to extend it to 'quant.genes.sf' files as well, or rather '*quant.genes.sf'.

Thanks!
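
Until something like this is supported, one possible workaround is to recreate the layout quantmerge expects by linking each renamed file back to the name quant.sf inside a per-sample directory. The sketch below is not a salmon feature. It assumes the renamed files look like <sample>_quant.sf and sit together in one directory, and that quantmerge reads only quant.sf from each sample directory. The directory names renamed/ and merge_in/ are hypothetical, and the two empty demo files exist only so the snippet runs on its own.

```bash
#!/bin/sh
# Workaround sketch (not a salmon feature): expose renamed files such as
# sampleA_quant.sf under the name quantmerge looks for, <dir>/quant.sf.
# The directory names "renamed" and "merge_in" are hypothetical.
work=$(mktemp -d)
cd "$work" || exit 1

# Demo input only: two empty stand-ins for renamed quant files.
mkdir renamed
: > renamed/sampleA_quant.sf
: > renamed/sampleB_quant.sf

mkdir merge_in
for f in renamed/*_quant.sf; do
    [ -e "$f" ] || continue            # the glob matched nothing
    sample=$(basename "$f" _quant.sf)  # strip the suffix to get the sample ID
    mkdir -p "merge_in/$sample"
    # Absolute target so the link resolves wherever it is read from.
    ln -s "$work/$f" "merge_in/$sample/quant.sf"
done

ls merge_in
# Then, with salmon installed:
#   salmon quantmerge --quants merge_in/* -o merged.txt
```

Symlinks leave the renamed originals untouched, so the merge input can be deleted and rebuilt at any time.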

rob-p commented 6 years ago

Hi @uros-sipetic,

Yup; these are both good ideas. The quantmerge command was added mainly for convenience for a use case @tseemann was interested in. However, given that it's now part of the software, we should make its usage sufficiently general.

Thanks! Rob

tseemann commented 6 years ago

I've been on an RNA DGE hiatus since then - but I will get back to it!

AMChalkie commented 6 years ago

+1 For this functionality on gene level data

antpiron commented 5 years ago

+1 For this functionality on gene level data

Pull request for this functionality: https://github.com/COMBINE-lab/salmon/pull/344

Sherry520 commented 5 years ago

+1. I do need to merge all samples' UniqueCount files into a single TSV file.

sudeep71 commented 5 years ago

I have a couple of simple questions about the "list of dir" part. 1) Do I have to supply the complete path, or just the names of the directories in a text file?

I tried both and neither works! But when I pass a single folder, it works:

salmon quantmerge --quants barcode01_quant -o all_barcodes_merged.txt
Version Info: This is the most recent version of salmon.
[2019-10-16 14:15:06.726] [mergeLog] [info] samples: [ barcode01_quant ]
[2019-10-16 14:15:06.726] [mergeLog] [info] sample names : [ barcode01_quant ]
[2019-10-16 14:15:06.726] [mergeLog] [info] output column : TPM
[2019-10-16 14:15:06.726] [mergeLog] [info] output file : all_barcodes_merged.txt
[2019-10-16 14:15:06.726] [mergeLog] [info] Parsing barcode01_quant/quant.sf

When I try a list of all the folders:

salmon quantmerge --quants quant_dir_list.txt -o all_barcodes_merged.txt
Version Info: This is the most recent version of salmon.
[2019-10-16 14:15:54.698] [mergeLog] [info] samples: [ quant_dir_list.txt ]
[2019-10-16 14:15:54.698] [mergeLog] [info] sample names : [ quant_dir_list.txt ]
[2019-10-16 14:15:54.698] [mergeLog] [info] output column : TPM
[2019-10-16 14:15:54.698] [mergeLog] [info] output file : all_barcodes_merged.txt
[2019-10-16 14:15:54.698] [mergeLog] [critical] The sample directory quant_dir_list.txt either doesn't exist, or doesn't contain a quant.sf file

head quant_dir_list.txt
barcode01_quant
barcode02_quant
barcode03_quant
barcode04_quant
barcode05_quant
barcode06_quant
barcode07_quant
barcode08_quant

I have even tried with the complete path to each directory and it still fails. What am I doing wrong?

Thanks
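
The [critical] line above suggests that quantmerge treats every value after --quants as a sample directory, so the text file itself is looked up as if it were a directory. One way around this is to read the file in the shell and pass its lines as separate arguments. This is a minimal sketch, assuming one directory name per line in quant_dir_list.txt and assuming --quants accepts several directories in one call (as the multi-directory example later in this thread suggests). The lines that create the demo directories and the list file are only there so the snippet runs on its own.

```bash
#!/bin/sh
# Sketch: turn a file of directory names into command-line arguments.
work=$(mktemp -d)
cd "$work" || exit 1

# Demo input only: three directories and a list file naming them.
mkdir barcode01_quant barcode02_quant barcode03_quant
printf '%s\n' barcode01_quant barcode02_quant barcode03_quant > quant_dir_list.txt

# Collect one argument per non-empty line; "$@" keeps names with spaces intact.
set --
while IFS= read -r dir; do
    if [ -n "$dir" ]; then
        set -- "$@" "$dir"
    fi
done < quant_dir_list.txt

# Show the command that would be run.
echo "salmon quantmerge --quants $* -o all_barcodes_merged.txt"
# With salmon installed, run it for real:
#   salmon quantmerge --quants "$@" -o all_barcodes_merged.txt
```

For names without spaces, the shorter `--quants $(cat quant_dir_list.txt)` should have the same effect.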

tseemann commented 5 years ago

What does ls barcode01_quant say? There should be a quant.sf file.

sudeep71 commented 5 years ago

There is a quant.sf file in each of the folders, but I still get the error saying "doesn't contain a quant.sf file". This is what one folder contains:

aux_info cmd_info.json lib_format_counts.json libParams logs quant.sf

Yago-91 commented 4 years ago

Hi everybody!! I agree that the documentation of the quantmerge option is poor: the format of the "list" of directories is not described. Especially for sudeep71: I provided a comma-separated list enclosed in curly brackets:

salmon quantmerge --quants {salmon_quant,salmon_decomp_quant} --names {gzipped,unzipped} -o quantmerge.txt

This worked for me. I hope this helps.
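
A note on why the curly-bracket form works: in bash (and zsh), braces with comma-separated words and no spaces are expanded by the shell before the program starts, so salmon most likely just receives the directories as separate, space-separated arguments. The sketch below uses printf and bash arrays rather than salmon to make that visible. The variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Brace expansion is done by the shell (bash, zsh, ...), so the program
# only ever sees the expanded, space-separated words.
# printf prints each argument it receives on its own line.
printf '%s\n' --quants {salmon_quant,salmon_decomp_quant} --names {gzipped,unzipped}

# The braced and the plain spelling give the same argument list.
braced=( --quants {salmon_quant,salmon_decomp_quant} )
spaced=( --quants salmon_quant salmon_decomp_quant )
if [ "${braced[*]}" = "${spaced[*]}" ]; then
    echo "same arguments"
fi
```

Under this reading, writing the directories out with spaces between them should behave the same as the braced list. In a shell without brace expansion (plain POSIX sh, for example) the braces would be passed through literally.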

Hikoyu commented 4 years ago

The list format @Yago-91 showed above also works with salmon quant for passing multiple sequence files, e.g.:

salmon quant -l A -i reference -o result -1 {sample1_1.fastq.gz,sample2_1.fastq.gz} -2 {sample1_2.fastq.gz,sample2_2.fastq.gz}

wikiselev commented 1 year ago

Another option would be to use bash filename (glob) expansion. This is especially useful when working with single-cell data, where there are thousands of samples. Assuming that the salmon folder contains the output of salmon quant ..., and after removing any extra files from it (so that it contains only sample folders), this will work:

salmon quantmerge --quants salmon/* ...
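
The glob can also be narrowed so that stray files do not have to be removed first. This is a minimal sketch, assuming each sample directory sits directly under salmon/ and holds a quant.sf: match on the quant.sf files, then strip the file name to recover the directories. The demo layout (cell_0001, cell_0002, notes.txt) is made up so the snippet runs on its own.

```bash
#!/bin/sh
# Sketch: select only directories under salmon/ that contain a quant.sf.
work=$(mktemp -d)
cd "$work" || exit 1

# Demo layout only: two sample directories plus an unrelated file.
mkdir -p salmon/cell_0001 salmon/cell_0002
: > salmon/cell_0001/quant.sf
: > salmon/cell_0002/quant.sf
: > salmon/notes.txt

# Build the argument list from the quant.sf files that actually exist.
set --
for f in salmon/*/quant.sf; do
    if [ -f "$f" ]; then
        set -- "$@" "${f%/quant.sf}"   # drop the trailing /quant.sf
    fi
done

echo "found $# sample directories: $*"
# With salmon installed:
#   salmon quantmerge --quants "$@" -o merged.txt
```

Matching on the file rather than the directory skips anything under salmon/ that is not a finished quantification.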