hoelzer-lab / rnaflow

A simple RNA-Seq differential gene expression pipeline using Nextflow
GNU General Public License v3.0

Issue with comparisons.csv file #234

Closed mglgc closed 6 months ago

mglgc commented 7 months ago

Hi, first of all, thank you for making this great pipeline freely available to the whole community. Running with -profile local,docker, I'm trying to use the --deg option with a comparisons.csv file that contains only lines of comma-separated comparison pairs, labelled exactly as specified in the Condition column of my input.csv file. As soon as the run starts, the following error message appears and the run stops: "The comparisons from comparisons.csv do not match the sample conditions in input.csv." Of course, I checked that the condition labels match between the comparisons.csv and input.csv files. The pipeline docs mention that "The first line is a required header", but they say nothing about what that header must contain. Could you please provide a real example of the content of a comparisons.csv file?

hoelzer commented 7 months ago

Hey, thanks for your interest in the pipeline!

Your comparisons.csv file should look like this:

Condition1,Condition2
mock,treated

where the first line Condition1,Condition2 is a required header that should not be changed.

Your input.csv might look like this (for paired-end reads):

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1_1.fastq,/path/to/reads/mock1_2.fastq,mock,A,0
mock_rep2,/path/to/reads/mock2_1.fastq,/path/to/reads/mock2_2.fastq,mock,B,0
mock_rep3,/path/to/reads/mock3_1.fastq,/path/to/reads/mock3_2.fastq,mock,C,0
treated_rep1,/path/to/reads/treat1_1.fastq,/path/to/reads/treat1_2.fastq,treated,A,0
treated_rep2,/path/to/reads/treat2_1.fastq,/path/to/reads/treat2_2.fastq,treated,B,0
treated_rep3,/path/to/reads/treat3_1.fastq,/path/to/reads/treat3_2.fastq,treated,C,0

Again, the first row is a required header that you should keep unchanged. Column four holds the Condition labels, which must match those used in comparisons.csv.
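To check the two files before starting a run, a small stand-alone script can compare the labels. The following is only a sketch in Python and not the pipeline's own validation code: the function names `read_conditions` and `unmatched_comparisons` are made up for illustration, the read paths are shortened placeholders, and the check assumes that labels must match exactly, including letter case.

```python
import csv
import io

# Illustrative file contents modelled on the examples above
# (read paths are shortened placeholders, not real files).
INPUT_CSV = """Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,mock1_1.fastq,mock1_2.fastq,mock,A,0
treated_rep1,treat1_1.fastq,treat1_2.fastq,treated,A,0
"""

COMPARISONS_CSV = """Condition1,Condition2
mock,treated
"""


def read_conditions(input_csv_text):
    """Collect the labels found in the Condition column of input.csv."""
    reader = csv.DictReader(io.StringIO(input_csv_text))
    return {row["Condition"].strip() for row in reader}


def unmatched_comparisons(input_csv_text, comparisons_csv_text):
    """Return (column, label) pairs from comparisons.csv that have no
    exact counterpart in the Condition column of input.csv.

    A header that is not exactly Condition1,Condition2 makes every
    lookup fail, so each row is then reported with an empty label.
    """
    conditions = read_conditions(input_csv_text)
    reader = csv.DictReader(io.StringIO(comparisons_csv_text))
    problems = []
    for row in reader:
        for column in ("Condition1", "Condition2"):
            label = (row.get(column) or "").strip()
            if label not in conditions:
                problems.append((column, label))
    return problems


if __name__ == "__main__":
    print(unmatched_comparisons(INPUT_CSV, COMPARISONS_CSV))
```

An empty list means every label in comparisons.csv was found in input.csv. To check real files, read them with `open(path).read()` and pass the text to `unmatched_comparisons`.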

See also the README (https://github.com/hoelzer-lab/rnaflow).

If it still does not work, can you share your input files or their content so I can have a look?

mglgc commented 6 months ago

Thanks, Martin, for your prompt reply. I'm guessing that the issue comes from a mismatch in the capitalization of the first letter between the input.csv and comparisons.csv headers, i.e. "Condition" and "condition", respectively. Things like that are the reason why a real example of the file content is the better and more practical way to document file formats, as you have just done in your reply. Thank you. I'll try it and let you know the result.
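A case-only difference in a header is easy to miss by eye. As a rough sketch (plain Python, not part of rnaflow; `header_problems` is a hypothetical helper, and the expected header is the one quoted in the reply above), the first line of a file can be compared against the required header like this:

```python
# Required header for comparisons.csv, as given in the reply above.
COMPARISONS_HEADER = ["Condition1", "Condition2"]


def header_problems(first_line, expected):
    """Compare the first line of a CSV file with the expected header.

    Returns a list of human-readable problems; an empty list means the
    header matches exactly, including letter case.
    """
    found = [name.strip() for name in first_line.strip().split(",")]
    if found == expected:
        return []
    problems = []
    if len(found) != len(expected):
        problems.append(
            f"expected {len(expected)} columns, found {len(found)}"
        )
    for got, want in zip(found, expected):
        if got != want:
            # Point out the case-only mismatches explicitly.
            same_letters = got.lower() == want.lower()
            hint = " (differs only in letter case)" if same_letters else ""
            problems.append(f"'{got}' should be '{want}'{hint}")
    return problems


if __name__ == "__main__":
    for problem in header_problems("condition1,condition2", COMPARISONS_HEADER):
        print(problem)
```

The same function works for input.csv if the expected list is replaced with that file's header columns.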

mglgc commented 6 months ago

It works. Once again, thank you so much Martin.

hoelzer commented 6 months ago

Great! Glad it worked.