maxplanck-ie / snakepipes

Customizable workflows based on snakemake and python for the analysis of NGS data
http://snakepipes.readthedocs.io

[Request] looping inputs for multiple ChIP-seq comparisons #1013

Closed sunta3iouxos closed 4 weeks ago

sunta3iouxos commented 5 months ago

If my sampleSheet has more than 2 conditions, CSAW fails. Is it possible to split the samples by condition so that CSAW does a one-to-one comparison? Example sampleSheet:

name    condition
A006200387_220455_S14   ATF4
A006200387_220458_S15   H3K4me3
A006200387_220462_S17   ATF4
A006200387_220464_S18   H3K4me3
A006200387_220471_S21   H3K4me3_ChIP
A006200387_220473_S22   ATF4_ChIP
A006200387_220477_S24   H3K4me3_ChIP
A006200387_220479_S25   ATF4_ChIP
A006200387_220484_S27   ATF4_fix
A006200387_220488_S29   ATF4_fix

Genrich nicely creates a narrowPeak file using the noted ChIP files and the provided Input/IgG files. Thank you.

katsikora commented 5 months ago

Hi, for multiple comparisons, the sample sheet would have to look like this:

name[\t]condition[\t]group
sample1[\t]control[\t]group1
sample2[\t]control[\t]group1
sample3[\t]treatment[\t]group1
sample4[\t]treatment[\t]group1
sample5[\t]control[\t]group2
sample6[\t]control[\t]group2
sample7[\t]treatment[\t]group2
sample8[\t]treatment[\t]group2

In this example, the sample sheet would be split into two groups, each containing 2 control and 2 treatment samples, and two instances of differential binding analysis would be run.

Also, you have the option to use the same control for all treatment samples: in that case you can set the "group" column value for these samples to "All".
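To make the grouping rule concrete, here is a minimal Python sketch of how a three-column sheet could be split into one comparison per group, with the "All" rows shared across every group. It only illustrates the idea described above and is not the snakePipes implementation: the function name `split_by_group`, the example sample names, and the exact, case-sensitive match on "All" are assumptions.

```python
import csv
import io
from collections import defaultdict


def split_by_group(sheet_text):
    """Hypothetical helper: return {group: [rows]} from a tab-separated sheet.

    Rows whose group is "All" (assumed exact match) are appended to every
    other group, so the same controls are reused in each comparison.
    """
    rows = list(csv.DictReader(io.StringIO(sheet_text), delimiter="\t"))
    shared = [r for r in rows if r["group"] == "All"]
    groups = defaultdict(list)
    for r in rows:
        if r["group"] != "All":
            groups[r["group"]].append(r)
    for members in groups.values():
        members.extend(shared)
    return dict(groups)


# Made-up sheet: two shared controls, two treatment groups.
sheet = "\n".join([
    "name\tcondition\tgroup",
    "sample1\tcontrol\tAll",
    "sample2\tcontrol\tAll",
    "sample3\ttreatment\tgroup1",
    "sample4\ttreatment\tgroup1",
    "sample5\ttreatment\tgroup2",
    "sample6\ttreatment\tgroup2",
])

result = split_by_group(sheet)
for group, members in sorted(result.items()):
    print(group, [(m["name"], m["condition"]) for m in members])
```

Each resulting group then holds two treatment samples plus the two shared controls, which corresponds to one differential binding run per group.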

Hope this helps, Best wishes,

Katarzyna