Hoohm / dropSeqPipe

A SingleCell RNASeq pre-processing snakemake workflow
Creative Commons Attribution Share Alike 4.0 International

adding cutadapt rule in test data config to get it runnable #46

Closed seb-mueller closed 6 years ago

seb-mueller commented 6 years ago

This is a small amendment towards getting the .test data running, which requires the cutadapt rules.

Note that lane pooling doesn't work at this time. I couldn't figure out how the data has to be named (sample_L00*_R1_001.fastq.gz, sample_R1_001.fastq.gz, or something else entirely). It seems to be incompatible with rule_fastqc and/or other rules. Could you fix that as well? Thanks!

Hoohm commented 6 years ago

Yes, as of now, the preparation of multi-lane FASTQ files is not working as intended. I wanted to split the preparation step from the normal pipeline so that you could use it if you want to, but it wouldn't be the default path.

I haven't figured out how to integrate it. If you have suggestions, I'm all ears. One way I thought of is an input function that would determine which format the data is in and then run the appropriate merging.

seb-mueller commented 6 years ago

Yes, automatic recognition should be rather straightforward for NextSeq 500 and similar machines, since their output conforms to the _L00*_R1_001.fastq.gz pattern. Alternatively, this could be, as you said, an optional step. If the data is split across multiple lanes (this can be checked using the filenames as above) and the step is omitted, an error or warning could be thrown pointing out that this step should be run, giving the user more control.
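
To make the filename check concrete, here is a rough Python sketch of how it might look. It is not code from dropSeqPipe: the names `classify_fastqs` and `samples_needing_merge` are made up for illustration, and it assumes lane-split files carry a three-digit lane number (`_L001_`, `_L002_`, ...) while already-merged files are named `sample_R1_001.fastq.gz`.

```python
import re
import warnings
from collections import defaultdict

# Assumed lane-split naming, e.g. sample_L001_R1_001.fastq.gz
LANE_RE = re.compile(r"^(?P<sample>.+)_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$")
# Assumed merged naming, e.g. sample_R1_001.fastq.gz
MERGED_RE = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])_001\.fastq\.gz$")


def classify_fastqs(filenames):
    """Split file names into lane-split groups and already-merged files.

    Both results are keyed by (sample, read). Names matching neither
    pattern are ignored.
    """
    lanes = defaultdict(list)
    merged = {}
    for name in sorted(filenames):
        # Try the lane pattern first, because a lane-split name would
        # also match the looser merged pattern.
        m = LANE_RE.match(name)
        if m:
            lanes[(m["sample"], m["read"])].append(name)
            continue
        m = MERGED_RE.match(name)
        if m:
            merged[(m["sample"], m["read"])] = name
    return dict(lanes), merged


def samples_needing_merge(filenames):
    """Return samples that have lane-split files but no merged file, and warn."""
    lanes, merged = classify_fastqs(filenames)
    pending = sorted({sample for (sample, read) in lanes if (sample, read) not in merged})
    if pending:
        warnings.warn(
            "Lane-split FASTQ files found for: "
            + ", ".join(pending)
            + ". Run the lane merging step first."
        )
    return pending
```

A Snakemake input function could call `classify_fastqs` on the contents of the data directory and return the list of lane files for the requested sample, while the warning variant would fit the optional-step approach.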

Hoohm commented 6 years ago

I've added the changes to develop. I didn't use your PR because there were a few more changes to add to the config.yaml, and it seemed easier to just make the change locally.

Going to work on cleaning up the branches today. I should come up with a clean "gitflow" structure with this, and then we can start implementing different ideas easily.