dylkot closed this issue 5 years ago
Hi @dylkot, we actually talked about this with @seb-mueller earlier this week, and we will add a conversion from the fasta file to a tsv file to do exactly this.
Great, thanks!
@dylkot I've just created a feature branch that addresses this. Could you try it out? https://github.com/Hoohm/dropSeqPipe/tree/feature/fastqc_auto_adapter
@Hoohm, could you review the changes?
Ultimately, it creates fastqc_adapter.tsv in the root dir based on the adapter fasta. This is then used by FastQC instead of the default adapters.
Once ok, I'll merge it into develop.
I've also added or updated a few sensible drop-seq adapters in templates/custom_adapters.fa. I think this could serve as a good collection to get started with drop-seq. Thoughts?
An example output adapter view of the test-data sample1_R2_fastqc.html:
Just merged it into develop and Travis gave an error:
ModuleNotFoundError in line 58 of /home/travis/build/Hoohm/dropSeqPipe/rules/fastqc.smk:
No module named 'Bio'
Seems like the biopython module isn't available in the environment where the rule runs.
I have already tried adding an environment that contains biopython to the rule in fastqc.smk, as below:
...
conda: '../envs/merge_bam.yaml'
run:
...
But this gave the following error:
RuleException in line 57 of /home/user/code/dropSeqPipe/rules/fastqc.smk:
Conda environments are only allowed with shell, script, or wrapper directives (not with run).
Do you know how to make biopython available to a run directive?
Sadly you can't. You have to move the code into a script and call it with the script directive, which does accept a conda env.
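For anyone landing here later, a minimal sketch of that workaround. This is not the code that was merged. The script path, the rule name and the `parse` hook are hypothetical, and the hook exists only so the function can be exercised without Biopython installed. The `snakemake` object is the one Snakemake injects into scripts run via the script directive.

```python
# scripts/fa2tsv.py  (hypothetical file name)
#
# Hypothetical rule in rules/fastqc.smk that would call this script:
#
#   rule fasta_fastq_adapter:
#       input:  "custom_adapters.fa"
#       output: "fastqc_adapter.tsv"
#       conda:  "../envs/merge_bam.yaml"   # env that provides biopython
#       script: "../scripts/fa2tsv.py"


def convert(fasta_path, tsv_path, parse=None):
    """Write one 'name<TAB>sequence' line per FASTA record."""
    if parse is None:
        # Imported lazily: Biopython only has to exist in the rule's conda env.
        from Bio import SeqIO
        parse = SeqIO.parse
    with open(fasta_path) as fasta, open(tsv_path, "w") as tsv:
        for record in parse(fasta, "fasta"):
            tsv.write(f"{record.id}\t{str(record.seq)}\n")


# Snakemake defines a global `snakemake` object when it runs a script,
# so this branch is skipped if the file is imported or run by hand.
if "snakemake" in globals():
    convert(snakemake.input[0], snakemake.output[0])
```

Because the rule uses script rather than run, the conda directive is accepted and the import of Bio happens inside that environment.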
This is now integrated in the develop branch.
Currently fastqc doesn't seem to look for the adapters that are used for cutadapt. As a result, it reports no adapter contamination for samples that I know have adapter contamination. Unfortunately, cutadapt expects the adapters file to be in fasta format, whereas fastqc wants it in a tab-delimited format. Perhaps we could add another configuration parameter for a fastqc adapter file? Or else we could convert the fasta file provided for cutadapt into a file that can be used as input for fastqc.
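The second option can be sketched in a few lines of plain Python. This is only an illustration under some assumptions, not the pipeline's implementation: the function name and the example adapter names are made up, and it assumes FastQC's adapter list takes one name, a tab, and the sequence per line. Any cutadapt-specific anchoring characters in the sequences are not handled.

```python
def fasta_to_fastqc_adapters(fasta_text):
    """Convert FASTA text into adapter-list lines of the form name<TAB>sequence."""
    records = []  # [name, sequence] pairs, kept in file order
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: a tab in the name would break the TSV, so replace it.
            records.append([line[1:].strip().replace("\t", " "), ""])
        elif records:
            # Sequence line: sequences may wrap over several lines, so append.
            records[-1][1] += line
    # Records without any sequence are skipped.
    return "".join(f"{name}\t{seq}\n" for name, seq in records if seq)


# Made-up adapter names, for illustration only.
example = """>adapter_A
AGATCGGAAGAG
>adapter_B
CTGTCTCTTA
TACACATCT
"""
print(fasta_to_fastqc_adapters(example), end="")
```

The resulting file can then be passed to FastQC through its `--adapters` option, while cutadapt keeps reading the original fasta.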