jts / ncov-tools

Small collection of tools for performing quality control on coronavirus sequencing data and genomes
MIT License

Rule 'make_sample_qc_summary': Hard-coded filenames #12

Closed dfornika closed 4 years ago

dfornika commented 4 years ago

I've been having a bit of trouble running the all_qc_summary target.

It seems that the make_sample_qc_summary rule makes a few assumptions about filenames that aren't connected to config.yaml:

https://github.com/jts/ncov-tools/blob/8271770d662dbcf07d0f4f12611ad82781d0f398/qc/Snakefile#L333-L338

It seems to assume that input files are in a directory named data, but most other rules use the data_root value from the config file.

It also assumes that the consensus FASTA is named like {sample}.primertrimmed.consensus.fa, but other rules use the consensus_pattern value from the config file.
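
To illustrate the difference, here is a minimal sketch of a hard-coded lookup next to a config-driven one. It is not the code from the Snakefile: the function names, the example config values, and the shape of the consensus_pattern string (with {data_root} and {sample} placeholders) are assumptions made for illustration. Only the key names data_root and consensus_pattern come from this report.

```python
# Hypothetical config values; only the key names are taken from this issue.
config = {
    "data_root": "run_data",
    "consensus_pattern": "{data_root}/{sample}.consensus.fasta",
}


def hardcoded_consensus_path(sample):
    """Hard-coded variant: ignores the config, so it only works when the
    files live in ./data and follow one specific naming scheme."""
    return f"data/{sample}.primertrimmed.consensus.fa"


def configured_consensus_path(config, sample):
    """Config-driven variant: fills the pattern from config.yaml values,
    so the directory and filename scheme can change without code edits."""
    return config["consensus_pattern"].format(
        data_root=config["data_root"], sample=sample
    )


if __name__ == "__main__":
    print(hardcoded_consensus_path("sampleA"))
    print(configured_consensus_path(config, "sampleA"))
```

With a layout like the hypothetical one above, the hard-coded variant points at a file that does not exist, which would be consistent with the rule failing while other rules work.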

jts commented 4 years ago

Thanks for the report, Dan. We’ll have a look.

Jared

(In reply to Dan Fornika's report of Jul 21, 2020, 6:05 PM.)

rdeborja commented 4 years ago

Hi Dan. I've just submitted a PR to address the hard-coded filenames and to stay consistent with the other file lookup functions in the Snakefile. As for the trouble you've been running into, the current implementation of all_qc_summary works with Illumina data only. I have some modifications to support nanopore runs and am currently implementing and testing them with the Snakefile for pipeline integration. Which platform are you currently running?

dfornika commented 4 years ago

Thanks @rdeborja, we're running both MinION and Illumina.

rdeborja commented 4 years ago

@dfornika were you getting a specific error when running the QC pipeline? Also, did it happen for both Illumina and nanopore runs? I would like to try to replicate the problem before submitting any more changes.

jts commented 4 years ago

I believe this is now fixed with the updated QC modules.