replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

--minLength and --maxLength are overwritten when using the 1200 bp amplicon scheme (--primerV V1200) flag #116

Closed k8r3l closed 3 years ago

k8r3l commented 3 years ago

Hi,

The setting of the min and max length are overwritten by the pipeline, but I can't seem to retrace why.

I set the following

nextflow run replikation/poreCov --fastq_pass {/my/dir/to/}fastq_pass/ -r 0.8.0 --primerV V1200 --rapid --minLength 250 --maxLength 1400 --output {/my/dir/to/}results/ -profile local,docker --cores 32 --memory 128

But it is overwritten:

1200 bp amplicon scheme is used [--primerV V1200]
  --minLength set to 500bp
  --maxLength set to 1500bp

Any help would be much appreciated 🙏🏼. Thanks in advance!


Full list of settings from the pipeline:

Profile:             local,docker
Current User:        root
Nextflow-version:    21.04.0
poreCov-version:     0.8.0

Pathing:
...

______________________________________
Parameters:
Primerscheme:        V1200 [--primerV]
Medaka model:        r941_min_high_g360 [--medaka_model]
Update Pangolin?:    false [--update]
CPUs to use:         32 [--cores]
Memory in GB:        128 [--memory]
______________________________________

1200 bp amplicon scheme is used [--primerV V1200]
  --minLength set to 500bp
  --maxLength set to 1500bp
______________________________________

RKI output for german DESH upload:
Output stored at:    ...
Min Identity to NC_045512.2: 0.90 [--seq_threshold]
Min Coverage:        20 [ no parameter]
Proportion cutoff N: 0.05 [--n_threshold]

replikation commented 3 years ago

Hi,

Yes, this is currently "hardcoded" for V1200, and I have not yet had the time to parameterize it for the V1200 primers and test it properly. I'll keep the issue open until it's included. However, I believe it should not affect your results?

But thanks for reporting, I'll add this.
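
The behaviour requested in this issue, where scheme defaults apply only when no explicit value is passed, could be sketched roughly as follows. This is a Python illustration of the logic only, not poreCov's Nextflow code: `SCHEME_DEFAULTS` and `resolve_length_window` are hypothetical names, and the only default window included is the one printed in the log above for `--primerV V1200`.

```python
from typing import Optional, Tuple

# Window reported in the log for --primerV V1200. Other schemes are left out
# because their defaults are not stated in this thread.
SCHEME_DEFAULTS = {"V1200": (500, 1500)}


def resolve_length_window(
    primer_scheme: str,
    min_length: Optional[int] = None,
    max_length: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (min, max) read length, letting explicit user values win."""
    default_min, default_max = SCHEME_DEFAULTS[primer_scheme]
    # Fall back to the scheme default only when the user gave no value.
    low = default_min if min_length is None else min_length
    high = default_max if max_length is None else max_length
    if low > high:
        raise ValueError(f"min length {low} exceeds max length {high}")
    return low, high


print(resolve_length_window("V1200"))             # (500, 1500)
print(resolve_length_window("V1200", 250, 1400))  # (250, 1400)
```

With this kind of lookup, `--minLength 250 --maxLength 1400` from the original command would be kept instead of being replaced by the scheme defaults.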

k8r3l commented 3 years ago

Hi,

Awesome! 🙇🏼

Well, for a few samples we know that filtering away too many of the shorter reads causes them to dip below the threshold and fail in the final call. The people who have been analysing these samples set the minimum length at 250, and for some samples of poorer RNA quality they still manage to make reliable calls, while we are filtering out too much. Thus it would be nice to have the flexibility.

Anyway, we'll continue as is, but we're looking forward to the feature.

Many thanks for the tool!

IFIK-virology commented 3 years ago

This would be great if it's included... the epi2me labs SARS-CoV-2 notebook allows 150 and 1200, which definitely gives more sequencing depth.

replikation commented 3 years ago

@IFIK-virology Hi, yes, it will be included in the next release. It's currently in our testing loop with the latest guppy changes (#117).

IFIK-virology commented 3 years ago

Hi,

The cut-off of 500 and 1200 nt is based on the native barcoding kit, but we use the rapid barcoding kit (which is gaining traction), which gives fragment sizes from 150 to 1200 nt. Therefore, the current parameters used to filter reads in the ARTIC step result in a loss of ¼ of the read data (e.g., 5 Mb from a total of 20 Mb), which is quite significant.
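
To check how much data a given length window discards on your own run, a small sketch like the one below could help. It is only an illustration: `retained_fraction` is a hypothetical helper, the read lengths are made up, and treating both bounds as inclusive is an assumption rather than a statement about how the pipeline filters.

```python
from typing import Iterable


def retained_fraction(read_lengths: Iterable[int], min_length: int, max_length: int) -> float:
    """Fraction of sequenced bases kept by a length filter (bounds inclusive)."""
    lengths = list(read_lengths)
    total_bases = sum(lengths)
    if total_bases == 0:
        return 0.0
    kept_bases = sum(n for n in lengths if min_length <= n <= max_length)
    return kept_bases / total_bases


# Made-up read lengths, for illustration only.
example_lengths = [200, 300, 450, 600, 800, 1000, 1150]

for low, high in [(500, 1500), (150, 1200)]:
    fraction = retained_fraction(example_lengths, low, high)
    print(f"{low}-{high} nt keeps {fraction:.0%} of bases")
```

Running this on the real read-length distribution of a rapid barcoding run would show directly how many bases each window removes.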

Bw, Ronald
