UPHL-BioNGS / Cecret

Reference-based consensus creation
MIT License

Add tools specific for MonkeyPox #103

Closed DrB-S closed 1 year ago

DrB-S commented 2 years ago

Please add tools specific for working with MonkeyPox

erinyoung commented 2 years ago

Nextclade supposedly has some support for MonkeyPox, although I haven't had the chance to test this.

Most MPX sequencing currently involves unselected WGS or bait WGS. There may be a combined bait/amplicon library prep method. I don't have anything for a combined bait/amplicon library prep method yet, BUT!

If the sequencing was done on an Illumina instrument, with either bait-enriched or general WGS libraries, Cecret should still work. Once I get my hands on additional MPX samples I can be more confident in its analysis and provide a profile.

These params would need to be adjusted in a config file (or a really long command line):

params.reference_genome = <path to NC_063383.1>
params.gff_file = <path to corresponding gff file>
params.trimmer = 'none'
params.minimum_depth = 100
params.nextclade_dataset = 'hMPXV'
params.kraken2_organism = 'Monkeypox virus'

# SARS-CoV-2 specific processes would need to be turned off
params.pangolin = false
params.freyja = false
params.freyja_aggregate = false

# VADR should work on MPX, but I don't know if there's a container with those files
params.vadr = false

DrB-S commented 2 years ago

Thanks!

Stephen M. Beckstrom-Sternberg, PhD
Bioinformatics Contractor
Arizona State Public Health Lab
Arizona Department of Health Services
Cell: (602) 653-5011

erinyoung commented 2 years ago

The fasta for the reference can be found at https://www.ncbi.nlm.nih.gov/nuccore/NC_063383.1. I wanted to give you the URL for the gff file as well, but that's going to take a second.

DrB-S commented 2 years ago

I have that reference, but couldn’t find a gff. I was going to try creating one with Prokka.

DrB-S commented 2 years ago

Downloaded reference fasta, GenBank, and gff files from NCBI: https://www.ncbi.nlm.nih.gov/nuccore/NC_063383.1

erinyoung commented 2 years ago

There's now a monkeypox subworkflow with built-in defaults.

SARS-CoV-2 is still the default, but setting

params.species = 'mpx'

should set the correct reference genome fasta and gff file, vadr, kraken2, nextclade/nextalign, and iqtree2 outgroup.

There's also a monkeypox profile, which is for metagenomic sequencing. This changes the minimum depth to 10X (since it's not amplicon based) and turns off primer trimming as well as setting the species to mpx.
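
A minimal sketch of what that profile amounts to, written as a standalone Nextflow config file rather than a profile. It assumes the parameter names mentioned earlier in this thread (`params.minimum_depth`, `params.trimmer`) are the ones the profile changes; the file name `mpx_metagenomic.config` is hypothetical, and none of this has been checked against the workflow's own profile definition.

```groovy
// mpx_metagenomic.config (hypothetical file name)
// Mirrors the description above: mpx species, 10X minimum depth, no primer trimming.
params {
    species       = 'mpx'   // switches reference, gff, vadr, kraken2, nextclade defaults
    minimum_depth = 10      // lower than the amplicon-based value since reads are not amplicon derived
    trimmer       = 'none'  // assumed to be how primer trimming is turned off (as in the earlier comment)
}
```

A file like this would be passed with Nextflow's standard `-c` option, for example `nextflow run UPHL-BioNGS/Cecret -c mpx_metagenomic.config`.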

DrB-S commented 2 years ago

Will it also automatically add the nextalign_options for genes, so that I do not need to use my own config file?

Thanks so much!

erinyoung commented 2 years ago

It's going to use the default nextalign options, so if there are specific mpx genes that you're interested in focusing on, you'd need to specify those.
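
As a rough sketch of specifying those genes in a config file: this assumes the parameter is named `params.nextalign_options` (as the question above suggests) and that its value is passed straight through to nextalign, whose `--genes` option takes a comma-separated list. The gene names are placeholders, not real MPX gene names; they would need to match the names in the gff file being used.

```groovy
// Hypothetical override: restrict nextalign to genes of interest.
params {
    species           = 'mpx'
    // <gene1>,<gene2> are placeholders; replace with gene names from the mpx gff file
    nextalign_options = '--genes <gene1>,<gene2>'
}
```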

DrB-S commented 1 year ago

Got it. Thanks!
