nextstrain / seasonal-flu

Scripts, config, and snakefiles for seasonal-flu nextstrain builds

Integrating FASTA Sequence into Influenza A/H3N2 Evolution Analysis and Visualizing in Nextclade #147

Closed sekhwal closed 8 months ago

sekhwal commented 8 months ago

I'm looking to integrate my FASTA sequence into the evolutionary analysis of Influenza A/H3N2 using Nextclade and visualize it appropriately. While I utilized Nextclade for analysis, I encountered difficulties in adding the year information to the x-axis of the phylogenetic tree. Any suggestion would be appreciated.

joverlee521 commented 8 months ago

Hi @sekhwal,

As @rneher stated in the discussion forum (https://discussion.nextstrain.org/t/how-to-add-years-on-x-axis-using-nextclade/1550), Nextclade does not support time-scaled trees, so you will have to run a full Nextstrain phylogenetic workflow to create the time-scaled tree.

We are currently lacking documentation on how to run the seasonal flu workflow with custom sequences. The easiest way to get started for now will be to follow the Quickstart with GISAID data (https://github.com/nextstrain/seasonal-flu?tab=readme-ov-file#quickstart-with-gisaid-data).

sekhwal commented 8 months ago

Thank you for your help. However, I could not find the "EpiFlu" link in the top navigation bar at GISAID (https://gisaid.org/). I am not sure whether I have to register at GISAID to get the EpiFlu link. Also, please let me know how to get "profiles/gisaid/builds.yaml"; a template for preparing "builds.yaml" would be great.

joverlee521 commented 8 months ago

> However, I could not find the "EpiFlu" link in the top navigation bar at GISAID (https://gisaid.org/). I am not sure if I have to register at GISAID to get the EpiFlu link.

You will need to register at GISAID in order to access and download data from them.

> Also, please let me know how to get "profiles/gisaid/builds.yaml"; a template for preparing "builds.yaml" would be great.

You can start with the existing profiles/gisaid/builds.yaml file in this repo.

sekhwal commented 8 months ago

I have some more follow-up questions.

  1. Downloading the sequences from GISAID takes a very long time, and it only allows 20,000 sequences to be downloaded. Also, when running the Nextstrain build with the following command, where do I provide the downloaded FASTA files as input?

    nextstrain build . --configfile profiles/gisaid/builds.yaml \
      --use-conda --conda-frontend mamba

  2. In addition, should I download the "seasonal-flu" GitHub repo?

  3. In builds.yaml, do I need to change anything in the following part? Where should I provide the metadata file?

    reference: "config/h3n2/{segment}/reference.fasta"
    annotation: "config/h3n2/{segment}/genemap.gff"
    tree_exclude_sites: "config/h3n2/{segment}/exclude-sites.txt"
    clades: "config/h3n2/ha/clades.tsv"
    subclades: "config/h3n2/ha/subclades.tsv"
    auspice_config: "config/h3n2/auspice_config.json"

joverlee521 commented 8 months ago

> Also, when running the Nextstrain build with the following command, where do I provide the downloaded FASTA files as input?

Following the Quickstart with GISAID data, please move your downloaded files to data/h3n2/metadata.xls and data/h3n2/raw_sequences_ha.fasta.
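
A minimal sketch of that file placement, assuming the two GISAID downloads are sitting on disk under the hypothetical names `gisaid_epiflu_isolates.xls` and `gisaid_epiflu_sequence.fasta` (your actual download names may differ). The `place_gisaid_files` helper is illustrative and not part of the workflow; only the two destination paths come from the Quickstart. It copies rather than moves, so the original downloads are kept.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy GISAID downloads to the paths the workflow expects.
# Usage: place_gisaid_files <metadata.xls> <sequences.fasta> [repo_dir]
place_gisaid_files() {
  local metadata_src="$1"
  local sequences_src="$2"
  local repo_dir="${3:-.}"   # root of the seasonal-flu checkout (defaults to cwd)
  local f

  # Fail early if either download is missing.
  for f in "$metadata_src" "$sequences_src"; do
    if [ ! -f "$f" ]; then
      echo "missing input file: $f" >&2
      return 1
    fi
  done

  mkdir -p "$repo_dir/data/h3n2"
  # Destination names are the ones given in the Quickstart.
  cp "$metadata_src" "$repo_dir/data/h3n2/metadata.xls"
  cp "$sequences_src" "$repo_dir/data/h3n2/raw_sequences_ha.fasta"
}

# Example call (source file names are assumptions):
# place_gisaid_files gisaid_epiflu_isolates.xls gisaid_epiflu_sequence.fasta .
```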

> In addition, should I download the "seasonal-flu" GitHub repo?

Yes, you will need to download the seasonal flu repo to run the workflow.

> In builds.yaml, do I need to change anything in the following part?

Try using the default values first to produce the build. Then if you would like to make adjustments, you can edit the parameters in the builds.yaml file.
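
One way to follow that advice is to check that the inputs are in place and then run the build command from earlier in this thread unchanged. The sketch below assumes you are in the root of the seasonal-flu checkout and that the `nextstrain` CLI is installed; the `run_default_build` function name is made up for illustration, while the paths and command-line flags are the ones already quoted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: verify inputs, then run the build with the default config.
run_default_build() {
  local metadata="data/h3n2/metadata.xls"
  local sequences="data/h3n2/raw_sequences_ha.fasta"
  local config="profiles/gisaid/builds.yaml"
  local f

  # Each file must exist and be non-empty before the build starts.
  for f in "$metadata" "$sequences" "$config"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done

  # Same command as quoted earlier in the thread, with no config edits.
  nextstrain build . --configfile "$config" \
    --use-conda --conda-frontend mamba
}

# Run from the repo root:
# run_default_build
```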

joverlee521 commented 8 months ago

Closing since the conversation has continued in https://github.com/nextstrain/seasonal-flu/issues/149.