Closed Chrisxie03 closed 7 months ago
Hi @Chrisxie03
Thank you for your message and for providing the log messages. Sorry to see that you're experiencing issues with ViroConstrictor. From your message and logs I can see two independent issues. One is a bug (the auto-updater) and the other is what I believe to be mostly a documentation issue.
First of all, the auto-updating error you're seeing is a bug, so thank you for bringing this to our attention. The program exits because it believes a necessary configuration setting is missing. That isn't right, however: this specific configuration setting shouldn't be required.
It can be worked around pretty easily though.
To fix it, please delete any existing configuration(s) that you have by running: rm ~/.ViroConstrictor_*
Then run the pipeline as normal; you'll be prompted with the first-time setup questions again. For the auto-updating prompt please answer "no", and for the subsequent ask-to-update prompt answer "yes".
From this point onwards ViroConstrictor will no longer update fully automatically to every new version. Instead, you'll get a yes/no prompt asking whether you want to update to the newest available version.
With this change you should no longer experience the issue.
You can of course still run the pipeline with the --skip-updates flag anyway, but if you do then I'd recommend that you at least update manually to the current latest version (1.4.2) beforehand, as this version has some bugfixes that should help with the rest of the analysis. This can be done with either of the following commands: mamba update viroconstrictor or mamba install viroconstrictor==1.4.2
I'll see if I can fix the updater functionality for the next release.
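As a rough illustration of the kind of fix meant here (not the actual ViroConstrictor code), the lookup shown in the traceback, conf["GENERAL"]["ask_for_update"] == "yes", could be made tolerant of a missing key. The sketch below assumes the configuration is an INI-style file handled with Python's configparser; the function name should_ask_for_update and the auto_update key in the example are hypothetical.

```python
import configparser


def should_ask_for_update(conf: configparser.ConfigParser) -> bool:
    """Return True only when GENERAL/ask_for_update is explicitly set to 'yes'."""
    # fallback="no" is returned when the section or the key is absent,
    # so a missing setting no longer aborts the program.
    value = conf.get("GENERAL", "ask_for_update", fallback="no")
    return value.strip().lower() == "yes"


# Hypothetical configuration that lacks the ask_for_update key entirely.
conf = configparser.ConfigParser()
conf.read_string("[GENERAL]\nauto_update = yes\n")
print(should_ask_for_update(conf))  # False
```

Treating a missing key as "no" is only one possible default; prompting the user again would be another reasonable choice.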
The second error you're experiencing is unrelated, and I believe it has to do with the formatting of inputs and the lack of proper documentation for this, in this case specifically for the references.
From the log messages I see you're trying to analyze Influenza A data with the match-reference process enabled, and I'm going to assume also with the --segmented flag/mode set to True for these samples.
I quickly want to clarify what the various modes mean specifically.
If the match-reference and segmented modes are both disabled then a multi-reference analysis is still possible. ViroConstrictor will simply run the entire analysis for every possible reference. This happens on a per-sample basis.
Using this setting works if your (multi-)reference fasta looks as follows:
>A-HA-H1-NC_026433
atgaaagtaaaactactggtcctg...
>A-MP-MP-NC_026433
atgagtcttctaaccgaggtcgaa...
>A-NA-N9-NC_026429
atgaacccaaatcaaaagataata...
>A-NP-NP-NC_026436
atgagtgacatcgaagccatggcg...
>A-NS-NS-NC_007375
atggattcccacactgtgtcaagc...
If the match-reference mode is enabled and segmented mode is disabled then ViroConstrictor will pick the one reference that best fits the provided data. This happens on a per-sample basis.
For example, if a multi-reference file containing 100 fasta references is provided for a sample, then ViroConstrictor will try out all 100 references for this sample and pick the one reference that fits this specific sample best.
This setting is to be used when using (for example) a sequencing protocol that is ambiguous for various subtypes of the same non-segmented viral-target. Please see below for an example of the multi-reference fasta file.
>measles_subtype1
atgaaagtaaaactactggtcctg...
>measles_subtype2
atgagtcttctaaccgaggtcgaa...
>measles_subtype3
atgaacccaaatcaaaagataata...
>measles_subtype4
atgagtgacatcgaagccatggcg...
If the match-reference and segmented modes are both enabled then ViroConstrictor will search for the best-fitting reference for each segment of a virus. This again happens on a per-sample basis but requires some specific formatting for the multi-reference fasta.
In this case, ViroConstrictor will choose one best-fitting reference per segment per sample.
To make this work, the multi-reference fasta with all the possible references for all the segments must be formatted as follows:
>A.HA_01 HA|H1|H1N1
atgaaagtaaaactactggtcc...
>A.HA_02 HA|H3|H3N2
atgaagactatcattgctttga...
>A.HA_03 HA|H5|H5N1
atgaagactatcattgctttga...
>A.MP_01 MP|MP|H1N1
atgagtcttctaaccgaggtcg...
>A.MP_02 MP|MP|H3N2
atgagccttcttaccgaggtcg...
>A.MP_03 MP|MP|H5N1
atgagtcttctaaccgaggtcg...
>A.NA_01 NA|N1|H1N1
atgaacccaaatcaaaagataa...
>A.NA_02 NA|N2|H3N2
atgaatccaaatcaaaagataa...
>A.NA_03 NA|N1|H5N1
atgaatccaaatcaaaagataa...
The formatting comes down to the following structure:
Personal identifier Segment-name|Segment-subtype|Extra-information
In the final analysis, the Segment-name and your personal identifier get swapped so the folders with results won't get messy.
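To make the header structure concrete, here is a small sketch that splits such a header into its four parts. It is only an illustration of the layout described above, not the parser ViroConstrictor uses internally; the names SegmentHeader, HEADER_RE and parse_header are made up for this example.

```python
import re
from typing import NamedTuple


class SegmentHeader(NamedTuple):
    identifier: str  # personal identifier, e.g. "A.HA_01"
    segment: str     # segment name, e.g. "HA"
    subtype: str     # segment subtype, e.g. "H1"
    extra: str       # extra information, e.g. "H1N1"


# ">" + identifier, whitespace, then three pipe-separated fields.
HEADER_RE = re.compile(r"^>(\S+)\s+([^|\s]+)\|([^|\s]+)\|(.+)$")


def parse_header(line: str) -> SegmentHeader:
    """Split a fasta header of the form '>ID SEGMENT|SUBTYPE|EXTRA'."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"header does not match the expected layout: {line!r}")
    return SegmentHeader(*match.groups())


print(parse_header(">A.HA_01 HA|H1|H1N1"))
# SegmentHeader(identifier='A.HA_01', segment='HA', subtype='H1', extra='H1N1')
```

A header such as >A-HA-H1-NC_026433 has no space-separated description with pipe-delimited fields, so parse_header raises a ValueError for it.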
For your specific case, I think the formatting of the reference file doesn't quite fit any of these setups.
If you want to analyze influenza data and the multi-reference fasta only has one option per segment then it's fine to leave the match-reference and segmented modes disabled.
If you do have multiple reference options per segment then it's necessary to format the fasta headers as in the example above.
This is also where the splitting error currently comes from. Because match-reference and segmented are both enabled, ViroConstrictor expects the reference fasta to be formatted as in the example above. That isn't the case here, and therefore it exits with an error.
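If you'd like to check a reference file before starting a run, something along these lines could help. This is a hedged sketch and not part of ViroConstrictor: the function check_reference_headers is hypothetical, and it only tests for the 'ID SEGMENT|SUBTYPE|EXTRA' layout described above, grouping identifiers per segment and listing headers that don't fit.

```python
from collections import defaultdict


def check_reference_headers(fasta_text: str):
    """Group reference identifiers per segment and collect non-conforming headers."""
    per_segment = defaultdict(list)
    malformed = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence line, not a header
        parts = line[1:].split(maxsplit=1)  # [identifier, description]
        fields = parts[1].split("|") if len(parts) == 2 else []
        if len(fields) == 3 and all(f.strip() for f in fields):
            per_segment[fields[0].strip()].append(parts[0])
        else:
            malformed.append(line)
    return dict(per_segment), malformed


example = """>A.HA_01 HA|H1|H1N1
atgaaagtaa
>A.HA_02 HA|H3|H3N2
atgaagacta
>A-NA-N9-NC_026429
atgaacccaa
"""
segments, malformed = check_reference_headers(example)
print(segments)   # {'HA': ['A.HA_01', 'A.HA_02']}
print(malformed)  # ['>A-NA-N9-NC_026429']
```

If every segment lists only one identifier, the match-reference and segmented modes can stay disabled; any entry in the malformed list would need to be reformatted before enabling both modes.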
The following last message you see can be ignored:
/mnt/studentfiles/2024/2024MBI08/mambaforge/envs/viroconstrictor/bin/ViroConstrictor:10: DeprecationWarning: The parameter "ln" is deprecated since v2.5.2. Instead of ln=1 use new_x=XPos.LMARGIN, new_y=YPos.NEXT.
sys.exit(main())
This is merely a deprecation message about something that will be replaced/fixed in a future version, and it has no impact on the analysis.
I hope I was able to clear some things up for you. If you have any other questions please let us know.
Kind regards, Florian
Hi @florianzwagemaker,
Thank you for the comment, ViroConstrictor is now running!!
Kind regards,
Chris
Hi everyone, I have installed ViroConstrictor version 1.4.1 via Conda. I am running a multi-sequence analysis with the following command:
ViroConstrictor -i 'data/' -o 'output_VC1/' -samples 'samplesheet.tsv' --platform 'nanopore' -at 'end-to-end'
When running the command I get the following output in my terminal:
[08/04/24 16:36:19] INFO ViroConstrictor version: 1.4.1
[08/04/24 16:36:19] INFO Succesfully read global configuration file
[08/04/24 16:36:19] INFO Valid FastQ files were found in the input directory. ('data/')
[08/04/24 16:36:19] INFO Successfully parsed all command line arguments
[08/04/24 16:36:19] WARNING 2 Ambiguous nucleotides found in file /mnt/studentfiles/2024/2024MBI08/viroconstrictor/influenza_reference.fasta in record A-HA-H1-NC_026433: R
Please check whether this is intended.
[08/04/24 16:36:19] WARNING 1 Ambiguous nucleotides found in file /mnt/studentfiles/2024/2024MBI08/viroconstrictor/influenza_reference.fasta in record A-PB1-PB1-NC_007375: N
Please check whether this is intended.
[08/04/24 16:36:19] WARNING 1 Ambiguous nucleotides found in file /mnt/studentfiles/2024/2024MBI08/viroconstrictor/influenza_reference.fasta in record A-NA-N9-NC_026429: Y
Please check whether this is intended.
[08/04/24 16:36:19] WARNING 1 Ambiguous nucleotides found in file /mnt/studentfiles/2024/2024MBI08/viroconstrictor/influenza_reference.fasta in record A-NP-NP-NC_026436: R
Please check whether this is intended.
[08/04/24 16:36:19] WARNING 1 Ambiguous nucleotides found in file /mnt/studentfiles/2024/2024MBI08/viroconstrictor/influenza_reference.fasta in record A-PA-PA-NC_026437: R
Please check whether this is intended.
Traceback (most recent call last):
File "/mnt/studentfiles/2024/2024MBI08/mambaforge/envs/viroconstrictor/bin/ViroConstrictor", line 10, in <module>
sys.exit(main())
^^^^^^
File "/mnt/studentfiles/2024/2024MBI08/mambaforge/envs/viroconstrictor/lib/python3.11/site-packages/ViroConstrictor/main.py", line 144, in main
update(sys.argv, parsed_input.user_config)
File "/mnt/studentfiles/2024/2024MBI08/mambaforge/envs/viroconstrictor/lib/python3.11/site-packages/ViroConstrictor/update.py", line 87, in update
ask_prompt = conf["GENERAL"]["ask_for_update"] == "yes"