USDA-VS / vSNP3

vSNP -- validate SNPs
GNU General Public License v3.0

Error in Test step 2 with AF2122 (Mycobacterium bovis) #9

Closed: sophiehoyer closed this issue 3 months ago

sophiehoyer commented 3 months ago

Hello!

I installed vsnp3 with conda on macOS (M2 chip) and am trying to run the test with AF2122 (Mycobacterium bovis).

I was able to download the test files, add references (with vsnp3_path_adder.py), and complete step 1, as instructed.

However, I am running into an error when I run:

vsnp3_step2.py -wd . -a -t Mycobacterium_AF2122

Error message:

Traceback (most recent call last):
  File "/Users/sh/opt/miniconda3/envs/bioinfo/bin/vsnp3_step2.py", line 405, in <module>
    group = Group(cwd=global_working_dir, metadata=args.metadata, defining_snps=args.defining_snps, excel_remove=args.remove_by_name, gbk_list=args.gbk, dataframes=vcf_to_df.dataframes, all_vcf=args.all_vcf, find_new_filters=args.find_new_filters, no_filters=args.no_filters, qual_threshold=int(args.qual_threshold), n_threshold=int(args.n_threshold), mq_threshold=int(args.mq_threshold), abs_pos=args.abs_pos, group=args.group, debug=args.debug)
  File "/Users/sh/opt/miniconda3/envs/bioinfo/bin/vsnp3_group_on_defining_snps.py", line 363, in __init__
    tree = self.raxml_table_build(group)
  File "/Users/sh/opt/miniconda3/envs/bioinfo/bin/vsnp3_group_on_defining_snps.py", line 534, in raxml_table_build
    tables.build_tables()
  File "/Users/sh/opt/miniconda3/envs/bioinfo/bin/vsnp3_fasta_to_snps_table.py", line 182, in build_tables
    with open(self.tree, 'rt') as tree_file: #must be the single line newick format.  Not Nexus which will be mutliline often with formating
FileNotFoundError: [Errno 2] No such file or directory: '/Users/sh/vsnp3_test_dataset/AF2122_test_files/step2/Mbovis-01/Mbovis-01_2024-06-13_14-32-24.tre'

How can I fix this? Thanks for your help!

stuber commented 3 months ago

The steps you’ve taken suggest you’ve done everything correctly. In your Mbovis-All folder, do you have a .fasta file with sequences in it?
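One quick way to answer that question is to count the records in each .fasta file. This is a minimal Python sketch, not part of vSNP3: `count_fasta_records` is a hypothetical helper name, and it assumes the files use the `.fasta` extension and standard `>` header lines.

```python
from pathlib import Path


def count_fasta_records(folder):
    """Map each *.fasta file in folder to its number of records.

    A record is counted for every line that starts with '>'.
    A count of 0 means the file exists but holds no sequences.
    """
    counts = {}
    for fasta in sorted(Path(folder).glob("*.fasta")):
        with open(fasta, "rt") as handle:
            counts[fasta.name] = sum(1 for line in handle if line.startswith(">"))
    return counts
```

Calling `count_fasta_records(".")` from inside the group folder returns a dictionary of file names and record counts. An empty dictionary means no .fasta file was written at all.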

My best guess is that vsnp cannot find raxml. Type raxml and press the Tab key twice to see which raxml executables your environment provides. Then go to the folder that holds them and make a soft link (ln -s) from one of them to the name "raxml". It will look something like this:

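# find where the raxml executables are installed
which raxmlHPC
# go to that folder; the path depends on your conda install and environment name
cd ~/.conda/envs/vsnp3/bin
# this will show the different raxml types available.
ls raxml*
# then soft link a type. You may need to try a couple.
ln -s raxmlHPC raxml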

vsnp will then find "raxml" and run whichever raxml type the link points to.
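The same steps can be scripted. This is a minimal Python sketch, not part of vSNP3: `find_raxml_variants` and `link_raxml` are hypothetical helper names, and it assumes all raxml executables sit in one bin folder, such as the conda environment's bin directory.

```python
import os
from pathlib import Path


def find_raxml_variants(bin_dir):
    """Return sorted names of executables in bin_dir that start with 'raxml'.

    The plain name 'raxml' is excluded, since that is the link we want to create.
    """
    variants = []
    for entry in sorted(Path(bin_dir).iterdir()):
        if entry.name.startswith("raxml") and entry.name != "raxml":
            if entry.is_file() and os.access(entry, os.X_OK):
                variants.append(entry.name)
    return variants


def link_raxml(bin_dir, variant=None):
    """Create bin_dir/raxml as a symlink to one raxml variant.

    Returns the variant name that was linked, or None if nothing was done
    (a 'raxml' entry already exists, or no variant was found).
    If `variant` is not among the variants found, the first one is used.
    """
    bin_dir = Path(bin_dir)
    link = bin_dir / "raxml"
    if link.exists() or link.is_symlink():
        return None  # leave an existing file or link untouched
    variants = find_raxml_variants(bin_dir)
    if not variants:
        return None
    target = variant if variant in variants else variants[0]
    # Relative link, equivalent to running `ln -s <target> raxml` inside bin_dir.
    link.symlink_to(target)
    return target
```

For a conda environment, `link_raxml(Path(sys.executable).parent)` (after `import sys`) would target the bin folder of the active environment. To check first whether a link is needed, `shutil.which("raxml")` returns None when no executable named raxml is on the PATH.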

I’m not 100% sure why a raxml type is not always found from a conda install. If you have any knowledge as to why, please share.

This should fix the issue if raxml is the cause. If this does not work, let me know.

sophiehoyer commented 3 months ago

That worked like a charm! I used the soft link you suggested, raxml -> raxmlHPC.

Thanks so much for your help, Tod.