Closed goldenmole1 closed 2 months ago
Can you provide us with the data you are working on (--sequence-db img_data_12990-26/2636416040/2636416040.genes.fna) or, if it is too big, just the beginning of the file?
The authors of the TXSScan models (https://github.com/macsy-models/TXSScan) do not provide GA thresholds for these two profiles, but MacSyFinder has a mechanism that falls back to the hmmsearch e-value in that case. What you see in the log is just a warning, not an error.
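To see which profiles trigger this fallback, one can look for a GA line in the profile headers. The following is a minimal sketch, assuming the profiles are HMMER3 text files in which the optional gathering cutoff appears on a header line starting with "GA"; the function name has_ga_threshold is hypothetical and is not part of MacSyFinder or HMMER.

```shell
#!/bin/bash
# Hypothetical helper (not part of MacSyFinder): report whether an
# HMMER3 text profile declares a GA (gathering) cutoff in its header.
# Assumption: the cutoff, when present, is on a line starting with "GA".
has_ga_threshold() {
    local profile="$1"
    if grep -q '^GA[[:space:]]' "$profile"; then
        echo "GA threshold present"
    else
        echo "no GA threshold"
    fi
}
```

Called on a profile file from the models directory, a "no GA threshold" answer would be consistent with the warning shown for T6SSiii_tssO and T6SSiii_tssQ.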
Thank you so much. The file looks like this: head img_data_12990-26/2636416040/2636416040.genes.fna
2639022136 Ga0070510_0001 1..190(-)(Ga0070510_11) [Chitinophaga arvensicola DSM 3695] TTACAATGGAGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGGCAGGCC TAATACATGCAAGTCGAGGGGCAGCACAGGTAGCAATACTGGGTGGCGAC CGGCAAACGGGTGCGGAACACGTACGCAACCTTCCTTCAAGCGGGGAATA GCCCAGAGAAATTTGGATTAATACCCCATAAGAATGTGGA
2639022137 Ga0070510_0002 730..1494(-)(Ga0070510_11) [Chitinophaga arvensicola DSM 3695] ATGAACAGGTACTTTATAGAAGTAGGATATAAGGGGGCGCAGTACAGCGG GTTCCAGGTACAGGAAAATGCACATTCCGTACAGGCGGAGATTGACAGGG CGCTGGGTATATTATTCCGGTCGCCCATAGAAACTACGGGATCCAGCAGA
The problem comes from your input data. MacSyFinder works on protein sequences, not on genomic (nucleotide) data. See https://macsyfinder.readthedocs.io/en/latest/user_guide/input.html#input-dataset
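A quick way to catch this before launching hundreds of jobs is to check the alphabet of the FASTA file. The sketch below is a heuristic only, not something MacSyFinder provides: the function name guess_fasta_type, the 50-line sample, and the 90% cutoff are all assumptions chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of MacSyFinder): guess whether a FASTA
# file contains nucleotide or protein sequences.
# Heuristic: sample the first 50 non-header lines and measure the share
# of letters that are A, C, G, T, U or N.
guess_fasta_type() {
    local fasta="$1"
    local residues nt_only total nt
    # Drop header lines, keep a sample, strip whitespace, upper-case.
    residues=$(grep -v '^>' "$fasta" | head -n 50 \
        | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]')
    total=${#residues}
    if [ "$total" -eq 0 ]; then
        echo "empty"
        return 1
    fi
    # Keep only nucleotide letters and count them.
    nt_only=$(printf '%s' "$residues" | tr -cd 'ACGTUN')
    nt=${#nt_only}
    # At least 90% nucleotide letters: treat the file as DNA/RNA.
    if [ $(( nt * 100 / total )) -ge 90 ]; then
        echo "nucleotide"
    else
        echo "protein"
    fi
}
```

For the file shown above, guess_fasta_type img_data_12990-26/2636416040/2636416040.genes.fna would be expected to report nucleotide, which means --sequence-db should point at a protein FASTA of the same genes instead.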
Describe the bug
I see these errors:
GA bit thresholds unavailable on profile T6SSiii_tssO. Switch to e-value threshold (-E 0.100000)
GA bit thresholds unavailable on profile T6SSiii_tssQ. Switch to e-value threshold (-E 0.100000)
To Reproduce
Steps to reproduce the behavior:
#!/bin/bash
#SBATCH --mem=500G
#SBATCH -c 16
source ~/anaconda3/etc/profile.d/conda.sh
conda activate macsyfinder_env
macsyfinder --e-value-search 0.1 --sequence-db img_data_12990-26/2636416040/2636416040.genes.fna -o macsyfinder_test_2636416040 --models-dir TXSScandir/ --models TXSScan all --db-type ordered_replicon -w 16
Expected behavior
I ran the exact same script with hundreds of bacterial genomes that have reported T6SSs, but MacSyFinder failed to find any T6SS genes from any of these genomes.
Screenshots Macsyfinder 2.1.3 using:
MacsyFinder is distributed under the terms of the GNU General Public License (GPLv3). See the COPYING file for details.
If you use this software please cite: Néron, Bertrand; Denise, Rémi; Coluzzi, Charles; Touchon, Marie; Rocha, Eduardo P.C.; Abby, Sophie S. MacSyFinder v2: Improved modelling and search engine to identify molecular systems in genomes. Peer Community Journal, Volume 3 (2023), article no. e28. doi : 10.24072/pcjournal.250. https://peercommunityjournal.org/articles/10.24072/pcjournal.250/ and don't forget to cite models used: macsydata cite
command used: /clusterfs/jgi/groups/science/homes/heejungcho/anaconda3/envs/macsyfinder_env/bin/macsyfinder --e-value-search 0.1 --sequence-db img_data_12990-26/2636416040/2636416040.genes.fna -o macsyfinder_test_2636416040 --models-dir TXSScandir/ --models TXSScan all --db-type ordered_replicon -w 16
models used: TXSScan-1.1.3
######################### Searching systems ##########################
Models Parsing
MacSyFinder's results will be stored in working_dirmacsyfinder_test_2636416040
Analysis launched on img_data_12990-26/2636416040/2636416040.genes.fna for model(s):
###################### Computing best solutions ######################
####### Writing down results in 'macsyfinder_test_2636416040' ########
No Systems found in this dataset.
END
Please complete the following information:
OS: Linux
MacSyFinder Version:
Macsyfinder 2.1.3 using: