Gian77 closed this issue 1 year ago
I suspect your input is using UTF-16 or another character encoding that isn't compatible with UTF-8.
If you try the file command on your inputs, that will hopefully tell you what encoding the file uses, e.g.
$ file README.md
README.md: UTF-8 Unicode text
If it is UTF-8, we'll need the contig that causes the problem to see if we can work out what's going on. If it isn't UTF-8, then the easier solution is to convert the encoding prior to using antiSMASH with something like iconv.
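As a cross-check on what file reports, the same question can be asked of the raw bytes directly. The following is a minimal Python sketch, not part of antiSMASH; the function names are illustrative assumptions. It reports the offset and value of the first byte that is not valid UTF-8, which is the same information the UnicodeDecodeError message carries.

```python
from pathlib import Path
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first undecodable byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte the codec rejected
        return err.start, data[err.start]
    return None


def check_file(path: str) -> Optional[Tuple[int, int]]:
    """Read a file as raw bytes and test whether it is valid UTF-8."""
    return first_invalid_utf8(Path(path).read_bytes())
```

Calling check_file on each input and getting None back for all of them would suggest the inputs themselves are not the source of the bad byte.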
Hi @SJShaw,
Thanks a lot for the fast answer. I checked all my genomes' PROKKA_*.fna files and they are encoded as ASCII:
[benucci@dev-amd20 code]$ for dir in ../data/*/; do bgc=$(find $dir -type f -name "PROKKA*fna"); file -bi $bgc; done
text/plain; charset=us-ascii
text/plain; charset=us-ascii
text/plain; charset=us-ascii
...
Isn't ASCII a subset of UTF-8? Also, the files are all encoded the same way, so I am not sure why antismash runs on only some of the contig files.
I read that file is not always precise, so I tried recode as follows, and it gave me the same result.
[benucci@dev-amd20 code]$ for dir in ../data/*/; do bgc=$(find $dir -type f -name "PROKKA*fna"); if recode utf8/..UCS < $bgc >/dev/null 2>&1; then echo "Valid utf8 : $bgc"; else echo "NOT valid utf8: $bgc"; fi; done
Valid utf8 : ../data/PvP001-Pacbio_Pseudomonas_coleopterorum_pbio-2432.22830.bc1021_BAK8B_OA--bc1021_BAK8B_OA.ccs_results/PROKKA_10122022.fna
Valid utf8 : ../data/PvP002-Illumina_Pseudomonas_oryzihabitans_D_52634.2.402975.ATCGATCG-ATCGATCG_results/PROKKA_10242022.fna
Valid utf8 : ../data/PvP003-Illumina_Pseudomonas_syringae_52616.3.395054.CGTTGCAA-CGTTGCAA_results/PROKKA_10242022.fna
...
I can send you a couple of genomes if you give me an email address. I am sorry I cannot attach them here, since they aren't public in JGI yet.
Thanks so much,
-Gian
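On the question above: yes, ASCII is a strict subset of UTF-8, so a file that really is pure ASCII is already valid UTF-8, and converting it from ASCII to UTF-8 should not change a single byte. A minimal Python sketch of that fact (the function names are illustrative assumptions):

```python
def ascii_is_utf8_subset() -> bool:
    """True if all 128 ASCII bytes decode to the same text under both codecs."""
    all_ascii = bytes(range(128))
    return all_ascii.decode("ascii") == all_ascii.decode("utf-8")


def reencoding_changes_bytes(data: bytes) -> bool:
    """True if decoding as ASCII and re-encoding as UTF-8 alters the bytes."""
    return data.decode("ascii").encode("utf-8") != data
```

A pure ASCII file also cannot contain bytes such as 0xff or 0xb0, since every ASCII byte is below 0x80.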
Hi @SJShaw,
I tried converting from ASCII to UTF-8 for one of the genome contig files that was not working (please see above).
[benucci@dev-amd20 testing]$ iconv -f ASCII -t UTF-8 < PROKKA_10112022.fna > PROKKA_10112022_conv.fna
[benucci@dev-amd20 testing]$ iconv -f ASCII -t UTF-8 < PROKKA_10112022.gff > PROKKA_10112022_conv.gff
Then I ran antismash again on those specific files as follows:
(antismash) [benucci@dev-amd20 testing]$ antismash --cpus 20 -v --taxon bacteria --genefinding-gff3 PROKKA_10112022_conv.gff --genefinding-tool none --output-dir antismash/ PROKKA_10112022_conv.fna
And I inevitably got the same UnicodeDecodeError, please see below:
INFO 12/12 13:32:37 antiSMASH version: 6.1.1
INFO 12/12 13:32:37 diamond using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/diamond (0.9.24)
INFO 12/12 13:32:37 hmmpfam2 using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmpfam2 (2.3.2)
INFO 12/12 13:32:37 fasttree using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/fasttree
INFO 12/12 13:32:37 hmmsearch using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmsearch (3.1b2)
INFO 12/12 13:32:37 hmmpress using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmpress (3.1b2)
INFO 12/12 13:32:37 hmmscan using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmscan (3.1b2)
INFO 12/12 13:32:37 meme using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/meme (4.11.2)
INFO 12/12 13:32:37 fimo using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/fimo (4.11.2)
INFO 12/12 13:32:37 glimmerhmm using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/glimmerhmm
INFO 12/12 13:32:37 prodigal using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/prodigal (V2.6.3)
INFO 12/12 13:32:37 muscle using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/muscle (v3.8.1551)
INFO 12/12 13:32:38 java using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/java (11.0.13)
INFO 12/12 13:32:38 blastp using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/blastp (2.5.0+)
INFO 12/12 13:32:38 makeblastdb using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/makeblastdb (2.5.0+)
INFO 12/12 13:32:38 Parsing input sequence 'PROKKA_10112022.fna'
INFO 12/12 13:32:39 GFF3 and sequence have only one record. Assuming is the same as long as coordinates are compatible.
WARNING 12/12 13:32:41 Fasta header too long: renamed "gnl|AIT|--prefix_1" to "c00001_gnl|AIT.."
INFO 12/12 13:32:46 Analysing record: c00001_gnlAIT..
INFO 12/12 13:32:46 Detecting secondary metabolite clusters
INFO 12/12 13:32:46 Running antismash.detection.hmm_detection
INFO 12/12 13:32:46 HMM detection using strictness: relaxed
INFO 12/12 13:33:02 16 region(s) detected in record
INFO 12/12 13:33:02 Running antismash.detection.genefunctions
INFO 12/12 13:33:38 Running antismash.detection.nrps_pks_domains
INFO 12/12 13:33:49 Running antismash.modules.lanthipeptides
Traceback (most recent call last):
File "/mnt/home/benucci/anaconda2/envs/antismash/bin/antismash", line 10, in <module>
sys.exit(entrypoint())
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/__main__.py", line 125, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/__main__.py", line 113, in main
antismash.run_antismash(sequence, options)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 674, in run_antismash
result = _run_antismash(sequence_file, options)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 734, in _run_antismash
analysis_timings = analyse_record(record, options, get_analysis_modules(), module_results)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 299, in analyse_record
run_module(record, module, options, previous_result, timings)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 271, in run_module
results = module.run_on_record(record, results, options)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/__init__.py", line 111, in run_on_record
return run_specific_analysis(record)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/specific_analysis.py", line 757, in run_specific_analysis
run_lanthi_on_genes(record, gene, cluster, neighbours, results)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/specific_analysis.py", line 705, in run_lanthi_on_genes
result_vec = run_lanthipred(record, candidate, lant_class, domains)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/specific_analysis.py", line 577, in run_lanthipred
hmmer_profiles[lant_class], lant_class)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/specific_analysis.py", line 513, in determine_precursor_peptide_candidate
cleavage_result = run_cleavage_site_phmm(lan_a_fasta, hmmer_profile, THRESH_DICT[lant_class])
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/specific_analysis.py", line 478, in run_cleavage_site_phmm
return predict_cleavage_site(profile, fasta, threshold)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/lanthipeptides/specific_analysis.py", line 435, in predict_cleavage_site
hmmer_res = subprocessing.run_hmmpfam2(query_hmmfile, target_sequence)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/common/subprocessing/hmmpfam.py", line 39, in run_hmmpfam2
result = execute(command, stdin=target_sequence)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/common/subprocessing/base.py", line 95, in execute
stderr == PIPE)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/common/subprocessing/base.py", line 32, in __init__
self.stdout = stdout.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 542: invalid start byte
What should I do? I can send this contigs file to you if it helps. Thanks so much! -Gian
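The traceback above ends in stdout.decode() inside antiSMASH's subprocess wrapper, called from run_hmmpfam2. That means the byte 0xff at position 542 is in the text that hmmpfam2 printed, not necessarily in the input FASTA or GFF. A minimal Python sketch for inspecting such output follows; the function name and the sample bytes are assumptions for illustration, not antiSMASH code.

```python
def describe_decode_failure(raw: bytes, context: int = 20) -> str:
    """Decode tool output as UTF-8, or describe where decoding fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        lo = max(0, err.start - context)
        hi = min(len(raw), err.end + context)
        # backslashreplace keeps undecodable bytes visible as \xNN escapes
        snippet = raw[lo:hi].decode("utf-8", errors="backslashreplace")
        return (f"byte 0x{raw[err.start]:02x} at position {err.start}: "
                f"...{snippet}...")
```

Seeing the text on either side of the bad byte usually shows which field of the tool's output was mangled, for example a sequence name.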
Hello again, could I have an email address to send the contigs to? Thanks so much! Gian
If you head here and start a ticket, you can then respond to the email you get, attaching the (first?) offending contig.
@SJShaw Thanks a lot, will do it right now.
Hello @SJShaw, did you get the contig that was giving me trouble? I sent it twice. Please let me know; otherwise I will attach it here, since sharing it while unpublished should not be a big issue. Thanks much! G.
No, nothing's come through. I suspect it might be over the file size limit for the email system.
If you'd prefer a slightly more private version of sharing it, you could submit it to the antiSMASH webservice and then forward the job ID along to that ticket you created.
Hello @SJShaw ,
No worries, I think sharing it here is fine for now. I am sending contig files from 2 genomes. One worked OK with antiSMASH (in case you need it as a comparison), the other did not work. They are in separate folders. In total, I have about 20 contig files (out of about 110) from genomes for which antiSMASH gave me the same issue. If you need them, I can send them all (in that case maybe through email or FTP).
Thanks a lot, Gian
Well, the bad news is that both genomes work fine for me, which means it's likely an environment problem.
Neither of these has a lanthipeptide protocluster, which is the module where your example logs above have the error. For the non-working variant in your upload, could you give the stack trace that you get for it?
Edit: The non-working one does have a thiopeptide, which would use logic very similar to the lanthipeptide, where the working one doesn't.
Digging into the GFF file, I can see some URL encoded elements, like product=3'%2C5'-cyclic adenosine, which the non-working variant has more of. The general CDS naming scheme starting with --prefix also seems a little risky. It's possible that your particular hmmpfam2 doesn't handle those very well at all.
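The two observations above (percent-escaped attribute values and identifiers starting with --prefix) can be checked across many GFF files with a short script. This is a minimal Python sketch under stated assumptions: the function names are illustrative, the file is assumed to be tab-separated GFF3 with nine columns, and the sequence section is assumed to follow a ##FASTA marker. Percent-escapes are valid GFF3 on their own, so they are only reported, not treated as errors.

```python
import re
from typing import Dict, Iterable, List
from urllib.parse import unquote

PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def parse_attributes(column9: str) -> Dict[str, str]:
    """Split a GFF3 attribute column into a key -> raw value mapping."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def suspicious_features(lines: Iterable[str]) -> List[str]:
    """Report IDs or locus tags starting with '-', and percent-escaped values."""
    reports = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("##FASTA"):
            break  # everything after this marker is sequence, not annotation
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue
        for key, value in parse_attributes(columns[8]).items():
            if key in ("ID", "locus_tag") and value.startswith("-"):
                reports.append(f"line {number}: {key} looks like an option: {value}")
            elif PERCENT_ESCAPE.search(value):
                reports.append(f"line {number}: {key} is escaped: {unquote(value)}")
    return reports
```

Passing an open file handle, e.g. suspicious_features(open(path)), works because the function only iterates over lines.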
@SJShaw
interesting... Here is the trace...
(antismash) [benucci@dev-amd20 antismash_test]$ antismash --cpus 20 -v --taxon bacteria --genefinding-gff3 PROKKA_10242022.gff --genefinding-tool none --output-dir output/ PROKKA_10242022.fna
INFO 12/01 12:15:52 antiSMASH version: 6.1.1
INFO 12/01 12:15:52 diamond using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/diamond (0.9.24)
INFO 12/01 12:15:52 hmmpfam2 using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmpfam2 (2.3.2)
INFO 12/01 12:15:52 fasttree using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/fasttree
INFO 12/01 12:15:52 hmmsearch using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmsearch (3.1b2)
INFO 12/01 12:15:52 hmmpress using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmpress (3.1b2)
INFO 12/01 12:15:52 hmmscan using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/hmmscan (3.1b2)
INFO 12/01 12:15:52 meme using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/meme (4.11.2)
INFO 12/01 12:15:52 fimo using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/fimo (4.11.2)
INFO 12/01 12:15:52 glimmerhmm using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/glimmerhmm
INFO 12/01 12:15:52 prodigal using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/prodigal (V2.6.3)
INFO 12/01 12:15:52 muscle using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/muscle (v3.8.1551)
INFO 12/01 12:15:52 java using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/java (11.0.13)
INFO 12/01 12:15:53 blastp using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/blastp (2.5.0+)
INFO 12/01 12:15:53 makeblastdb using executable: /mnt/home/benucci/anaconda2/envs/antismash/bin/makeblastdb (2.5.0+)
INFO 12/01 12:15:53 Parsing input sequence 'PROKKA_10242022.fna'
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_1" to "c00001_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_2" to "c00002_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_3" to "c00003_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_4" to "c00004_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_5" to "c00005_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_6" to "c00006_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_7" to "c00007_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_8" to "c00008_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_9" to "c00009_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_16" to "c00010_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_17" to "c00011_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_18" to "c00012_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_19" to "c00013_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_20" to "c00014_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_21" to "c00015_gnl|AIT.."
WARNING 12/01 12:15:55 Fasta header too long: renamed "gnl|AIT|--prefix_22" to "c00016_gnl|AIT.."
INFO 12/01 12:15:57 No genes found, skipping record
INFO 12/01 12:15:57 No genes found, skipping record
INFO 12/01 12:15:58 Analysing record: c00001_gnlAIT..
INFO 12/01 12:15:58 Detecting secondary metabolite clusters
INFO 12/01 12:15:58 Running antismash.detection.hmm_detection
INFO 12/01 12:15:58 HMM detection using strictness: relaxed
INFO 12/01 12:15:59 No regions detected, skipping record
INFO 12/01 12:15:59 Analysing record: c00002_gnlAIT..
INFO 12/01 12:15:59 Detecting secondary metabolite clusters
INFO 12/01 12:15:59 Running antismash.detection.hmm_detection
INFO 12/01 12:15:59 HMM detection using strictness: relaxed
INFO 12/01 12:16:00 No regions detected, skipping record
INFO 12/01 12:16:00 Analysing record: c00003_gnlAIT..
INFO 12/01 12:16:00 Detecting secondary metabolite clusters
INFO 12/01 12:16:00 Running antismash.detection.hmm_detection
INFO 12/01 12:16:00 HMM detection using strictness: relaxed
INFO 12/01 12:16:01 No regions detected, skipping record
INFO 12/01 12:16:01 Analysing record: c00004_gnlAIT..
INFO 12/01 12:16:01 Detecting secondary metabolite clusters
INFO 12/01 12:16:01 Running antismash.detection.hmm_detection
INFO 12/01 12:16:01 HMM detection using strictness: relaxed
INFO 12/01 12:16:02 1 region(s) detected in record
INFO 12/01 12:16:02 Running antismash.detection.genefunctions
INFO 12/01 12:16:03 Running antismash.detection.nrps_pks_domains
INFO 12/01 12:16:04 Running antismash.modules.lanthipeptides
INFO 12/01 12:16:04 Running antismash.modules.lassopeptides
INFO 12/01 12:16:04 Running antismash.modules.nrps_pks
INFO 12/01 12:16:04 Running antismash.modules.sactipeptides
INFO 12/01 12:16:04 Running antismash.modules.t2pks
INFO 12/01 12:16:04 Running antismash.modules.thiopeptides
INFO 12/01 12:16:04 Running antismash.modules.tta
INFO 12/01 12:16:04 Skipping TTA codon detection, GC content too low: 49%
INFO 12/01 12:16:04 Analysing record: c00005_gnlAIT..
INFO 12/01 12:16:04 Detecting secondary metabolite clusters
INFO 12/01 12:16:04 Running antismash.detection.hmm_detection
INFO 12/01 12:16:04 HMM detection using strictness: relaxed
INFO 12/01 12:16:04 No regions detected, skipping record
INFO 12/01 12:16:04 Analysing record: c00006_gnlAIT..
INFO 12/01 12:16:04 Detecting secondary metabolite clusters
INFO 12/01 12:16:04 Running antismash.detection.hmm_detection
INFO 12/01 12:16:04 HMM detection using strictness: relaxed
INFO 12/01 12:16:05 No regions detected, skipping record
INFO 12/01 12:16:05 Analysing record: c00009_gnlAIT..
INFO 12/01 12:16:05 Detecting secondary metabolite clusters
INFO 12/01 12:16:05 Running antismash.detection.hmm_detection
INFO 12/01 12:16:05 HMM detection using strictness: relaxed
INFO 12/01 12:16:10 3 region(s) detected in record
INFO 12/01 12:16:10 Running antismash.detection.genefunctions
INFO 12/01 12:16:18 Running antismash.detection.nrps_pks_domains
INFO 12/01 12:16:20 Running antismash.modules.lanthipeptides
INFO 12/01 12:16:20 Running antismash.modules.lassopeptides
INFO 12/01 12:16:20 Running antismash.modules.nrps_pks
INFO 12/01 12:16:20 Predicting A domain substrate specificities with NRPSPredictor2
INFO 12/01 12:16:22 Predicting CAL domain substrate specificities by Minowa et al. method
INFO 12/01 12:16:22 Predicting PKS KR activity and stereochemistry using KR fingerprints from Starcevic et al.
INFO 12/01 12:16:22 Running antismash.modules.sactipeptides
INFO 12/01 12:16:22 Running antismash.modules.t2pks
INFO 12/01 12:16:22 Running antismash.modules.thiopeptides
Traceback (most recent call last):
File "/mnt/home/benucci/anaconda2/envs/antismash/bin/antismash", line 10, in <module>
sys.exit(entrypoint())
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/__main__.py", line 125, in entrypoint
sys.exit(main(sys.argv[1:]))
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/__main__.py", line 113, in main
antismash.run_antismash(sequence, options)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 674, in run_antismash
result = _run_antismash(sequence_file, options)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 734, in _run_antismash
analysis_timings = analyse_record(record, options, get_analysis_modules(), module_results)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 299, in analyse_record
run_module(record, module, options, previous_result, timings)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/main.py", line 271, in run_module
results = module.run_on_record(record, results, options)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/thiopeptides/__init__.py", line 88, in run_on_record
return specific_analysis(record)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/thiopeptides/specific_analysis.py", line 617, in specific_analysis
result_vec = run_thiopred(thio_feature, thio_type, domains)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/thiopeptides/specific_analysis.py", line 528, in run_thiopred
result = determine_precursor_peptide_candidate(query, domains)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/thiopeptides/specific_analysis.py", line 459, in determine_precursor_peptide_candidate
end, score = run_cleavage_site_phmm(thio_a_fasta, 'thio_cleave.hmm', -3.00)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/thiopeptides/specific_analysis.py", line 418, in run_cleavage_site_phmm
return predict_cleavage_site(profile, input_fasta, threshold)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/modules/thiopeptides/specific_analysis.py", line 334, in predict_cleavage_site
hmmer_res = subprocessing.run_hmmpfam2(query_hmmfile, target_sequence)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/common/subprocessing/hmmpfam.py", line 39, in run_hmmpfam2
result = execute(command, stdin=target_sequence)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/common/subprocessing/base.py", line 95, in execute
stderr == PIPE)
File "/mnt/home/benucci/anaconda2/envs/antismash/lib/python3.7/site-packages/antismash/common/subprocessing/base.py", line 32, in __init__
self.stdout = stdout.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 545: invalid start byte
(antismash) [benucci@dev-amd20 antismash_test]$
Edit: I believe the >--prefix naming is the standard for PROKKA... hmm, what should I try doing? Any ideas?
Thanks much, G.
--prefix is part of the argument you give prokka, it should be something like --prefix something, but maybe in the runs that generated this it ended up --prefix something--prefix? It's certainly not something I've seen in quite a lot of PROKKA outputs.
That it's falling over again in hmmpfam2 doesn't surprise me. Try running the nisin cluster that's shipped with antiSMASH as a test case, in the source tree it's antismash/test/integration/data/nisin.gbk. If that still fails in hmmpfam2, it's definitely your environment at fault, possibly a bad/strange build of the HMMer binary.
If it works, test out the theory that it's gene naming/annotations by not using your GFF annotations, just use prodigal to do the gene finding. Run with --genefinding-tool prodigal instead of --genefinding-gff3 ... and it will still find that particular thiopeptide cluster with exactly the same genes. If it works after that, then it's your annotations.
Hello @SJShaw ,
Well, thanks so much. The Prokka output is exactly the problem, and in particular the --prefix. In my original annotation script I was assigning a variable, the strain of the genome, to the --prefix. The fact is that for some genomes I have no strain, so prokka was weirdly adding --prefix to the CDS labels in the files of the genomes with no strain. So weird. I removed the --prefix from the script and now it seems to work OK; I tested on the same offending contig file I sent you.
I guess I am going to reannotate those contigs that did not work.
Really, thanks so much for helping me out with this!
Gian
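One plausible mechanism for the behaviour described above, not confirmed in this thread, is that an option given an empty value lets the next flag on the command line be read as its value. A minimal Python sketch of a guard against that when building a command in a script follows. The helper name is hypothetical, and the option names and file names in the usage note are example values only.

```python
from typing import List, Mapping, Optional


def build_command(program: str, options: Mapping[str, Optional[str]],
                  positional: List[str]) -> List[str]:
    """Build an argv list, dropping options whose value is empty or None."""
    argv = [program]
    for flag, value in options.items():
        if value is None or not value.strip():
            continue  # omit the flag entirely rather than pass it without a value
        if value.startswith("-"):
            raise ValueError(f"value for {flag} looks like an option: {value!r}")
        argv.extend([flag, value])
    return argv + positional
```

For example, build_command("prokka", {"--strain": "", "--prefix": "PvP002"}, ["contigs.fasta"]) drops the empty --strain and keeps --prefix PvP002. Passing the resulting list to subprocess.run also avoids shell word splitting of unquoted empty variables.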
No problem, I'm glad it's resolved. We might still make some changes to protect against bad names like this.
Hello,
Describe the bug: I am running antismash over a set of assembled genomes and I am getting a weird error on some of the genomes. It runs fine on most, but gives me this error on a few. Please see the error below.
System (please complete the following information): I am running Antismash from an HPC that mounts:
Antismash run
How I tried to solve it, with no success: I saw there is a similar post about this error, but it relates to running Antismash from Docker under Singularity, while I installed it through conda and I am on an HPC. I tried exporting the LANG variable like export LANG=C.UTF-8, but that does not seem to work either. Any clue? I can send one of the genomes that failed to run if needed to reproduce this. Thanks much in advance! -Gian