cfarkas / SARS-CoV-2-freebayes

Analysis of SARS-CoV-2 genome variants collected with freebayes variant caller
MIT License

pyfaidx.FastaIndexingError: and [E::fai_build3_core] #3

Open Krysasp opened 2 years ago

Krysasp commented 2 years ago

Hi there, I'm getting these errors when working on a non-GISAID FASTA file.

At first, in the conda env, I executed:

`samtools faidx data/bayes/covid19-refseq.fasta  # the dir I copied the reference to, to create the fai file`
`ulimit -n 1000000 && ulimit SARS-CoV-2-FASTA-freebayes -f data/test/merged.fasta -g data/bayes/covid19-refseq.fasta -t 12`

The output:

fixing names in FASTA
Splitting fasta files with faidx (python)
Split is done. Continue with FASTA alignments
Aligning fasta files to reference and call variants with freebayes (option C 1)

[E::fai_build3_core] Failed to open the file data/bayes/2nd/workdir/covid19-refseq.fasta
[faidx] Could not build fai index data/bayes/2nd/workdir/covid19-refseq.fasta.fai

I also installed from source without conda, but when executing:

`SARS-CoV-2-processing-fasta -f data/test/merged.fasta -g data/bayes/covid19-refseq.fasta -t 12`

The output:

fixing names in FASTA file
[ERRO] xopen: no content
Splitting fasta files with faidx (python)
Traceback (most recent call last):
  File "/home/ihcm-ubuntu/anaconda3/bin/faidx", line 33, in <module>
    sys.exit(load_entry_point('pyfaidx==0.6.0', 'console_scripts', 'faidx')())
  File "/home/ihcm-ubuntu/anaconda3/lib/python3.9/site-packages/pyfaidx-0.6.0-py3.9.egg/pyfaidx/cli.py", line 195, in main
  File "/home/ihcm-ubuntu/anaconda3/lib/python3.9/site-packages/pyfaidx-0.6.0-py3.9.egg/pyfaidx/cli.py", line 19, in write_sequence
  File "/home/ihcm-ubuntu/anaconda3/lib/python3.9/site-packages/pyfaidx-0.6.0-py3.9.egg/pyfaidx/__init__.py", line 998, in __init__
  File "/home/ihcm-ubuntu/anaconda3/lib/python3.9/site-packages/pyfaidx-0.6.0-py3.9.egg/pyfaidx/__init__.py", line 435, in __init__
  File "/home/ihcm-ubuntu/anaconda3/lib/python3.9/site-packages/pyfaidx-0.6.0-py3.9.egg/pyfaidx/__init__.py", line 607, in build_index
  File "/home/ihcm-ubuntu/anaconda3/lib/python3.9/site-packages/pyfaidx-0.6.0-py3.9.egg/pyfaidx/__init__.py", line 579, in build_index
pyfaidx.FastaIndexingError: The FASTA file merged.GISAID.fasta does not contain a valid sequence. Check that sequence definition lines start with '>'

I have tried reinstalling pyfaidx (downgraded from 0.6.5 to the current 0.6.0) and the latest sratoolkit, samtools, htslib, and other libraries. I also tried a FASTA file containing sequences with headers in GISAID format. Could there be something wrong with my custom sequence headers, or did the dependencies fail to install properly?

I hope we can find a solution to this.
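For anyone hitting the same pair of messages: `xopen: no content` followed by the pyfaidx complaint about `merged.GISAID.fasta` may suggest that the intermediate file ended up empty, or that the input has no lines starting with `>`. A minimal sketch for checking the input before running the pipeline, using only the Python standard library. The function name `check_fasta` is hypothetical and is not part of this repository or of pyfaidx.

```python
from pathlib import Path


def check_fasta(path):
    """Return a list of basic structural problems found in a FASTA file.

    Hypothetical helper for illustration only. An empty list means the
    file exists, is non-empty, starts with a '>' header, and every header
    is followed by at least one sequence line.
    """
    p = Path(path)
    if not p.is_file():
        return [f"{path}: file not found"]
    if p.stat().st_size == 0:
        return [f"{path}: file is empty"]

    problems = []
    headers = 0
    seq_lines = 0          # sequence lines seen since the last header
    seen_content = False   # becomes True at the first non-blank line

    with p.open(errors="replace") as handle:
        for lineno, raw in enumerate(handle, start=1):
            line = raw.strip()
            if not line:
                continue
            if not seen_content:
                seen_content = True
                if not line.startswith(">"):
                    problems.append(
                        f"line {lineno}: first record does not start with '>'"
                    )
            if line.startswith(">"):
                if headers and seq_lines == 0:
                    problems.append(
                        f"line {lineno}: previous header has no sequence"
                    )
                if len(line) == 1:
                    problems.append(f"line {lineno}: header has no name")
                headers += 1
                seq_lines = 0
            else:
                seq_lines += 1

    if headers == 0:
        problems.append("no header lines starting with '>' were found")
    elif seq_lines == 0:
        problems.append("last header has no sequence")
    return problems
```

Calling `check_fasta("data/test/merged.fasta")` and printing the result would show whether the merged input itself is malformed. If it returns an empty list, the problem is more likely in an intermediate step than in the custom headers.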

cfarkas commented 2 years ago

Hi @Krysasp

Sorry for not being responsive; I was out of town. Thank you for using our repository. From the error `[E::fai_build3_core] Failed to open the file data/bayes/2nd/workdir/covid19-refseq.fasta`, it seems that the covid19-refseq.fasta file cannot be found. Also, can you please share the name of one of the FASTA files that you want to merge, including its header? I suspect a bug in one of the scripts.

Thank you. From now on, I will try to answer ASAP.
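Note that the path in the error (`data/bayes/2nd/workdir/covid19-refseq.fasta`) is not the path passed to `-g` (`data/bayes/covid19-refseq.fasta`). One possible explanation, not confirmed in this thread, is that the relative path is being resolved from a different working directory than expected. A minimal sketch, under that assumption, that turns the inputs into absolute paths and prints the first header so it can be shared here. Both `resolve_inputs` and `first_header` are hypothetical helper names, not part of the repository.

```python
from pathlib import Path


def resolve_inputs(*paths):
    """Return absolute paths for the given files; raise if one is missing.

    Hypothetical helper: absolute paths stay valid even if a script
    changes its working directory before opening them.
    """
    resolved = []
    for p in paths:
        full = Path(p).expanduser().resolve()
        if not full.is_file():
            raise FileNotFoundError(f"input not found: {full}")
        resolved.append(str(full))
    return resolved


def first_header(path):
    """Return the first line starting with '>' (without newline), or None."""
    with open(path, errors="replace") as handle:
        for line in handle:
            if line.startswith(">"):
                return line.rstrip("\n")
    return None
```

For example, `resolve_inputs("data/test/merged.fasta", "data/bayes/covid19-refseq.fasta")` returns the two absolute paths to pass to `-f` and `-g`, and `first_header("data/test/merged.fasta")` gives the header line requested above.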

Carlos