Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Prothint - Error: Inflate error #789

Open tanpham15 opened 8 months ago

tanpham15 commented 8 months ago

Dear authors,

I got an error ("Inflate error") at the ProtHint step. The log stops right after this call:

# Thu Mar 21 12:48:30 2024: Calling prothint.py...
# Thu Mar 21 12:48:30 2024: starting prothint.py
/data/scratch/mpx586/github/gene_predict/ProtHint/bin//prothint.py --threads=2 --geneMarkGtf /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/braker1/GeneMark-ES/genemark.gtf /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/braker1/genome.fa /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/braker1/proteins.fa

Here are the warning messages from the beginning of the run:

 WARNING: empty line was removed! This warning will be supressed from now on!
#*********
# Wed Mar 20 14:18:30 2024: check_fasta_headers(): Checking fasta headers of file /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/orthodb/Arthropoda.fa.gz
#*********
# WARNING: Detected whitespace in fasta header of file /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/orthodb/Arthropoda.fa.gz. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
#*********
# WARNING: Detected | in fasta header of file /data/scratch/mpx586/Batesia_hypochlora/RNA/braker3/orthodb/Arthropoda.fa.gz. This may later on cause problems! The pipeline will create a new file without spaces or "|" characters and a genome_header.map file to look up the old and new headers. This message will be suppressed from now on!
#*********
#*********
 WARNING: empty line was removed! This warning will be supressed from now on!
#*********
# Wed Mar 20 14:18:30 2024: Assuming that this is not a DNA fasta file because other characters than A, T, G, C, N, a, t, g, c, n were contained. If this is supposed to be a DNA fasta file, check the content of your file! If this is supposed to be a protein fasta file, please ignore this message!
# Wed Mar 20 14:18:30 2024: Assuming that this is not a protein fasta file because other characters than AaRrNnDdCcEeQqGgHhIiLlKkMmFfPpSsTtWwYyVvBbZzJjOoUuXx were contained. If this is supposed to be DNA fasta file, please ignore this message.
#*********
# WARNING: something seems to be wrong with the newline character! This is likely to cause problems with the braker.pl pipeline! Please adapt your file to UTF8! This warning will be supressed from now on!

Note:

Could you please take a look and let me know how I can solve this problem? Thank you very much.

tanpham15 commented 7 months ago

I found the issue: I was passing the gzip-compressed file "Arthropoda.fa.gz" as the protein input.

The protein file needs to be decompressed before running.

Please close the issue. Thank you very much.
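
For anyone hitting the same error, here is a minimal sketch of a pre-flight check, assuming the only problem is that the protein FASTA is still gzip-compressed. It looks at the first two bytes of the file (the gzip magic number) and, if they match, writes a decompressed copy next to it. The helper names `is_gzipped` and `ensure_plain_fasta` are hypothetical and are not part of BRAKER or ProtHint.

```python
import gzip
import shutil
from pathlib import Path

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_plain_fasta(path):
    """Return a Path to an uncompressed version of `path`.

    If the file is not gzip-compressed, it is returned unchanged.
    Otherwise a decompressed copy is written next to it
    (e.g. Arthropoda.fa.gz -> Arthropoda.fa) and that copy is returned.
    """
    src = Path(path)
    if not is_gzipped(src):
        return src

    if src.suffix == ".gz":
        dst = src.with_suffix("")  # drop the trailing ".gz"
    else:
        # Compressed content without a ".gz" name: pick a distinct name
        # (an arbitrary choice for this sketch).
        dst = src.with_name(src.name + ".decompressed")

    # Stream the data so large protein databases do not need to fit in memory.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling `ensure_plain_fasta("Arthropoda.fa.gz")` before starting the pipeline and passing the returned path as the protein input (for example via `--prot_seq`) should avoid handing a compressed file to ProtHint. Running `gunzip -k Arthropoda.fa.gz` on the command line achieves the same thing.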