Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Braker3/GenemarkETP evidence.gff not found #632

Open NadiaTamayo15 opened 1 year ago

NadiaTamayo15 commented 1 year ago

Hi, I’m trying to run BRAKER3. I’m running test3.sh and I got this error:


ERROR in file /srv/home/nadia/BRAKER/scripts/braker.pl at line 5473
Failed to execute: /srv/home/nadia/miniconda3/envs/braker_2att/bin/perl /srv/home/nadia/GeneMark-ETP/bin/gmetp.pl --cfg /ahul/home/nadia/braker_test/braker/GeneMark-ETP/etp_config.yaml --workdir /ahul/home/nadia/braker_test/braker/GeneMark-ETP --bam /ahul/home/nadia/braker_test/braker/GeneMark-ETP/etp_data/ --cores 8 --softmask  1>/ahul/home/nadia/braker_test/braker/errors/GeneMark-ETP.stdout 2>/ahul/home/nadia/braker_test/braker/errors/GeneMark-ETP.stderr
The most common problem is an expired or not present file ~/.gm_key or that GeneMark-ETP didn't receive enough evidence from the input data, in this case, see errors/GeneMark-ETP.stderr!

When I look at the GeneMark errors, the error that arises is the following:


error, file/folder not found: evidence.gff

I have BRAKER version 3.0.3 and I installed GeneMark from GitHub.

The BRAKER log is attached: Braker.log.pdf

NadiaTamayo15 commented 1 year ago

I already tried reinstalling GeneMark from source, from the website, and from GitHub, and I downloaded the key multiple times, but it still does not work. I have seen several similar errors with GeneMark. I also ran the program with BRAKER2 and it worked just fine.

KatharinaHoff commented 1 year ago

Have you tried running the container (singularity or docker)? The GeneMark-ETP version in the container is different from the GitHub version.

AdamStuckert commented 1 year ago

I am having this same issue. On my system I have tracked it down to a ProtHint error. ProtHint relies on the attribute sched_getaffinity of the os module. However, this seems to be a platform-dependent attribute (see: https://stackoverflow.com/questions/42538153/python-3-6-0-os-module-does-not-have-sched-getaffinity-method). I have not found a workaround yet, but wanted to update this issue thread for other folks.

Edit: I am using the conda distribution of Braker3, and followed installation instructions. I have determined that other versions of python contain the appropriate attribute, but have not attempted to downgrade python yet.
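A quick way to compare environments is to print the interpreter implementation next to the attribute check. The following is only a diagnostic sketch; `interpreter_report` is a hypothetical helper name and is not part of ProtHint or BRAKER.

```python
import os
import platform


def interpreter_report():
    """Summarise the running interpreter and whether os.sched_getaffinity exists."""
    return {
        # platform.python_implementation() returns e.g. "CPython" or "PyPy".
        "implementation": platform.python_implementation(),
        "version": platform.python_version(),
        "has_sched_getaffinity": hasattr(os, "sched_getaffinity"),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running this once in each conda environment shows whether the two interpreters differ in implementation, in version, or in both.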

For clarity, this is an older version of python I have installed in another environment:

(base) python
Python 3.9.12 (main, Apr  5 2022, 06:56:58) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.sched_getaffinity()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: posix.sched_getaffinity() takes exactly one argument (0 given)

And the python distribution within the braker3 recipe:

(base) conda activate braker3
(braker3) python
Python 3.9.16 | packaged by conda-forge | (feeb267e, Jan 18 2023, 16:12:16)
[PyPy 7.3.11 with GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>> import os
>>>> os.sched_getaffinity()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'os' has no attribute 'sched_getaffinity'

KatharinaHoff commented 1 year ago

Please do not use the current GeneMark-ETP from GitHub. For some reason, it is not the version that BRAKER requires. Please use Docker or Singularity because the working ETP version is in there. If you do not have container systems, look at the Dockerfile. There, you find our source of GeneMark-ETP. https://github.com/Gaius-Augustus/BRAKER/blob/master/Dockerfile line 200.

On Mon, Jun 26, 2023 at 6:04 PM Adam Stuckert wrote:

> I am having this same issue. On my system I have tracked down the issue to a ProtHint error. ProtHint relies on an attribute sched_getaffinity within the os module. However, this seems to be a platform dependent attribute (see: https://stackoverflow.com/questions/42538153/python-3-6-0-os-module-does-not-have-sched-getaffinity-method). I have not found a workaround yet, but wanted to update this issue thread for other folks.


AdamStuckert commented 1 year ago

> Please do not use the current GeneMark-ETP from GitHub. For some reason, it is not the version that BRAKER requires. Please use Docker or Singularity because the working ETP version is in there. If you do not have container systems, look at the Dockerfile. There, you find our source of GeneMark-ETP. https://github.com/Gaius-Augustus/BRAKER/blob/master/Dockerfile line 200.

I have installed GeneMark-ETP from the source file you indicated, and it yields the same result. I'm fairly certain that the Python version contained in the anaconda recipe is causing this issue, as it does not contain the sched_getaffinity attribute, and other users are likely to have this issue.
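One possible workaround on the Python side, assuming the caller only needs a usable CPU count, is to fall back to `os.cpu_count()` when `os.sched_getaffinity` is missing. This is a sketch of the general pattern, not ProtHint's actual code; `available_cpus` is a hypothetical name.

```python
import os


def available_cpus():
    """Return how many CPUs this process may use, with a portable fallback."""
    get_affinity = getattr(os, "sched_getaffinity", None)
    if get_affinity is not None:
        # Passing 0 asks about the current process.
        return len(get_affinity(0))
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(available_cpus())
```

Note that the fallback reports all CPUs on the machine rather than the subset assigned to the process, so the two branches can disagree under a job scheduler or CPU pinning.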

KatharinaHoff commented 1 year ago

We are not in charge of the conda recipe. However, I am glad you figured this out. @npavlovikj, are you still maintaining the conda recipe, and could you temporarily fix this by pinning the named Python version?

In addition, it would be great if you, @AdamStuckert , notified Alexandre Lomsadze and Mark Borodovsky at Georgia Tech via e-mail that ProtHint should be fixed to work with different python versions. I am not sure whether they see this issue, and they are the only ones who can fix it on the ProtHint end.

npavlovikj commented 1 year ago

@KatharinaHoff, bioconda is a community-based project, so anyone can add or modify a recipe; it doesn't need to be the person who created it.

There is no Python version pinned in the recipe itself, https://github.com/bioconda/bioconda-recipes/blob/master/recipes/augustus/meta.yaml, so conda installs whichever Python version its solver selects for the environment. One can pin a specific version that works when installing augustus: conda create -n augustus augustus=3.5.0 python=3.6

Thank you, Natasha

Rio-Kashimoto commented 1 year ago

Hello. I am having the same issue as above. I installed BRAKER3 through conda and followed the ETP installation as @KatharinaHoff mentioned. However, the error message still shows: error, file/folder not found: evidence.gff.

I looked in my output folder and could not find evidence.gff in /braker/GeneMark-ETP/rnaseq/hints/proteins.fa/prothint. My prothint folder contained:

cmd.log         prot13q3vg39  protjiwfnqhf  protm2ly4f6j  protplcg6_5u  seed_proteins.faa
diamond         prot86ac9v26  protkiyay_g6  protn7wmhbvd  protupq9ge6x
gene_stat.yaml  protg6gr45y5  protlkk8rxyd  protoug2kv7m  protvut2kv77

I also could not find the other output files listed below, which are defined inside the gmetp.pl script:

my $hcc_genes = "$workdir/rnaseq/hints/$proc/complete.gtf";
my $hcp_genes = "$workdir/rnaseq/hints/$proc/incomplete.gtf";
my $rnaseq_hints = "$workdir/rnaseq/hints/hintsfile_merged.gff";
my $prothint_evi = "$workdir/rnaseq/hints/$proc/prothint/evidence.gff";
my $prothint_hints = "$workdir/rnaseq/hints/$proc/prothint/prothint.gff";

I changed all of the initial perl and python paths (shebang lines) for ETP, including gmetp.pl, from #!/usr/bin/perl to #!/home/.conda/envs/braker3/bin/perl. However, the problem is not solved yet. Could you tell me how I can get the evidence.gff file? At which step does the error occur? I appreciate your support and advice.
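To see which of these files a run actually produced, the paths quoted from gmetp.pl can be checked in one pass. This is a minimal sketch: `missing_outputs` is a hypothetical helper, and the `proc` argument stands for the value of `$proc`, which in the folder path above appears to be `proteins.fa`.

```python
from pathlib import Path

# Relative paths copied from the gmetp.pl variables quoted in this comment.
# "{proc}" stands for the value of $proc in that script.
EXPECTED = (
    "rnaseq/hints/{proc}/complete.gtf",
    "rnaseq/hints/{proc}/incomplete.gtf",
    "rnaseq/hints/hintsfile_merged.gff",
    "rnaseq/hints/{proc}/prothint/evidence.gff",
    "rnaseq/hints/{proc}/prothint/prothint.gff",
)


def missing_outputs(workdir, proc):
    """Return the expected files that do not exist under workdir, in order."""
    root = Path(workdir)
    paths = [root / template.format(proc=proc) for template in EXPECTED]
    return [str(path) for path in paths if not path.is_file()]


if __name__ == "__main__":
    # Example call; adjust the working directory to your own run.
    for path in missing_outputs("braker/GeneMark-ETP", "proteins.fa"):
        print("missing:", path)
```

The first missing file in the list gives a rough idea of how far the pipeline got before it stopped.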

DustinSokolowski commented 1 year ago

Hey everyone,

Thank you for the helpful discussions. I encountered this error while trying to annotate a number of species and I'm wondering if this is a specific issue with the protein-only module.

I ran BRAKER with default parameters plus protein hints from ProtHint and got the same error:

[image]

In another species, where I have RNA-seq data, BRAKER ran perfectly fine.

[image]

With this in mind, I wonder if this is a simple data transformation issue, where combining the RNA-seq and protein hints created a GFF file that GeneMark is happy with?

Please find the hints GFF files from the two species (one with RNA-seq, one without) attached: Braker_issue.zip

Best, Dustin

mscharmann commented 1 year ago

Hi, I just discovered another case where this same error is thrown, but for a rather simple and completely unrelated reason: my proteins.fa was gzipped. Make sure to decompress the protein file before starting BRAKER. Perhaps a simple sanity check early in the pipeline could help? Cheers, Mathias
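Such a sanity check could be as small as reading the first two bytes of the file, since gzip streams start with the magic bytes 0x1f 0x8b. The sketch below is an illustration only; `looks_gzipped` is a hypothetical name and nothing like it is confirmed to exist in BRAKER.

```python
import sys


def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


if __name__ == "__main__":
    # Usage: python check_gzip.py proteins.fa
    for name in sys.argv[1:]:
        if looks_gzipped(name):
            print(f"{name}: appears to be gzip-compressed, decompress it first")
        else:
            print(f"{name}: no gzip header found")
```

Checking the content rather than the file extension also catches compressed files that were renamed to end in .fa.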

jessiepelosi commented 10 months ago

I was having the same problem as well, with

error, file/folder not found: evidence.gff

in the GeneMark-ETP.stderr file. This ended up being an issue with my input protein file: I had to run gmetp.pl in GeneMark-ETP outside of BRAKER to find that the error came from "." characters in published proteome data. The error that GeneMark-ETP wrote to stdout was not informative, and I had to do some digging.
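A scan like the following could surface such characters before a long run. It is a sketch under a stated assumption: the allowed alphabet here (the 20 standard amino-acid letters plus X) is a guess, because this thread does not say which characters ProtHint or GeneMark-ETP accept. `suspicious_records` is a hypothetical helper.

```python
# Assumption: the 20 standard amino-acid letters plus X are acceptable.
# The alphabet the tools actually accept is not stated in this thread.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYX")


def suspicious_records(fasta_path, allowed=ALLOWED):
    """Map each record ID to the sorted characters that fall outside `allowed`."""
    problems = {}
    current = None  # stays None for sequence lines that precede any header
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                fields = line[1:].split()
                current = fields[0] if fields else ""
                continue
            bad = set(line.upper()) - allowed
            if bad:
                problems.setdefault(current, set()).update(bad)
    return {record: sorted(chars) for record, chars in problems.items()}
```

Calling `suspicious_records("proteins.fa")` returns an empty dict when every sequence line stays within the assumed alphabet; otherwise it names the records to fix or remove.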

ColinR01 commented 7 months ago

Based on experience, it is likely that the format of the input protein sequences (pep) is incorrect, as shown in the picture below. Fix the affected sequences (for example, by deleting them) and try again.

[image: upload did not complete]

Best, ColinR01

juliadouglasf commented 6 months ago

@ColinR01, you linked to this issue

juliadouglasf commented 6 months ago

@NadiaTamayo15, @Rio-Kashimoto, @DustinSokolowski, did any of you fix the issue of error, file/folder not found: evidence.gff? It is possible there is an issue with my input protein file, as Jessie and Colin have mentioned, but if there is, I don't know what it is. I ran GeneMark-ETP outside of BRAKER and got the same error. I cloned the GeneMark-ETP version suggested by @KatharinaHoff and still got the same error.

Rio-Kashimoto commented 6 months ago

> @NadiaTamayo15, @Rio-Kashimoto, @DustinSokolowski, did any of you fix the issue of error, file/folder not found: evidence.gff? It is possible there is an issue with my input protein file, as Jessie and Colin have mentioned, but if there is, I don't know what it is. I ran GeneMark-ETP outside of BRAKER and got the same error. I cloned the GeneMark-ETP version suggested by @KatharinaHoff and still got the same error.

I resolved this error by ensuring that the IDs in the mapped RNA-seq (BAM) file match the scaffold IDs in the soft-masked genome FASTA file.
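A mismatch of this kind can be checked by comparing the FASTA header IDs with the reference names in the alignment header. The sketch below uses only the standard library and assumes the header text was obtained separately, for example with `samtools view -H` on the BAM file; the function names are hypothetical.

```python
def fasta_ids(fasta_path):
    """Collect sequence IDs: the text after '>' up to the first whitespace."""
    ids = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def sam_reference_names(header_text):
    """Collect the SN: values from the @SQ lines of a SAM header."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def unmatched_references(fasta_path, header_text):
    """Reference names in the alignment header that are absent from the FASTA."""
    return sorted(sam_reference_names(header_text) - fasta_ids(fasta_path))
```

An empty result means every reference name in the alignment header also appears as a scaffold ID in the genome file; any names returned point to the renamed or missing scaffolds.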

juliadouglasf commented 6 months ago

Thanks for your response, @Rio-Kashimoto. That doesn't appear to be my issue, but all these responses do point to there being some problem with file format.