endixk closed this issue 2 years ago
I got the same error as the funannotate developer. It seems to be related to the newer GCC/Ubuntu version: it does not work on Ubuntu 22.04, at least when compiled from source. I found this out while using BUSCO.
@KatharinaHoff @MarioStanke May I ask you to take a look? I would assume a required library changed slightly.
I just tried the following with the current version of Augustus and it worked:
cd Augustus/docs/tutorial/data
msa2prfl.pl --prefix_from_seqnames --max_entropy=0.75 --blockscorefile=PF00225_seed.blocks.txt PF00225_seed.txt > PF00225_seed.prfl
fastBlockSearch --cutoff=1.1 chr4.103M.fa PF00225_seed.prfl
I need more information to reproduce the problem and then try to fix it: Please make the files and command lines that produced the input also available.
I have the same problem with the conda installation of Augustus.
My command is
augustus --codingseq=1 --proteinprofile=28538at7147.prfl --predictionStart=18091799 --predictionEnd=18101912 --species=fly NT_033777.3.temp
and the error is
augustus: ERROR
PP::Profile: Error parsing pattern file"28538at7147.prfl", line 8.
As in the case above (https://github.com/Gaius-Augustus/Augustus/issues/346#issue-1288236984) this was the line following a [dist] block. Once I removed that block, a new error pointed to the next [dist] block. After removing all the [dist] sections in the profile file the command worked. I attach both file versions, 28538at7147_problem.prfl and 28538at7147_ok.prfl.
Additional info (may or may not be helpful):
I tried multiple conda build versions of v3.4.0 and also v3.3.3 and got the same error. Curiously, I had previously installed build version augustus-3.4.0-pl5321h877ab46_5 back in March and this installation worked fine. When I re-installed this version in a new environment today it failed.
Also of interest is this issue: https://github.com/nextgenusfs/funannotate/issues/724 It seems that the issue is very similar and was only reported in May.
The error above is also being reported by BUSCO users:
@LarsGab It looks like this exception is thrown in Profile::parse_stream. Can you please take this up?
Hi,
I tried to reproduce this error with the latest version of Augustus from GitHub and the data provided by @berkelem. I ran Augustus on two different machines with different versions of Ubuntu and gcc, and it worked fine in both cases. Have you tried running it with Augustus from GitHub? Otherwise, it might be a problem with the Augustus version uploaded to Bioconda. Best, Lars
I used the GitHub version. To be more precise:
git clone https://github.com/Gaius-Augustus/Augustus.git /opt/mosga/tools/augustus
cd /opt/mosga/tools/augustus/
git checkout b69e6bccfd46b4c7452407aafb2d6a6077e60ab8
The problem has been circumvented for me since BUSCO 5 switched to MetaEuk instead of using Augustus. That's why I, unfortunately, can not provide more information to reproduce the issue, and it appeared in an intermediate development step. Usual Augustus executions run fine.
Yes the Github version seems to be fine, but the Bioconda version is causing problems for BUSCO. Most users use either the Conda or Docker distributions of BUSCO and both rely on the Augustus version on Bioconda for the Augustus pipeline. Can you reproduce the error with conda?
In my case, I had the issue WITH the GitHub version of BUSCO and Augustus, without any conda environment. On an Ubuntu 22.04 system, install BUSCO 4 and the Augustus GitHub version mentioned above, and install all required libraries from apt and cpan. That should reproduce the situation.
@berkelem
"Most users use either the Conda or Docker distributions of BUSCO and both rely on the Augustus version on Bioconda for the Augustus pipeline."
Is there any evidence for that, since multiple people have detected the issue?
I encountered a similar problem running Augustus with BUSCO, evidently caused by a change in the behavior of std::ws in new versions of libstdc++. It seems that std::ws now sets the failbit if the eofbit is already set.
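To see which behavior a given toolchain has, the stream flags can be inspected directly. The following is a minimal sketch, not Augustus code: `probe_ws` and `WsState` are hypothetical names, and `int` stands in for Augustus's `DistanceType`, whose `operator>>` is not shown in this thread.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Stream flags observed around `strm >> std::ws`.
struct WsState {
    bool eof_before_ws;  // eofbit already set by the value extraction alone
    bool eof_after_ws;   // eofbit after skipping trailing whitespace
    bool fail_after_ws;  // failbit after skipping trailing whitespace
};

// Hypothetical helper: read one int (stand-in for DistanceType), then apply std::ws.
WsState probe_ws(const std::string& text) {
    std::istringstream strm(text);
    int value = 0;
    strm >> value;

    WsState st{};
    st.eof_before_ws = strm.eof();
    strm >> std::ws;
    st.eof_after_ws = strm.eof();
    st.fail_after_ws = strm.fail();
    return st;
}
```

For the input `"5"` the extraction itself reaches the end of the buffer, so `eof_before_ws` is true; whether `fail_after_ws` is then also true is exactly the library-dependent part described above. For `"5 "` the whitespace is still pending, so `std::ws` runs on a good stream and should not set the failbit under either behavior.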
I was using Augustus 3.2.3, but it looks like the code still expects the old behavior on the master branch. I was able to fix the problem with a patch like this:
diff --git a/src/pp_profile.cc b/src/pp_profile.cc
index ce9613f1..f0f60610 100644
--- a/src/pp_profile.cc
+++ b/src/pp_profile.cc
@@ -672,8 +672,10 @@ void Profile::parse_stream(istream & strm) {
         // read in the allowed distance range
         istringstream lstrm(readAndConcatPart(strm, type, lineno));
         DistanceType addDist;
-        if(!(lstrm >> addDist >> ws && lstrm.eof()))
-            throw ProfileParseError(lineno - newlinesFromPos(lstrm.str(), lstrm.tellg()) -1);
+        lstrm >> addDist;
+        if (!(lstrm.eof() || lstrm >> ws)) {
+            throw ProfileParseError(lineno - newlinesFromPos(lstrm.str(), lstrm.tellg()) -1);
+        }
         finalDist += addDist;
     } else // if dist is not specified, assume arbitrary distance
         finalDist.setInfMax();
I think the logic here should work for either behavior of std::ws, but admittedly I haven't tested carefully.
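The condition from the patch can be tried in isolation. This is a rough sketch under stated assumptions: `patched_check_accepts` is a hypothetical name, `int` replaces `DistanceType`, and the function returns true where the patched code would not throw `ProfileParseError`.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Mirrors `lstrm >> addDist; if (!(lstrm.eof() || lstrm >> ws)) throw ...`
// from the patch, with int standing in for DistanceType (an assumption).
bool patched_check_accepts(const std::string& text) {
    std::istringstream lstrm(text);
    int addDist = 0;
    lstrm >> addDist;
    // eof() short-circuits, so std::ws only runs on a stream that has not
    // reached the end yet; its behavior at EOF no longer matters.
    return lstrm.eof() || static_cast<bool>(lstrm >> std::ws);
}
```

With this stand-in type, `"5"` and `"5 "` are both accepted, and `"abc"` is rejected because the failed extraction leaves the failbit set before `std::ws` is reached.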
Thanks, Andrew. That may explain why the problem came up recently and I couldn't reproduce it before upgrading my computer. Thanks for the code. Lars, I reproduced the problem on Ubuntu 22.04 on my laptop and on cs3 with the current master branch. Can you please first reproduce and fix it?
Thanks a lot, Andrew! You pointed me in the right direction. I was able to reproduce the error on our cluster and indeed std::ws is the problem, as Andrew explained. Your solution fixes the issue of incorrectly raising the ProfileParseError, but it doesn't catch incorrectly formatted distance intervals.
Removing std::ws from the original if clause seems to fix the problem, and the error is still handled as intended.
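A minimal sketch of what the original condition looks like with std::ws removed, under the same caveats as before: the exact change in the pull request is not shown in this thread, `check_without_ws_accepts` is a hypothetical name, and `int` stands in for `DistanceType`.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Original condition minus std::ws: `lstrm >> addDist && lstrm.eof()`.
// Returns true where the code would NOT throw ProfileParseError.
// int is a stand-in for DistanceType (an assumption).
bool check_without_ws_accepts(const std::string& text) {
    std::istringstream lstrm(text);
    int addDist = 0;
    // Accept only if the extraction succeeded and consumed the whole input.
    return static_cast<bool>(lstrm >> addDist) && lstrm.eof();
}
```

Here `"5"` is accepted, while `"5 abc"` and `"abc"` are rejected, so trailing garbage is still caught. With `int` as the stand-in, trailing whitespace such as `"5 "` is rejected as well; whether that can occur in Augustus depends on how `readAndConcatPart` and `DistanceType`'s `operator>>` treat whitespace, which this thread does not show.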
I have created a pull request addressing the problem.
Thanks for addressing this issue! Can you make a new conda build with this fix?
Hi, I encountered an error when I provide a protein profile to the program.
Running
fastBlockSearch <seq> <prfl>
gives this message:
Running
augustus --species=<species> --proteinprofile=<prfl> <seq>
gives this:
I found this kind of information block from the corresponding line:
After removing all of this [dist] information from the profile, the program ran without an error. Nevertheless, I do want to include this information, which might be non-negligible in some cases.
I didn't experience this kind of problem with previous builds of augustus; e.g., conda build 3.4.0 pl5262h5a9fe7b_2 runs without an error on the same input files.
I would much appreciate it if you could take a quick look and hopefully solve the issue soon.
Thanks!
Daniel