Closed mbrockman1 closed 6 years ago
Thanks for your detailed bug-report!
Can you make the files available to me for debugging? They don't need to be on GitHub in case the data is sensitive, but it would really help! This is my email address: hannes.hauswedell@fu-berlin.de
I used the master branch to build from source.
I am using a list of sequences that contain sensitive data. It seems as if one sequence is messing it up for the rest. I need to investigate further.
The database I used was Uniprot Sprot 20171205.
Maybe it is because I am using a significantly smaller database than the usual, larger ones.
No, I am pretty sure it's a bug in my routines for compressing the lca-tree. I will be on vacation for the next two weeks, but I will look into it immediately after that. Sorry it has taken me so long to respond to this.
I have reproduced the issue and this patch fixes it for me:
--- a/src/search_algo.hpp
+++ b/src/search_algo.hpp
@@ -1387,7 +1387,7 @@ _writeRecord(TBlastRecord & record,
         record.lcaTaxId = 0;
         for (auto const & bm : record.matches)
         {
-            if (length(lH.gH.sTaxIds[bm._n_sId]) > 0)
+            if ((length(lH.gH.sTaxIds[bm._n_sId]) > 0) && (lH.gH.taxParents[lH.gH.sTaxIds[bm._n_sId][0]] != 0))
             {
                 record.lcaTaxId = lH.gH.sTaxIds[bm._n_sId][0];
                 break;
@@ -1397,7 +1397,7 @@ _writeRecord(TBlastRecord & record,
         if (record.lcaTaxId != 0)
             for (auto const & bm : record.matches)
                 for (uint32_t const sTaxId : lH.gH.sTaxIds[bm._n_sId])
-                    if (sTaxId != 0) // TODO do we want to skip unassigned subjects
+                    if (lH.gH.taxParents[sTaxId] != 0) // TODO do we want to skip unassigned subjects
                         record.lcaTaxId = computeLCA(lH.gH.taxParents, lH.gH.taxHeights, sTaxId, record.lcaTaxId);
         record.lcaId = lH.gH.taxNames[record.lcaTaxId];
Could you give it a try and report back?
Thanks a lot!
This is now also in the master branch and will soon be released as 1.9.5. Please re-open this issue if it still doesn't work for you!
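For anyone reading along later, here is a rough standalone sketch of what the patched loop does. It is not the actual Lambda code: `lcaOfMatches` and the body of `computeLCA` are hypothetical stand-ins written for illustration, and it assumes that a parent entry of 0 marks a tax ID that is not present in the compressed tree, that the root is its own parent, and that `heights` stores the distance from the root.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for computeLCA (not the project's implementation):
// repeatedly move the node that is further from the root up to its parent
// until both nodes meet.
uint32_t computeLCA(std::vector<uint32_t> const & parents,
                    std::vector<uint32_t> const & heights,
                    uint32_t n1, uint32_t n2)
{
    assert(n1 < parents.size() && n2 < parents.size());
    while (n1 != n2)
    {
        if (heights[n1] >= heights[n2])
            n1 = parents[n1];
        else
            n2 = parents[n2];
    }
    return n1;
}

// Hypothetical helper mirroring the two patched loops: sTaxIds holds one
// list of tax IDs per match; IDs whose parent entry is 0 are skipped.
uint32_t lcaOfMatches(std::vector<uint32_t> const & parents,
                      std::vector<uint32_t> const & heights,
                      std::vector<std::vector<uint32_t>> const & sTaxIds)
{
    uint32_t lca = 0; // 0 means "no LCA assigned"

    // Seed with the first match whose first tax ID is still in the tree.
    for (auto const & ids : sTaxIds)
    {
        if (!ids.empty() && parents[ids[0]] != 0)
        {
            lca = ids[0];
            break;
        }
    }

    // Fold every remaining in-tree tax ID into the running LCA.
    if (lca != 0)
        for (auto const & ids : sTaxIds)
            for (uint32_t const id : ids)
                if (parents[id] != 0)
                    lca = computeLCA(parents, heights, id, lca);

    return lca;
}
```

With a toy tree where node 1 is the root, 2 and 3 are its children, 4 and 5 are children of 2, and 6 has a parent entry of 0, matches carrying the IDs {6}, {4} and {5, 6} yield 2, because 6 is ignored in both loops. In this sketch, passing an ID like 6 to `computeLCA` would never terminate, which is why both loops check the parent entry first.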
Versions:
After running:
I receive: