iqbal-lab-org / gramtools

Genome inference from a population reference graph
MIT License
92 stars 15 forks source link

bidir_search_fwd is broken #33

Closed iqbal-lab closed 6 years ago

iqbal-lab commented 8 years ago

OK, I know we don't use this function right now, so I'm raising this bug as a placeholder for when we need it.

First, I note that in bidir_search_fwd() as it currently stands, we do not clear allele_empty after passing it (by reference) into get_location, whereas we do in _bwd. Second, I note that sites() ends up with far too many entries in it, including the same site and allele more than once. Third, when you add empty_allele.clear() in the obvious places (mirroring what is done in _bwd() ), we get segfaults, and notice this:

`

2 0x0000000000437e82 in BackwardSearchTest_OneSNP_Test::TestBody (this=0x6d4840) at ./test/unittest_bidir_search_bwd_fwd.cpp:157

157 EXPECT_EQ(sites.front().front().second.front(), 1); (gdb) l 152 no_occ=(_sa_intervals.begin()).second-(_sa_intervals.begin()).first; 153 EXPECT_EQ(true,first_del); 154 EXPECT_EQ(1,sa_intervals.size()); 155 EXPECT_EQ(no_occ,1); 156 EXPECT_EQ(sites.front().front().first, 5); 157 EXPECT_EQ(sites.front().front().second.front(), 1); 158 EXPECT_EQ(sites.front().front().second.size(), 1); 159 160 161 sa_intervals.clear(); (gdb) p sites $1 = std::list = {[0] = std::vector of length 2, capacity 2 = {{first = 5, second = std::vector of length 0, capacity 0}, {first = 5, second = std::vector of length 1, capacity 1 = {0}}}}

`

Note the allele number {0} at the end - this must be wrong as we are indexing alleles 1-based, and all our _bwd() unit tests are passing with correct storing of alleles 1,2,.. etc in the sites object.

This is all true at commit 318a43b5656560fa65fdcc538a7ea5f7ea8f318b (which does not have the allele_empty.clear() added on the two lines after get_location is called).

iqbal-lab commented 8 years ago

As a result of this, I've commented out any _fwd() tests in the unit tests for now at commit bc842fa6103bde3efe4b9a0ee747e15ad17626dd

sm0179 commented 8 years ago

Yes, I'm not surprised. I did all the debugging on bidir_search_bwd and might have forgotten to mirror all the bugfixes or mirrored them blindly without considering the differences in forward search. For example, we need to keep in mind that in a forward search we are adding sites in reverse order compared to backward search. So I think back and front should be swapped in all aassertions

iqbal-lab commented 8 years ago

Yes I put the assertions swapped - and that works - ie it correctly finds the sites in the other order. So it's probably a small thing about site adding. One thing that is not clear to me is hoe lef/tright and left_rev/right_rev are supposed to work in the _fwd function.

sm0179 commented 8 years ago

[left,right) is the interval in the suffix array of the reverse string (the one the forward search is done on) and [left_rev,right_rev) is the corresponding interval in the suffix array of the normal string. hmm maybe notation is confusing

iqbal-lab commented 8 years ago

I just wasn't sure what left_new and left_rev new meant. I think I just need to sit with a clear head

ffranr commented 6 years ago

Issue no longer needed.