Closed BenWiederhake closed 11 months ago
Where did you encounter those nested buffers? The `Reader` class should only ever return unnested buffers from `read()`, so the user should never see them?
Nested buffers occur internally, so this affects the running time even if the user's code doesn't have direct access to them.
Checking for errors makes even more sense in this case. The user shouldn't be able to trigger this issue even if they wanted to, and the checks make sure that no data is lost internally (e.g. by new code).
I personally encountered these because `osmium::io::detail::PBFDataBlobDecoder::operator()` returns deeply nested buffers, and that's what the random access code uses / will use.
There are two issues here which we should not mix up:
`begin()` can throw. So I am not too happy about that. And because this is internal to libosmium anyway, an assert could be a better solution. But this is something I have to look into.
In any case these changes are orthogonal to all the other work you are doing, right?
`TEST_CASE("Can quickly handle deeply nested buffer")` in `test_buffer_nested.cpp` proves that this change does speed things up, so I believe it's worth improving the code in this way.
Changes since last push:
Changes since last push summary:
`buffer.written() == 1360` to `buffer.written() >= 1360 && buffer.written() <= 1440`, since apparently Windows writes 8 bytes more per node. This used to break windows-minimal-2019 and windows-minimal-2022.
CI failure seems to be a flake: It fails long before the code of this PR becomes relevant, specifically during package installation.
Closing because you don't seem to accept any PRs at this time.
This PR:
`Buffer::get_last_nested()`. This method used to be accidentally quadratic: each invocation takes time linear in the nesting depth, and it has to be invoked a linear number of times to iterate over all buffers. This means an overall quadratic running time, even though one would expect a linear running time. I also added a test that is very sensitive to this (4 s versus <0.01 s). When iterating over the entire planet, I observe a speedup from 163.183 s to 160.942 s. (With an observed stddev of 0.836 s, this result has 2.68 sigma. Physicists would laugh at that, but it's good enough proof for me, in this case.)
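To illustrate the accidentally-quadratic pattern described above, here is a minimal sketch on a toy model. It is not libosmium's code and not necessarily how this PR fixes the problem: `ToyBuffer`, `make_chain`, `detach_last`, `drain_quadratic` and `drain_linear` are all hypothetical names made up for this example, and the `steps` counter exists only to make the cost visible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy stand-in for a buffer that owns at most one nested buffer.
struct ToyBuffer {
    int id = 0;
    std::unique_ptr<ToyBuffer> nested;
};

// Build the chain 0 -> 1 -> ... -> depth, where 0 is the outermost buffer.
std::unique_ptr<ToyBuffer> make_chain(int depth) {
    std::unique_ptr<ToyBuffer> head;
    for (int id = depth; id >= 0; --id) {
        std::unique_ptr<ToyBuffer> b(new ToyBuffer());
        b->id = id;
        b->nested = std::move(head);
        head = std::move(b);
    }
    return head;
}

// Detach the innermost buffer by walking from the head: O(depth) per call.
std::unique_ptr<ToyBuffer> detach_last(ToyBuffer& head, std::size_t& steps) {
    assert(head.nested);
    ToyBuffer* cur = &head;
    while (cur->nested->nested) {
        cur = cur->nested.get();
        ++steps;
    }
    return std::move(cur->nested);
}

// Quadratic drain: every call re-walks whatever is left of the chain.
std::vector<int> drain_quadratic(ToyBuffer& head, std::size_t& steps) {
    std::vector<int> order;
    while (head.nested) {
        order.push_back(detach_last(head, steps)->id);
    }
    return order;
}

// Linear drain: unlink the chain in one pass, then emit innermost first.
std::vector<int> drain_linear(ToyBuffer& head, std::size_t& steps) {
    std::vector<std::unique_ptr<ToyBuffer>> parts;
    std::unique_ptr<ToyBuffer> cur = std::move(head.nested);
    while (cur) {
        std::unique_ptr<ToyBuffer> next = std::move(cur->nested);
        parts.push_back(std::move(cur));
        cur = std::move(next);
        ++steps;
    }
    std::vector<int> order;
    for (auto it = parts.rbegin(); it != parts.rend(); ++it) {
        order.push_back((*it)->id);
    }
    return order;
}
```

Both drains return the nested buffers in the same order (innermost first). For a chain with n nested buffers, the first variant performs n(n-1)/2 pointer hops in total, while the second performs n.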