zerebubuth / planet-dump-ng

Converts an OpenStreetMap database dump into planet files.
BSD 2-Clause "Simplified" License

Exception dumping recent planets #25

Open tomhughes opened 2 years ago

tomhughes commented 2 years ago

Planet has failed two weeks in a row now and this week I caught the output and it is throwing an exception dumping the relations to the PBF output:

Writing relations...
EXCEPTION: writer_thread(2): pbf_writer.cpp(189): Throw in function void pbf_writer::pimpl::write_blob(const google::protobuf::MessageLite&, const string&)
Dynamic exception type: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::runtime_error> >
std::exception::what: Unable to write block of type OSMData, uncompressed size 33630260 because it is larger than the maximum allowed 33554432.

. Trying to continue...
EXCEPTION: pbf_writer.cpp(189): Throw in function void pbf_writer::pimpl::write_blob(const google::protobuf::MessageLite&, const string&)
Dynamic exception type: boost::wrapexcept<std::runtime_error>
std::exception::what: Unable to write block of type OSMData, uncompressed size 33723897 because it is larger than the maximum allowed 33554432.

Not sure if this is an issue in protozero (maybe @joto can help? we could rebuild against a newer version?) or whether planet-dump-ng is feeding it things that are too large.

tomhughes commented 2 years ago

Oh scratch that - that's the pbf_writer.cpp in this repository, not the protozero one ;-)

zerebubuth commented 2 years ago

Wow. I guess there's a bug in there that's massively undercounting or underestimating the byte count for relations. Short term, you could try reducing the constant in https://github.com/zerebubuth/planet-dump-ng/blob/master/src/pbf_writer.cpp#L107 to 0.125 times the max block size. It will be less efficient, but might allow the dump to proceed.

Longer term, I can look at why the relation handling isn't accurately estimating the block size.
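The suggested workaround amounts to flushing the current block well before the hard 32 MiB (33554432-byte) PBF blob limit, so that under-estimated relation sizes still fit. A minimal sketch of that idea, with hypothetical names (the real constant lives in src/pbf_writer.cpp around line 107):

```cpp
#include <cassert>
#include <cstddef>

// 32 MiB: the maximum uncompressed PBF blob size seen in the error message.
constexpr std::size_t MAX_BLOCK_SIZE = 33554432;

// Flush the current pblock once the *estimated* size crosses a fraction of
// the hard limit. Lowering the factor (e.g. to 0.125) leaves more headroom
// for under-estimation, at the cost of smaller, less efficient blocks.
bool should_flush(std::size_t estimated_size, double factor = 0.125) {
  return estimated_size >=
         static_cast<std::size_t>(factor * MAX_BLOCK_SIZE);
}
```

With factor 0.125 the flush threshold is 4194304 bytes, so even an estimate that is off by several multiples stays under the 33554432-byte ceiling.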


tomhughes commented 2 years ago

I've applied that change locally for now so hopefully next week will go better...

tomhughes commented 2 years ago

Looks like that worked and this week's dump is in the process of being published now.

tomhughes commented 2 years ago

Of course bumping to 1.2.4 undid that local hack and it failed again this week :-(

tomhughes commented 2 years ago

It failed again this week even with the modified limit...

zerebubuth commented 2 years ago

Urgh. Crap. Sorry about that. I've pushed v1.2.5, which should have the lower limit plus also a reduced recheck time. I'm hoping that helps, and I'll try to repro locally again.

tomhughes commented 2 years ago

Thanks. I've deployed that now.

zerebubuth commented 2 years ago

Thanks for your patience! I think I found what was causing the issue: relation 6677259 is very large (some versions have more than 25k members) and has many versions (about 440), so its history alone was enough to overflow the pblock between checks of the current size.

I think this might also explain the intermittent failure, as it's possible that the recheck might have happened in the "right" place and split the history into two blocks, or in the "wrong" place and tried to collect it all into one block.

I've implemented an approximate size counter for the relation pgroups, which acts as an additional trigger for a re-check of the pblock size. On my local machine, this allowed dumping the 2022-09-12 planet to completion.
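A rough sketch of what such an approximate counter could look like (hypothetical names and per-member byte cost; the real implementation in planet-dump-ng derives its estimate from the protobuf encoding):

```cpp
#include <cassert>
#include <cstddef>

// Keep a cheap running estimate of bytes added to the current pgroup and
// force an early re-check of the pblock size once it crosses a budget,
// rather than only re-checking every N elements. This catches the case
// where one huge relation history overflows the block between checks.
struct approx_size_counter {
  std::size_t approx_bytes = 0;
  std::size_t budget;

  explicit approx_size_counter(std::size_t b) : budget(b) {}

  // Assumed cost model: a small fixed overhead per relation version plus
  // ~2 bytes per delta-coded member ref/role/type.
  void add_relation(std::size_t n_members) {
    approx_bytes += 16 + 2 * n_members;
  }

  bool needs_recheck() const { return approx_bytes >= budget; }
  void reset() { approx_bytes = 0; }
};
```

Under this model, ~440 versions of a 25k-member relation accumulate roughly 22 MB of estimated data, tripping a few-MiB budget long before the 32 MiB blob limit is reached.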

I've pushed a new version: v1.2.6. Hopefully this works :crossed_fingers:

tomhughes commented 2 years ago

Thanks. I've deployed that now.