osmcode / libosmium

Fast and flexible C++ library for working with OpenStreetMap data.
https://osmcode.org/libosmium/
Boost Software License 1.0

read_relations() w/ progress bar does not work as expected #312

Closed sphoto closed 4 years ago

sphoto commented 4 years ago

For certain projects I use read_relations() in conjunction with MultipolygonManager:

osmium::area::MultipolygonManager<osmium::area::Assembler> mp_manager{assembler_config, filter};
const std::size_t file_size = osmium::file_size(input_file.filename());
osmium::ProgressBar progress_bar{file_size, osmium::isatty(2)};
osmium::relations::read_relations(progress_bar, input_file, mp_manager);

The progress bar is already at 100% (reader.offset() = EOF) after the first or second reader.read(), while the loop while (auto buffer = reader.read()) { ... } has not yet completed. My expectation would be that the progress bar only reaches 100% when the loop is finished; otherwise, calling progress_bar.update(reader.offset()) inside the loop makes no sense?!

I put an additional cout into read_relations() to get the current reader.offset():

        template <typename ...TManager>
        void read_relations(osmium::ProgressBar& progress_bar, const osmium::io::File& file, TManager&& ...managers) {
            static_assert(sizeof...(TManager) > 0, "Need at least one manager as parameter.");
            osmium::io::Reader reader{file, osmium::osm_entity_bits::relation};
            while (auto buffer = reader.read()) {
                std::cout << "read offset: " << reader.offset() << std::endl;     // <- ADDED
                progress_bar.update(reader.offset());
                osmium::apply(buffer, std::forward<TManager>(managers)...);
            }
            reader.close();
            (void)std::initializer_list<int>{
                (std::forward<TManager>(managers).prepare_for_lookup(), 0)...
            };
            progress_bar.file_done(file.size());
        }

This is what I've got (44006348 = EOF):

read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
read offset: 44006348
...
read offset: 44006348

joto commented 4 years ago

For several reasons, the progress bar can only show a very rough estimate of the real progress. This is due to multithreaded reading and buffering and, in this case, because the relations are such a small part of all the data in the file that they are all in the buffer before you know it. It works reasonably well in many cases, but in some cases it is really not that usable. I don't know how to make this better, sorry.
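To make the effect described above concrete, here is a minimal toy model. It is not libosmium code and is only a sketch under stated assumptions: the names `ToyBlock`, `ToyReader`, and `observed_offsets` are hypothetical, the file is assumed to consist of equally sized blocks with the relation blocks at the end, and the multithreaded reader is approximated by a single-threaded read-ahead queue. The offset advances as blocks are decoded, but only blocks that pass the entity filter are handed to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model only: none of these names are part of libosmium.
struct ToyBlock {
    std::size_t end_offset;  // file offset just past this block
    bool has_relations;      // does the block survive the entity filter?
};

struct ToyReader {
    std::vector<ToyBlock> file;      // blocks in file order
    std::size_t read_ahead = 1;      // max number of decoded blocks kept queued
    std::size_t next = 0;            // index of the next block to decode
    std::size_t decoded_offset = 0;  // what an offset query would report
    std::deque<ToyBlock> queue;      // decoded blocks that passed the filter

    // Decode ahead until the queue is full or the file ends.
    // Filtered-out blocks still advance the offset but never reach the queue.
    void fill() {
        while (next < file.size() && queue.size() < read_ahead) {
            const ToyBlock& block = file[next++];
            decoded_offset = block.end_offset;
            if (block.has_relations) {
                queue.push_back(block);
            }
        }
    }

    // Hand one buffered block to the caller; false means end of data.
    bool read() {
        fill();
        if (queue.empty()) {
            return false;
        }
        queue.pop_front();
        return true;
    }
};

// Returns the offset the caller would see after each successful read().
// Assumption: the last `relation_blocks` blocks of the file hold the relations.
std::vector<std::size_t> observed_offsets(std::size_t total_blocks,
                                          std::size_t relation_blocks,
                                          std::size_t read_ahead) {
    assert(read_ahead > 0 && relation_blocks <= total_blocks);
    constexpr std::size_t block_size = 1000;  // arbitrary toy block size

    ToyReader reader;
    reader.read_ahead = read_ahead;
    for (std::size_t i = 0; i < total_blocks; ++i) {
        reader.file.push_back({(i + 1) * block_size,
                               i >= total_blocks - relation_blocks});
    }

    std::vector<std::size_t> offsets;
    while (reader.read()) {
        offsets.push_back(reader.decoded_offset);
    }
    return offsets;
}
```

In this model, `observed_offsets(100, 3, 4)` reports the end-of-file offset on every read, because the reader has to skip all node and way blocks before the first relation block can be queued. With `observed_offsets(100, 100, 4)`, where every block passes the filter, the offset climbs gradually and a progress bar driven by it would behave as expected.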

sphoto commented 4 years ago

If you can't do it better, leave it as it is. This is not that important and more a "cosmetic" thing than a real issue. Anyway, thank you for the explanation.