drolbr / Overpass-API

A database engine to query the OpenStreetMap data.
http://overpass-api.de
GNU Affero General Public License v3.0

Cut off history for some (few) elements #724

Open drolbr opened 2 months ago

drolbr commented 2 months ago

On the current public instance the request

timeline(node,26366177);
foreach
{
  retro (u(t["created"]))
  {
    node(26366177);
    out meta;
  };
};

starts only at version 13, but should also include versions 10 to 12. A separate search

[date:"2019-04-18T00:00:00Z"];
node(35.7409129, 51.3476477,35.7409129, 51.3476477);
out meta;

shows that it is present in the database but not found.

The bug was spotted while checking upcoming code in map_file_replicator.test.cc that reconstructs the map files from the data files: the side-by-side comparison shows the existing attic map file to be somewhat broken; the cause is so far unknown.

Most likely the data can be fixed by replicating the map files from the data files once the bug in the updating code has been fixed.
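
For illustration, a heavily simplified sketch of the replication idea; the real logic lives in map_file_replicator.test.cc and works on Overpass' binary file formats, so all structures and names below are hypothetical. The data file is treated as a sequence of index blocks, and every object id is assigned the index of the block it is stored in, which reproduces the map file:

// Hypothetical, simplified sketch; not the actual Overpass on-disk format.
#include <cstdint>
#include <iostream>
#include <map>
#include <vector>

// Assumption: a data file is a sequence of blocks, each holding all objects
// that share one spatial index.
struct DataBlock
{
  uint32_t index;               // spatial index of the block
  std::vector<uint64_t> ids;    // ids of the objects stored under that index
};

// Rebuild the id -> index mapping (the "map file") from the data file:
// every id gets the index of the block it appears in.
std::map<uint64_t, uint32_t> replicate_map(const std::vector<DataBlock>& data_file)
{
  std::map<uint64_t, uint32_t> map_file;
  for (const DataBlock& block : data_file)
    for (uint64_t id : block.ids)
      map_file[id] = block.index;
  return map_file;
}

int main()
{
  // Toy data file with two index blocks; the second block uses made-up values.
  std::vector<DataBlock> data_file = {
    { 0x61dec1cf, { 26366177 } },
    { 0x12345678, { 1001, 1002 } }
  };
  for (const auto& [id, index] : replicate_map(data_file))
    std::cout << std::hex << index << std::dec << "  " << id << '\n';
}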

mmd-osm commented 2 months ago

The symptoms look similar to the issue I have reported in #661, though it was impacting generated clone files back then.

In a backup from April 2023, versions 10-12 are also absent from the query results, so this issue may have been around for quite some time already.

OSMCha is still running a fairly old version "Overpass API 0.7.57.1 74a55df1", and even there the issue can be reproduced.

drolbr commented 2 months ago

That is a good observation. But the node 4379348552 in question there is unaffected here.

This bug is quite well localized: there are both missing and surplus entries in nodes_attic.map, but no faulty entries in nodes.map, while the data is present in nodes_attic.bin as it should be. So it is highly likely that there is a bug in the index computation for the attic files.

The comparison of the existing nodes_attic.map with the nodes_attic.next.map replicated from the existing nodes.bin has found 16488843 differences. Probing a random sample of them suggests that all errors are in the existing map file.
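
A possible sketch of what that comparison boils down to, with both files reduced to an in-memory id -> index mapping; the real code reads Overpass' binary map file format, so everything below is illustrative:

// Illustrative sketch only; the real comparison works on the binary map files.
#include <cstdint>
#include <iostream>
#include <map>

using MapFile = std::map<uint64_t, uint32_t>;   // node id -> spatial index

// Report entries that differ between the existing and the replicated map file.
void report_differences(const MapFile& existing, const MapFile& replicated)
{
  for (const auto& [id, index] : existing)
  {
    auto it = replicated.find(id);
    if (it == replicated.end())
      std::cout << "surplus entry for id " << id << '\n';
    else if (it->second != index)
      std::cout << "index mismatch for id " << id << '\n';
  }
  for (const auto& entry : replicated)
    if (!existing.count(entry.first))
      std::cout << "missing entry for id " << entry.first << '\n';
}

int main()
{
  // Toy example with made-up values: one mismatching and one missing entry.
  MapFile existing   = { { 1001, 0x10 } };
  MapFile replicated = { { 1001, 0x11 }, { 1002, 0x12 } };
  report_differences(existing, replicated);
}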

drolbr commented 2 months ago

Some context on the map file replication: this way the map files can be taken out of the clone download, making it smaller. In the long run, I hope to also shave off a couple of other redundant files so that the actual download shrinks to about 120 GB for the geodata and about 240 GB for full attic data. The redundant files are then replicated after download and before applying updates.

mmd-osm commented 2 months ago

Probing a random sample of them suggests that all errors are in the existing map file.

That's actually good news. I also found all 4 versions both in the node metadata and changelog files.

To summarize: versions 10-12 all have the same index; when moving from version 12 to 13, the index value changes; version 14 again has the same index as version 13:

Index Node id Version Timestamp
61dec1cf 26366177 10 2008-11-10T20:36:32Z
61dec1cf 26366177 11 2016-10-13T19:42:05Z
61dec1cf 26366177 12 2019-04-17T05:15:15Z
61dec1ce 26366177 13 2019-07-04T23:50:28Z
61dec1ce 26366177 14 2019-08-23T03:39:28Z

I think in this case the attic map file should contain the value 0xff for node 26366177, and the node_attic_indexes file should contain the list of all attic indexes. However, we're only seeing the index of version 13 in the attic map file, and no entries in node_attic_indexes. This way we're losing the information that versions 10-12 have a different index.

Does this sound right as description of the current situation?
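
If that reading is right, the rule could be sketched roughly as below; all names and the exact sentinel handling are assumptions, not Overpass' actual on-disk layout. When all attic versions of a node share one index, the attic map file can store that index directly; when they span several indexes, the map entry is the 0xff sentinel and the full list goes to node_attic_indexes:

// Illustrative sketch of the rule described above; names and layout are assumptions.
#include <cstdint>
#include <iostream>
#include <set>
#include <vector>

constexpr uint32_t MULTIPLE_INDEXES = 0xff;   // sentinel in the attic map file

struct AtticMapEntry
{
  uint32_t map_value;                 // a single index, or the 0xff sentinel
  std::vector<uint32_t> index_list;   // filled only when the sentinel is used
};

// Decide what the attic map file should record for one node, given the
// spatial indexes of all its attic versions.
AtticMapEntry attic_entry(const std::set<uint32_t>& indexes_of_versions)
{
  if (indexes_of_versions.size() == 1)
    return { *indexes_of_versions.begin(), {} };
  // Versions live under different indexes: store the sentinel and keep the
  // full list (destined for node_attic_indexes) so no version becomes unreachable.
  return { MULTIPLE_INDEXES,
           std::vector<uint32_t>(indexes_of_versions.begin(), indexes_of_versions.end()) };
}

int main()
{
  // Node 26366177: versions 10-12 under 0x61dec1cf, versions 13-14 under 0x61dec1ce.
  AtticMapEntry entry = attic_entry({ 0x61dec1cf, 0x61dec1ce });
  std::cout << std::hex << entry.map_value << '\n';   // prints ff
  for (uint32_t index : entry.index_list)
    std::cout << std::hex << index << '\n';
}

In this reading, the bug would be that the attic map entry is written as if only one index existed, and node_attic_indexes stays empty.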

making the clone download smaller

This seems like a good idea. I'm wondering if users who are not interested in attic data could instead download a planet PBF and let the import run overnight. That's only a 75 GB download, and the PBF could then be used for other scenarios as well, like rendering tiles or geocoding.

mmd-osm commented 2 months ago

So when regenerating nodes_attic.map and node_attic_indexes.bin, I'm getting correct results for the query above.

For some reason, I don't see any differences in node_attic_indexes.bin after node 7802847084 on my local clone. This node was created and last changed in August/September 2020, which corresponds to almost 3 years without issues. I'm running some heavily modified code, so I'm not sure whether this applies to upstream.

Did you notice something similar in your db as well? Maybe the issue was fixed at some point as a side effect of some other change, and now only the attic data from before that change is still broken.

encoding remark: Please enter your query and terminate it with CTRL+D.
timeline(node,26366177);
foreach
{
  retro (u(t["created"]))
  {
    node(26366177);
    out meta;
  };
};
runtime remark: Timeout is 180 and maxsize is 536870912.
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.59.120 (mmd) f69e9c3e">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2023-05-02T09:41:23Z"/>

  <node id="26366177" lat="35.7408824" lon="51.3478475" version="10" timestamp="2008-11-10T20:36:32Z" changeset="732655" uid="44786" user="makracht"/>
  <node id="26366177" lat="35.7408824" lon="51.3478475" version="11" timestamp="2016-10-13T19:42:05Z" changeset="42879891" uid="4449060" user="Khalil Laleh"/>
  <node id="26366177" lat="35.7409129" lon="51.3476477" version="12" timestamp="2019-04-17T05:15:15Z" changeset="69294792" uid="2886100" user="DharmaBum"/>
  <node id="26366177" lat="35.7409096" lon="51.3473848" version="13" timestamp="2019-07-04T23:50:28Z" changeset="71913089" uid="2886100" user="DharmaBum"/>
  <node id="26366177" lat="35.7409183" lon="51.3472721" version="14" timestamp="2019-08-23T03:39:28Z" changeset="73647115" uid="5864298" user="ali dini"/>

</osm>
drolbr commented 2 months ago

Yes, the description is correct. There are other variants of errors, but they are not discussed here.

Indeed, the newest incident in the data is from August 2020. No development at all happened around that time, so there is a chance that the underlying bug has been fixed coincidentally, but there is no plausible candidate for the relevant change. Therefore keeping the bug open for the moment.