Open Minishlink opened 3 years ago
Also noticed there are ~4% of the POIs that have duplicate coordinates (after filtering duplicate IDs)
Thanks. Duplicate IDs should be impossible, so that's a bug in the API results. There is only one instance of 155399 in the database, and the ID has a unique constraint.
Duplicate coordinates are possible, even likely, but if you are using the raw data set they may not all have a published status. Some imports contain 50% duplicates, which we then merge into POIs. We rely less on imports nowadays, though.
Yes, the duplicate IDs were a caching failure. Our API servers sync independently from the master API, and one of them appears to have started serving bad results. The caches have now been reset.
Thanks! Did you update the export with the caching fix?
With the latest export as of now, I have the following data (after filtering out POIs that are not live, have no coordinates, or have AddressCleaningRequired):
[ 174114, 174118, 174050, 174120, 147781, 16324, 173863, 170575, 169578, 127045 ]
As for the coordinates, after filtering these duplicated IDs, I have:
[ 3485, 3486, 3487, 3488, 3489, 3490, 3493, 3494, 3495, 3496 ]
It seems to me that these are low-precision coordinates, which may explain the duplicates. I'll look into it more tomorrow.
Here is a more precise sample list of IDs with duplicated coordinates:
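One way to check the low-precision hypothesis is to count how many decimal places each coordinate carries. This is only a sketch: the `AddressInfo`, `Latitude` and `Longitude` field names are assumed from the API's JSON layout, the helper names are hypothetical, and the sample coordinates are invented.

```python
from collections import Counter
from decimal import Decimal


def decimal_places(value):
    """Number of significant digits after the decimal point (51.5 -> 1)."""
    exponent = Decimal(str(value)).normalize().as_tuple().exponent
    return max(0, -exponent)


def precision_histogram(pois):
    """Count POIs by the decimal places of their more precise coordinate."""
    histogram = Counter()
    for poi in pois:
        address = poi.get("AddressInfo") or {}
        lat, lon = address.get("Latitude"), address.get("Longitude")
        if lat is None or lon is None:
            continue  # nothing to measure without both coordinates
        histogram[max(decimal_places(lat), decimal_places(lon))] += 1
    return histogram


# Invented sample data, for illustration only
sample = [
    {"ID": 1, "AddressInfo": {"Latitude": 51.5, "Longitude": -0.1}},
    {"ID": 2, "AddressInfo": {"Latitude": 48.8566, "Longitude": 2.3522}},
]
histogram = precision_histogram(sample)
print(histogram)
```

A large bucket at one or two decimal places would support the idea that rounding, rather than true duplication, explains part of the overlap.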
[ 3344, 3485 ], [ 3345, 3486 ], [ 3346, 3487 ], [ 3347, 3488 ], [ 3348, 3489 ]
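For anyone wanting to reproduce a list like this, here is a minimal sketch of grouping POIs by exact coordinate pair. The `AddressInfo`, `Latitude` and `Longitude` field names are assumed from the API's JSON layout, `group_by_coordinates` is a hypothetical helper, and the sample coordinates are invented (only the IDs come from the list above).

```python
from collections import defaultdict


def group_by_coordinates(pois):
    """Return lists of POI IDs that share exactly the same (lat, lon) pair."""
    groups = defaultdict(list)
    for poi in pois:
        address = poi.get("AddressInfo") or {}
        lat, lon = address.get("Latitude"), address.get("Longitude")
        if lat is None or lon is None:
            continue  # POIs without coordinates cannot be compared
        groups[(lat, lon)].append(poi["ID"])
    # Keep only coordinates used by more than one POI
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Invented coordinates, for illustration only
sample = [
    {"ID": 3344, "AddressInfo": {"Latitude": 51.5, "Longitude": -0.1}},
    {"ID": 3485, "AddressInfo": {"Latitude": 51.5, "Longitude": -0.1}},
    {"ID": 3345, "AddressInfo": {"Latitude": 48.9, "Longitude": 2.3}},
]
duplicate_groups = group_by_coordinates(sample)
print(duplicate_groups)
```

This compares coordinates for exact equality, so two POIs a few metres apart would not be grouped; catching those would need a distance threshold instead.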
Thanks, I think you've found a bug in our caching system: the exported data comes from one of the caches. The duplicates should be entirely identical, but if not, one would have a greater DateLastStatusUpdate. I'll get this fixed soon. Thanks for finding the problem!
Regarding the duplicate coordinates with low OCM IDs: that is old data from 10 years ago. Ultimately, if nobody (users) wants to clean up the data and remove duplicate positions etc., then it just doesn't get cleaned up. Over the years we have developed deduplication techniques during imports, but that data was from long before that.
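Until the export is fixed, a consumer could work around the duplicate IDs by keeping, for each ID, the record with the greatest DateLastStatusUpdate. The sketch below assumes every record has an `ID` and an ISO 8601 `DateLastStatusUpdate` string; `dedupe_by_id` and `parse_timestamp` are hypothetical helpers, and the sample timestamps are invented.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, normalising a trailing 'Z' to '+00:00'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def dedupe_by_id(pois):
    """Keep one record per ID, preferring the latest DateLastStatusUpdate."""
    latest = {}
    for poi in pois:
        current = latest.get(poi["ID"])
        if current is None or (
            parse_timestamp(poi["DateLastStatusUpdate"])
            > parse_timestamp(current["DateLastStatusUpdate"])
        ):
            latest[poi["ID"]] = poi
    return list(latest.values())


# Invented timestamps, for illustration only
sample = [
    {"ID": 155399, "DateLastStatusUpdate": "2020-01-01T00:00:00Z"},
    {"ID": 155399, "DateLastStatusUpdate": "2020-06-01T00:00:00Z"},
    {"ID": 174114, "DateLastStatusUpdate": "2020-03-01T00:00:00Z"},
]
unique = dedupe_by_id(sample)
print([(p["ID"], p["DateLastStatusUpdate"]) for p in unique])
```

When two duplicates carry the same timestamp, the first one encountered is kept, which is harmless if the records are identical as described.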
Hello,
It seems the POI IDs/UUIDs are not unique in the dataset. I see several instances of POI 155399, for example. You can see it in the API results too: https://api.openchargemap.io/v3/poi/?output=json&chargepointid=155399&maxresults=200
How do you choose what POI to display? Do you take the first you encounter, or is it based on more elaborate filtering?
Thanks