npolar / api.npolar.no

Searchable data storage
https://api.npolar.no

Duplicates while iterating all locations in the API #78

Open mrthomassen opened 7 years ago

mrthomassen commented 7 years ago

Hi from NRK.

We use your API regularly, but we have found duplicate locations when using a smaller limit. Earlier we used 100 as the limit/page-size, like this: http://api.npolar.no/placename/?q=&filter-status=official&limit=100&start=0&sort=location+asc,ident+asc&format=json

We started from this base URL and followed your own next-links, but we end up with lots of duplicate locations. If we change the batch size to 5000, there are no duplicates: http://api.npolar.no/placename/?q=&filter-status=official&limit=5000&start=0&sort=location+asc,ident+asc&format=json

Best regards, Andreas Thomassen NRK

cnrdh commented 6 years ago

Sorry for the delayed response. The sort parameter is wrong: try removing "+asc" and use just the field name to sort A-Z. Also use "area" instead of "location".

cnrdh commented 6 years ago

A little more explanation: since you sorted on a non-existent field ("location asc"), no proper sorting was applied, and this could lead to duplicate entries.

Try using https://api.npolar.no/placename/?q=&filter-status=official&limit=500&start=0&format=json&fields=name,area,id,latitude,longitude,updated&sort=-updated
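For anyone paging through the results, here is a minimal sketch of a paging loop that also guards against repeated entries by remembering the ids it has already seen. It is an illustration only: `fetch_page` is a hypothetical callable that you would implement yourself to request one page for a given `start` and `limit` and return a list of records, and the sketch assumes every record carries the `id` field requested in the URL above.

```python
from typing import Callable, Dict, Iterator, List

Page = List[Dict]


def iter_unique(fetch_page: Callable[[int, int], Page], limit: int = 100) -> Iterator[Dict]:
    """Yield each record once while walking pages with start/limit.

    fetch_page(start, limit) is a hypothetical helper supplied by the caller;
    it should return the list of records for that page (empty when done).
    """
    seen = set()   # ids already yielded
    start = 0
    while True:
        page = fetch_page(start, limit)
        if not page:
            break  # an empty page is treated as the end of the result set
        for record in page:
            record_id = record["id"]  # assumes an "id" field on every record
            if record_id not in seen:
                seen.add(record_id)
                yield record
        start += limit
```

The guard only hides repeats. If the sort order is not deterministic, entries can also be skipped between pages, so sorting on an existing field is still needed.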


mrthomassen commented 6 years ago

Thanks for your answer.

We ended up using id as the sorting parameter; that was a safe way to avoid duplicates.

You should also look into the links in the API. After switching to https, the next-links and so on still use http.
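Until the links are fixed on the server side, a client can rewrite the scheme of each next-link before following it. This is a small sketch using only the Python standard library; `force_https` is a name made up for this example and is not part of the API.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Return the URL with an http scheme replaced by https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

For example, `force_https("http://api.npolar.no/placename/?q=&limit=100&start=100")` gives the same URL with `https://` in front, and the query string is left untouched.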