wvanderp / isOsmComplete

This site tracks how complete OpenStreetMap really is by comparing the number of features in OpenStreetMap to the number of features in official data sources.
https://wvanderp.github.io/isOsmComplete/
MIT License

French section with the number of bakeries? 🍞🥐🥖 #1

Closed Binnette closed 10 months ago

Binnette commented 10 months ago

Hi @wvanderp, your website is a really great and funny idea 😁

I just submitted it to https://weeklyosm.eu in the section "Did you know?"

The [website] that tries to verify that the OSM map is complete? Of course that is not possible, but this website gives metrics on how many stores of some large brands are mapped versus expected, as well as on the number of museums, sculptures, airports, etc.

If you have some spare time, I would love to see the difference between the number of mapped and expected bakeries in France. I am French, and of course I want to exploit the funny aspect of your website and the cliché of the French loving bread. 😊

Binnette commented 10 months ago

Hello again @wvanderp, I worked a little bit on this issue.

I downloaded the business registry from the French government website. It has been updated on October 1st, 2023.

I coded a little Python script to filter this 3.4GB file.

My script counted 43,232 bakeries.
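The filtering step can be sketched like this. This is not the author's actual script: the file path and column name are assumptions, but the NAF activity code 10.71C (bakeries) is the same one used in the API query later in this thread. The idea is to stream the registry CSV row by row rather than load 3.4GB into memory:

```python
import csv

def count_bakeries(path, naf_column="activitePrincipaleEtablissement", naf_code="10.71C"):
    """Stream a SIRENE-style registry CSV and count rows whose main
    activity code (NAF) matches the bakery code 10.71C.
    The column name is a guess at the registry schema, not verified."""
    count = 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get(naf_column) == naf_code:
                count += 1
    return count
```

Streaming with `csv.DictReader` keeps memory usage flat regardless of file size, which matters for a multi-gigabyte registry dump.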

Then I ran this Overpass Query:

[out:json];
area[name="France"]->.a;
nwr["shop"="bakery"](area.a);
out geom;

This query returned 28,174 bakeries.
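For automation, the same query can be run with `out count;` so Overpass returns only totals instead of full geometries, which is much lighter than `out geom;`. A rough sketch (the endpoint URL and helper names are mine, not part of the site):

```python
import json
import urllib.parse
import urllib.request

# Public Overpass endpoint; any Overpass instance would work.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Same query as above, but asking only for a count.
QUERY = """
[out:json];
area[name="France"]->.a;
nwr["shop"="bakery"](area.a);
out count;
"""

def parse_count(payload):
    # With "out count;", Overpass returns a single element whose
    # tags hold the totals as strings.
    return int(payload["elements"][0]["tags"]["total"])

def count_osm_bakeries():
    body = urllib.parse.urlencode({"data": QUERY}).encode()
    with urllib.request.urlopen(OVERPASS_URL, body) as resp:
        return parse_count(json.loads(resp.read().decode()))
```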

So the current progress is around 65.2%. Not bad!

I would really appreciate it if you consider adding this stat to your website 🥖🥖🥖🥖🥖

Have a nice day.

Binnette commented 10 months ago

I just saw that your website is fully automated with GitHub Actions, so downloading a 2.4GB file on the GitHub platform and parsing it there is not a solution. Instead, I found a workaround using the French government API. But my script had to make 1,820 calls! I didn't find a better way to do it with the available API. My script took around 5 minutes on my PC.

Also, I realized that in my previous message I was counting companies, not points of sale. A company often owns several bakeries. So the real number of bakeries that should be mapped in OSM is 56,553.

My Python script:

import urllib.request
import json

# INSEE region codes covering France
regions = ['01','02','03','04','06','11','24','27','28','32','44','52','53','75','76','84','93','94']
# NAF code 10.71C = bakeries; only active (A) companies, 25 results per page
base_url = 'https://recherche-entreprises.api.gouv.fr/search?activite_principale=10.71C&etat_administratif=A&minimal=true&per_page=25'
total_bakeries = 0

def get_stats(region):
    """Fetch the page and result counts for one region."""
    url = f'{base_url}&region={region}'
    with urllib.request.urlopen(url) as json_response:
        data = json.loads(json_response.read().decode())
        return {
            'pages': data['total_pages'],
            'results': data['total_results']
        }

def get_nb_bakeries(region, page):
    """Sum the open establishments of every company on one result page."""
    url = f'{base_url}&region={region}&page={page}'
    bakeries = 0
    with urllib.request.urlopen(url) as json_response:
        data = json.loads(json_response.read().decode())
        for d in data['results']:
            bakeries += d['nombre_etablissements_ouverts']
    return bakeries

for r in regions:
    stats = get_stats(r)
    print(f'Process region {r}: {stats["pages"]} pages and {stats["results"]} companies to parse. Bakeries found so far: {total_bakeries}')
    # the upper bound must be pages + 1, otherwise the last page is skipped
    for p in range(1, stats['pages'] + 1):
        total_bakeries += get_nb_bakeries(r, p)

print(f'Total of bakeries: {total_bakeries}')

I had to split my API calls by region because otherwise I would hit the limit of 10,000 results per filter set.

So the progress is 49.8%.

Binnette commented 10 months ago

Excellent! I ❤️ it 😄 TY @wvanderp

wvanderp commented 10 months ago

I want to thank you so much.

When I put this website online, I made it mainly for myself, because the internet is too large for anybody to stumble onto my random website by chance.

And then not only did somebody see my website, but they also did a lot of work to contribute something awesome to it.

Also, I want to thank you for submitting the site to weeklyOSM. I always think that my toy projects are not worthy of this kind of attention. But today, some other people also took their time and contributed to the project. This should prove to me that it is worthy.

Thank you again for proving that I'm not just pushing things to the internet that nobody will see, and that this is actually interesting to people.