openvenues / openvenues.io

openvenues website
MIT License

Where's the site? #14

Closed: todrobbins closed this issue 7 years ago

todrobbins commented 7 years ago

@thatdatabaseguy is the site offline?

I'm trying to figure out where openvenues fits into the Mapzen ecosystem/community, especially in relation to WOF. Cheers!

albarrentine commented 7 years ago

@todrobbins I don't actually work for Mapzen. OpenVenues is a joint project we started a while back to extract places from the Common Crawl. The site was never launched because batch deduping venues, which would be required to keep the index up-to-date, turned out to be a harder problem than there was time/money to solve. Libpostal is one of the results of trying to solve it.

As far as WoF goes, those venues come primarily from a single static data source from 2011 (SimpleGeo), and expanding them any further runs into the same issue. Just this week I started a new contract with the WoF team around deduping places, and it may be possible to revisit the Common Crawl venues at its conclusion, but I can't make any promises.

This repo can be safely ignored for the time being.

todrobbins commented 7 years ago

Thanks for the update! Forgive my ignorance, but what is the Common Crawl?

albarrentine commented 7 years ago

Common Crawl is a public web crawl: a monthly(ish) snapshot of a large portion of the public web, similar to what Google has, but available to everyone. The OpenVenues jobs parse billions of web pages looking for microdata, vCards/hCards, map embeds, etc.: anything that might be a place.
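
To make the "look for microdata" step concrete, here is a minimal sketch of pulling schema.org-style microdata out of a single HTML page with Python's standard library. It is not the actual OpenVenues job code; the class `MicrodataPlaceParser`, the function `extract_places`, and the sample page are hypothetical names invented for illustration. It only handles flat items (nested `itemscope` blocks are folded into their parent property's text), which a real extractor would need to treat properly.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MicrodataPlaceParser(HTMLParser):
    """Hypothetical helper: collect flat itemprop values from itemscope blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.items = []        # finished items, in document order
        self._item = None      # item currently being filled, if any
        self._depth = 0        # open-element depth inside the current item
        self._prop = None      # itemprop whose text is being collected
        self._prop_depth = 0   # depth at which that itemprop element sits
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._item is None:
            # Outside any item: only an itemscope element is interesting.
            if "itemscope" in attrs:
                self._item = {"type": attrs.get("itemtype") or "", "props": {}}
                self._depth = 1
            return
        prop = attrs.get("itemprop")
        if prop and "content" in attrs:
            # e.g. <meta itemprop="latitude" content="40.7">
            self._item["props"][prop] = attrs["content"]
        elif prop and self._prop is None and tag not in VOID_TAGS:
            # Start collecting the text content of this property.
            self._prop, self._prop_depth, self._text = prop, self._depth + 1, []
        if tag not in VOID_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._item is None or tag in VOID_TAGS:
            return
        if self._prop is not None and self._depth == self._prop_depth:
            # Closing the property element: store its whitespace-normalized text.
            self._item["props"][self._prop] = " ".join("".join(self._text).split())
            self._prop = None
        self._depth -= 1
        if self._depth == 0:
            # Closing the itemscope element itself: the item is complete.
            self.items.append(self._item)
            self._item = None

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)


def extract_places(html):
    """Return a list of {"type": ..., "props": {...}} dicts found in html."""
    parser = MicrodataPlaceParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up sample page, for illustration only.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Restaurant">
  <span itemprop="name">Example Cafe</span>
  <span itemprop="streetAddress">123 Main St</span>
  <meta itemprop="latitude" content="40.7">
</div>
<p>Not a place</p>
"""

places = extract_places(SAMPLE_HTML)
```

Running something like this over billions of pages yields many overlapping records for the same real-world venue, which is where the deduping problem mentioned earlier in the thread comes in.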

todrobbins commented 7 years ago

Thanks again!