codeforsanjose / OSM-SouthBay

Making the best possible map of San José and the South Bay
https://www.openstreetmap.org/#map=12/37.3358/-121.8906
MIT License

Incorporate Santa Clara County social distancing protocol business database #23

Open 1ec5 opened 3 years ago

1ec5 commented 3 years ago

We should incorporate Santa Clara County Social Distancing Protocol data into a community asset map and ultimately into the larger OpenStreetMap database.

Background

Since the COVID-19 pandemic began, most point of interest data in OSM in the South Bay has been at risk of going stale due to temporary or permanent closures or changes in opening hours or services. In #21, we attempted to put together a spreadsheet of open businesses based on business association listings, but this listing is skewed toward certain kinds of businesses, and the copyright situation is unclear (or at least not clear enough to rely on in OSM).

The Santa Clara County Public Health Department has created a listing of businesses and institutions that have submitted social distancing protocols for approval. At the time of writing, the listing includes 29,324 establishments. These are the businesses and institutions most likely to be open during the COVID-19 pandemic.

Unfortunately, the county hasn’t published a structured dataset corresponding to this listing. Moreover, the listing is geared towards checking for compliance and isn’t particularly usable by consumers as a business directory: it allows searching by business name and city or filtering by category, but there’s no way to limit search results by proximity or get directions.

Rationale

The 2020 National Day of Civic Hacking included a call for community asset mapping. We brainstormed several ideas before settling on the social distancing protocol listing as something that would make a government dataset significantly more accessible to the general public while avoiding overlap with projects such as Bay Area Community Resources.

The short-term goal is to process the listings into a mappable format and display the data directly on an asset map. People need to know which nearby businesses they can safely patronize and which brick-and-mortar community services are currently available.

The long-term goal is to add these businesses and institutions to OSM along with some COVID-19-specific tagging. This would help to jumpstart OSM’s local efforts to update POIs post-lockdown. It would also enable projects such as Bay Area Community Resources to use OSM as one source for POI data or at least have more confidence in OSM as its basemap. Both projects would make this data more accessible and usable to the general public than the current listing.

Implementation details

We expect this listing to grow significantly over time, so it’s important to take an automated, repeatable approach.

The social distancing protocol site provides only unstructured, inconsistently formatted addresses, so we’ll need to use a geocoder to convert the addresses to coordinates to make them mappable. An open-source geocoder would be preferable to a proprietary one, because we expect this data to eventually go into OSM. The import in #4 adds addresses but only in San José, whereas the county data is countywide. So we’ll need to use the county master address file. We only need to set up the geocoder on a local machine for one-off batch geocoding tasks, but eventually we may want to set up something on a server for future projects.
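
As a rough illustration of the batch geocoding step, here is a minimal sketch that builds a forward-geocoding request for a Pelias-style `/v1/search` endpoint and pulls the top coordinate out of the GeoJSON response. The base URL (a local instance on port 4000), the `boundary.country` filter, and the function names are assumptions for illustration, not a description of the setup we actually used.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Pelias instance running locally; adjust to your own setup.
PELIAS_SEARCH = "http://localhost:4000/v1/search"


def build_search_url(address, base_url=PELIAS_SEARCH):
    """Build a forward-geocoding URL for one free-form address string."""
    params = {"text": address, "size": 1, "boundary.country": "USA"}
    return base_url + "?" + urlencode(params)


def best_match(response):
    """Return (lon, lat, confidence) of the top feature, or None if no match."""
    features = response.get("features", [])
    if not features:
        return None
    top = features[0]
    lon, lat = top["geometry"]["coordinates"][:2]
    return lon, lat, top["properties"].get("confidence")


def geocode(address):
    """Query the geocoder over HTTP; requires the instance to be reachable."""
    with urlopen(build_search_url(address)) as resp:
        return best_match(json.load(resp))
```

Keeping the URL building and response parsing separate from the network call makes it easier to rerun the batch against a different geocoder later.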

The site also links each business to an electronically completed PDF for details about its social distancing protocol. It’s feasible but inconvenient to scrape these PDFs, so we’re going to ignore them for now. Unfortunately, it means we won’t be able to automatically clarify the businesses in the “Other” category.

When it comes time to add the businesses to OSM, we could set up a MapRoulette challenge that asks the mapper to identify the shop inside the building using aerial and street-level imagery. We won’t want to blindly add every result en masse, because we’re concerned that some of the listings may be home-based businesses – identifying signage will be key.

Tasks

To make the asset map:

To get the data into OSM:

Additional notes

This brainstorming document turned up several other datasets worth scraping and getting into Bay Area Community Resources or OSM.

1ec5 commented 3 years ago

Pelias customized to import Santa Clara County addresses courtesy of @impiaaa: https://github.com/codeforsanjose/pelias-project-scc/

1ec5 commented 3 years ago

Scraper and scraped data courtesy of @stgibson: https://github.com/stgibson/social_distance_web_scraping/

1ec5 commented 3 years ago

Scraped data geocoded by Pelias: socialdistance.geojson.zip

county_dots downtown_dots
1ec5 commented 3 years ago

At tonight’s hack night, @impiaaa, Kevin, and I discussed next steps for this project. Having scraped and geocoded the data once, we need to massage it and figure out the logistics of entering it into OSM. Some discussion points from tonight:

Contact information and geocoding

Scraping

Tagging

Mapping

1ec5 commented 3 years ago

Ideally, mappers would use street-level imagery to avoid mapping businesses at homes that lack business signage. However, Bing Streetside imagery is too old, while Mapillary and OpenStreetCam don’t have comprehensive coverage throughout the county. So this is an unresolved problem for now.

At last night’s hack night, @impiaaa and I focused on possible solutions to this problem, as well as the related problem of pinpointing a business within a strip mall or professional center:

At a glance, Mapillary coverage in the South Bay doesn’t look as bad as we had presumed, considering that most of the businesses would be in business districts or along arterial streets rather than in residential areas. But we do need more thorough coverage of office parks. Some areas like Milpitas, Berryessa, and South San José also have very little coverage.

Some next steps:

1ec5 commented 3 years ago

This fork of the scraper has continuing work including some tweaks to work better with the geocoder.

impiaaa commented 3 years ago
1ec5 commented 3 years ago
1ec5 commented 3 years ago

Tier 3 revision

CDPH moved Santa Clara County to Tier 3 (Orange, Moderate) on October 13. The county public health department issued a revised order that required every business to complete a revised social distancing protocol form within 14 days. The revised form looks very similar to the previous revision from September.

The SDP business database was last updated on October 12. The COVID19Prepared.org site took down its link to the business database around that time, leaving this note:

Customers and the general public are encouraged to view the list of businesses that have submitted their Revised Social Distancing Protocol to help ensure our community is prepared to operate safely. A listing of businesses that have completed their Revised Social Distancing Protocols is coming soon.

Assuming the SDP site does start updating again soon, it doesn’t make sense to go forward with the current business listing. If for some reason the site doesn’t start updating, we may need to get in touch with TSS to ask for access to the raw dataset.

In the meantime, this delay gives us time to take care of other remaining tasks:

Wrangling the “Other” category

We expect the current database to overlap considerably with the revised database, but there will probably be businesses that spell their names, addresses, or “Other” description slightly differently from one form to another.

Unless the SDP site starts listing the business type description in plain text, we’ll need to scrape the linked PDFs for that information. Over 5,700 entries may be too many to tag by hand, especially if we need to keep tagging more by hand as the database updates. One possible solution might involve training a Bayesian classifier on the text, labeling them with presets.
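
To make the classifier idea concrete, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written with only the standard library. The training pairs and the preset-like labels (`shop=beauty`, `office=lawyer`) are hypothetical; a real run would need hand-labeled descriptions from the “Other” category.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the description and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    def __init__(self):
        self.label_counts = Counter()            # training examples per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return the label with the highest log-posterior for the text."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled examples, for illustration only.
model = NaiveBayes()
model.train([
    ("nail salon", "shop=beauty"),
    ("beauty and nail supply", "shop=beauty"),
    ("law office", "office=lawyer"),
    ("attorney office", "office=lawyer"),
])
```

With so many distinct business types, a classifier like this would probably only suggest a preset for a human to confirm rather than tag anything on its own.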

1ec5 commented 3 years ago

Assuming the SDP site does start updating again soon, it doesn’t make sense to go forward with the current business listing. If for some reason the site doesn’t start updating, we may need to get in touch with TSS to ask for access to the raw dataset.

The SDP site is updating again, with the latest entries from November 5. The site currently lists 21,536 entries across the same categories as before. It no longer includes submissions of the previous revision of the form.

1ec5 commented 3 years ago

Formal import proposal: https://wiki.openstreetmap.org/wiki/Santa_Clara_County,_California/Social_distancing_protocol_import

1ec5 commented 3 years ago

@impiaaa, Lindsay, and I met tonight to discuss the state of the project:

Tier 2?

There are rumors that Santa Clara County may soon move back to Tier 2 (red), as other nearby counties have, just a few weeks after moving to Tier 3 (orange) and right after the SDP website got back up and running. Tier 2 allows only essential businesses to stay open, so we’re unsure what that means for the SDP database: will they keep collecting SDPs from nonessential businesses in anticipation of an eventual transition back to Tier 3, remove submissions from nonessential businesses, or stop updating or advertising the site? I’m hopeful we won’t completely repeat the database reset from last month, because the county hasn’t issued a new public health order ahead of any tier change like last time, and the revised SDP form seems to be tier-agnostic. (It no longer asks the submitter for any hard numbers around capacity.)

Given this uncertainty, we could replace opening_hours:covid19=open with a less direct tag like opening_hours:covid19:conditional=open @ (cdph:tier=3) that would be more resilient to tier changes over the next several months. But that would be complicated if mappers have to copy the suggested tags from MapRoulette instructions. We should avoid making mappers add raw tags in iD if possible, because that can be error-prone, time-consuming, and unfriendly to new mappers.

More likely, we’d avoid making any representations about a business’s opening status. After all, OSM probably already has POIs that haven’t opened since the initial stay-at-home order began. It means we wouldn’t be able to facilitate a COVID-19-specific application, but we’d still accomplish the larger goal of jumpstarting OSM’s POI coverage in the area.

Tooling

The main downside to MapRoulette is that it doesn’t prepopulate the point feature and its tags in iD, since this import requires much more manual intervention than a collaborative mapping challenge. We can make sure iD opens up to maximum zoom level 19, which is good enough to easily distinguish standalone businesses, but it would be pretty ambiguous in a strip mall or downtown area.

RapiD could be a good alternative to MapRoulette for our use case, as long as there’s a way for the user to not only accept a feature but also change its feature type and change its tags before saving. Ideally we could get permission to add our own challenge to this tasking manager instance, then periodically upload GeoJSON data from the SDP database. Otherwise, we’d have to reach out to the RapiD team about loading our data. There is a new Esri ArcGIS integration, but it would be rather indirect for us since our dataset isn’t in ArcGIS to begin with.

Proposal process

We aren’t sure about the county’s status come Tuesday and are still deciding between MapRoulette and RapiD, so we need to wait until at least early next week before posting a request for comments about this import proposal on the imports mailing list.

The proposal needs a few tweaks:

The request for comments needs to emphasize that this is a labor-intensive organized editing project originating from an external dataset, not a conventional automated import, but we’re going to adhere to some of the import guidelines anyways as a courtesy. We won’t ask participants to use dedicated import accounts, because that overhead would discourage participation while not really making the mapped features easier to identify and roll back.

Time and people

The other day, I did some back-of-the-napkin math to estimate how long this import would take:

There are currently 21,536 items on the SDP site, apparently rising daily. Of these entries, 18,008 are in taggable categories (that is, not “Other”), and an unknown number lack a physical address. I’m going to assume all 18k have physical addresses, which can account for growth over the next few weeks. If we manage to attract 10 participants mapping for 10 hours a week (2 hours every weekday) and get the average time to map a business down to 1 minute, we can finish importing what's on the site so far in 3 weeks.

That’s the optimistic scenario.

That “Other” category is 16% of the database and we’ll need to come up w/ a wide variety of tags for the businesses in there. I mean, I’m tempted to write a bot that spams the tagging list every morning w/ a business-of-the-day post. https://saesdp.sccgov.org/sdpdocs/2848313-SocialDistancingProtocolForm.pdf is “ADMINISTRATIVE OFFICE FOR WHOLESALE DISTRIBUTOR OF MOTORCYCLE PARTS”. :dizzy_face:

If we get 10 people to dedicate themselves to nothing but tagging decisions, we could take care of those 3,528 “Other” businesses in 6 hours.
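
The arithmetic behind these estimates can be sketched as follows. The counts come from the figures above; the one-minute-per-decision rate for the “Other” category is only implied by the 6-hour figure, and the function names are made up for illustration.

```python
def weeks_to_finish(tasks, mappers, hours_per_week, minutes_per_task):
    """Weeks needed if every mapper keeps up the same pace."""
    tasks_per_week = mappers * hours_per_week * 60 / minutes_per_task
    return tasks / tasks_per_week


def hours_to_finish(tasks, people, minutes_per_task):
    """Hours needed per person when the work is split evenly."""
    return tasks * minutes_per_task / 60 / people


total_entries = 21_536
taggable = 18_008
other = total_entries - taggable            # entries in the "Other" category
other_share = other / total_entries * 100   # percentage of the database

mapping_weeks = weeks_to_finish(taggable, mappers=10, hours_per_week=10, minutes_per_task=1)
tagging_hours = hours_to_finish(other, people=10, minutes_per_task=1)

print(f"{other} Other entries ({other_share:.0f}% of the database)")
print(f"Mapping: about {mapping_weeks:.1f} weeks; tagging decisions: about {tagging_hours:.1f} hours")
```

Both results scale linearly with the time per task, so doubling the average time per business doubles the estimate.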

To get the average time per task down to a minute, we can encourage mappers to only map the businesses as point features and not areas. As much as possible, we’re trying to avoid making mappers trawl through street-level imagery, but it might occasionally be necessary to choose the right unit in a strip mall or avoid mapping a home office. Focusing each challenge on a single category and providing crisp instructions will go a long way too.

To get the necessary level of participation, we’ll recruit mappers among Code for San José volunteers who haven’t been attending the OSM map nights. We’ll also recruit among the broader OSM community. As far as I can tell, this import will be just the third POI import in the U.S., after the nationwide GNIS import and a POI import in Puerto Rico. I’m hopeful that the import’s novelty will attract non-local mappers who wouldn’t be interested in a run-of-the-mill building import.

I had originally calculated the required time thinking that we’d try to complete the import before the county leaves Tier 3 and the SDP database gets reset again. But the possibility of going back to Tier 2 so soon changes the calculus: if we don’t map anything time-sensitive like opening_hours:covid19 and capacity, then it doesn’t matter what tier we’re in.

1ec5 commented 3 years ago

The county moved to Tier 1 (purple) today. This poster explains the impact on social distancing protocols:

Social Distancing Protocol requirements: All businesses must complete and submit a Revised Social Distancing Protocol for each of their facilities on the County’s website at COVID19Prepared.org. Social Distancing Protocols submitted prior to October 11, 2020 are no longer valid. The Revised Social Distancing Protocols must be filled out using an updated template for the Social Distancing Protocol at COVID19Prepared.org.

SDPs prior to October 11 have already been removed from the SDP site. This wording makes it sound unlikely that the SDP site would be taken offline, but it means today is probably the high water mark for the site in terms of new submissions.

1ec5 commented 3 years ago

@impiaaa and I found a reliable way to grab the “Other, please specify” business type description from each PDF’s headers:

$ curl -sI https://saesdp.sccgov.org/sdpdocs/2841699-SocialDistancingProtocolForm.pdf | grep -i 'x-ms-meta-typeofbusinessother' | sed 's/^.*: //' | tr -d '\r' | base64 --decode
Nail supply

This could save us the trouble and time of downloading the whole PDF for the “Other, please specify” category. However, we were also looking to have mappers consult the “Facility/Worksite visited by public” checkbox in the PDF to avoid mapping businesses that aren’t open to the public. It is possible to extract this information from the PDF automatically, but to avoid excessive requests and processing time, perhaps we could limit it to certain categories we’re particularly concerned about (like professional services, but not restaurants).
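
For use inside a scraper, the same header lookup could be sketched in Python roughly like this. It assumes, as the shell pipeline does, that the header value is Base64-encoded UTF-8 text; the function names are hypothetical.

```python
import base64
from urllib.request import Request, urlopen

HEADER = "x-ms-meta-typeofbusinessother"


def decode_business_type(headers):
    """Decode the "Other" description from a mapping of header names to values."""
    for name, value in headers.items():
        if name.lower() == HEADER:
            # Assumption: the value is Base64-encoded UTF-8 text.
            return base64.b64decode(value.strip()).decode("utf-8")
    return None


def fetch_business_type(pdf_url):
    """Send a HEAD request so the PDF body is never downloaded."""
    request = Request(pdf_url, method="HEAD")
    with urlopen(request) as response:
        return decode_business_type(dict(response.headers.items()))
```

Entries in other categories would presumably lack the header, in which case `decode_business_type` returns `None`.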

1ec5 commented 3 years ago

challenge_geojson.zip as of November 16
Business type descriptions as of November 16

1ec5 commented 3 years ago

Some outstanding tasks, in no particular order:

The more I spot-check the SDPs we’ve downloaded, the less confidence I have in the “Facility/Worksite visited by public” checkbox. Even if it’s accurate, there are plenty of cases where “No” is an appropriate response for a non-retail site that nonetheless should be mapped. At most, it would be just one signal alongside the reference zoning polygons, but that makes parsing the downloaded PDFs a lower priority.

1ec5 commented 3 years ago

I sent a request for comments to the talk-us-sfbay, imports-us, and imports mailing lists. (It’s probably stuck in the imports list’s moderation queue.) I also mentioned the request for comments in the #imports channel of OSMUS Slack. We can continue to refine the proposal on the wiki in the coming days based on feedback that we receive. I’m hoping we can move forward in about a week’s time, in time to do some armchair mapping over the Thanksgiving weekend. Thanks to @impiaaa and Lindsay for workshopping the request for comments this evening.

1ec5 commented 3 years ago

The MapRoulette project is now live with an initial batch of 49 challenges. Challenges with 500 or more tasks are hidden for now until we get a chance to see how smoothly we can get through the smaller challenges.

mr_task

1ec5 commented 3 years ago

Wednesday night, @frhino invited me to present the import at Code for San Francisco’s general hack night meeting. CfSF has been spearheading the Bay Area Brigades’ COVID-19 pandemic dashboard project. This import can complement the dashboard as another area for cross-bay collaboration.

Josh graciously offered to pair on the MapRoulette workflow before sharing it with the rest of the brigade. Unluckily, we ran into the Recreation challenge, which turns out to be mostly composed of nondescript offices of recreation organizations. I’ve changed that challenge’s difficulty level to Expert to steer new mappers away from it.

1ec5 commented 3 years ago

Last night, @impiaaa, Kevin, Lindsay, and I met to take stock of the import a week into it:

Promotion

With help from some friends and acquaintances, we’ve been spreading the word about the import in various places, including but not limited to:

As time goes on, we’ll have to keep being creative and possibly revisit some of these communication channels to keep up the momentum.

Progress

For the first week of the import, we had enabled only the 49 smaller challenges in case any adverse feedback came through the mailing lists. Measuring progress is a bit tricky because MapRoulette normally excludes both completed and undiscoverable challenges, so it was showing the project as 5% completed. Including both completed and undiscoverable challenges, we were at a little over 200 of 17,441 tasks, or 1%.

Even several days after weeklyOSM mentioned the import proposal, no feedback came in, so we’re more or less in the clear as far as the import guidelines are concerned. After the meeting, we enabled the remaining challenges except for the Construction challenge. That brings our progress back down to 1%, but it’s more accurate that way, and hopefully people will find the new categories like Restaurant and Retail to be more interesting to map.

Hiccups

The Construction challenge remains undiscoverable, because most of the submissions in that challenge appear to be minor work sites (like reconfiguring interior walls at an office building), not the sort of thing we’d map as construction in OSM.

Lindsay got unlucky working on the Grocery Stores and Pharmacy challenges due to poor geocoding or inadequate street-level imagery resolution. We ended up changing the difficulty level of the Grocery Stores challenge to Expert due to the prevalence of these issues. The Pharmacy challenge was already well on its way to completion, so Lindsay finished the job, other than a couple extra-tough cases.

@impiaaa and I differ on what to do about businesses in strip malls or office buildings, where it isn’t immediately feasible to determine which corner of the building the business occupies. We could either mark such businesses as Too Hard for now and wait to survey them in person, or we could place a point randomly within the building, perhaps with a fixme tag to indicate an approximate location. We’ll have more concrete cases to consider as people dive into the newly discoverable challenges, but it’s possible that our approach could depend on the situation: Too Hard for a strip mall with per-store entrances but a random point in the building for an office building with a central entrance.

Time management

MapRoulette currently reports an average time per task of 6 minutes, 14 seconds. That’s far, far above the back-of-the-napkin assumptions in https://github.com/codeforsanjose/OSM-SouthBay/issues/23#issuecomment-726568118. However, this metric includes situations where a mapper has gotten carried away doing legitimate mapping around the POI, as well as when a mapper forgets to unlock a task after getting distracted by something else. The average has been trending down, so it also probably reflects some initial feeling-around as we got used to the workflow. We’ll keep an eye on the metric, but the most important thing at this point is to bring more contributors into the project.

1ec5 commented 3 years ago

On Friday, we figured out why many of the addresses got geocoded way out in Sacramento County or San Benito County (example): the Pelias instance was getting confused by Santa Clara (city) and Santa Clara County sharing the same name. It’s similarly very difficult to search for addresses along El Camino Real in Santa Clara (city) in Nominatim. @impiaaa fixed the issue in Pelias by renaming the county from Santa Clara to Santa Clara County in the Who’s On First file and loading OpenAddresses.
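
A hedged sketch of what such a rename could look like on a GeoJSON-style record follows. The `wof:name` and `wof:placetype` property names are an assumption about the Who’s On First record layout, and the actual fix may have been made by hand or through a different field.

```python
import copy


def disambiguate_county(feature, city_names):
    """Append " County" to a county record whose name collides with a city name.

    Returns a modified copy; records that need no change are returned as-is.
    """
    props = feature.get("properties", {})
    name = props.get("wof:name", "")
    is_county = props.get("wof:placetype") == "county"
    if is_county and name in city_names and not name.endswith(" County"):
        renamed = copy.deepcopy(feature)
        renamed["properties"]["wof:name"] = name + " County"
        return renamed
    return feature


# Hypothetical record for illustration; real records carry many more properties.
county = {"type": "Feature", "properties": {"wof:name": "Santa Clara", "wof:placetype": "county"}}
fixed = disambiguate_county(county, city_names={"Santa Clara"})
```

The original record is left untouched, so the change can be compared before reimporting.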

On Saturday, @impiaaa rescraped the site and reuploaded all the tasks. We’re up to 23,004 tasks total.

1ec5 commented 3 years ago

We don’t want mappers to map home offices and the like. That can be difficult to tell from the protocol form, but we’re hoping that the “No physical address” designation is a good determiner. … The more I spot-check the SDPs we’ve downloaded, the less confidence I have in the “Facility/Worksite visited by public” checkbox. Even if it’s accurate, there are plenty of cases where “No” is an appropriate response for a non-retail site that nonetheless should be mapped. At most, it would be just one signal alongside the reference zoning polygons, but that makes parsing the downloaded PDFs a lower priority.

The “visited by public” checkbox sometimes helps, but it’s pretty unreliable because business owners are also unclear on its meaning. We’ve only mapped about 2% of the SDPs so far, but we’ve already encountered plenty of cases that have forced us to consider the privacy of private residences:

I think our decisions so far are roughly in line with the OSM community consensus as expressed by this summary. Protecting privacy is important to us, as is on-the-ground verifiability to some extent. When in doubt, we’ve deferred the task for later review. Depending on the circumstances, we may want to contact some of these businesses to determine their expectations around being listed.

1ec5 commented 3 years ago

As of December 21, we reached 4% across all challenges, including 43% of high-priority tasks, 7% of medium-priority tasks, and 2% of low-priority tasks:

Screenshot-2020-12-21 MapRoulette

There have been cases where both the SDP and the sign outside the building had the wrong address.

On December 24, I added a section to the detailed instructions document explaining how to configure iD to show the Santa Clara County parcel layer as a background layer to more easily associate addresses in SDPs with buildings in OSM:

https://webgis.sccgov.org/gis/rest/services/property/SCCProperty2/MapServer/export?bbox={bbox}&bboxSR={proj}&size={width},{height}&format=png&transparent=false&f=image
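
To show how the placeholders in this URL get filled in, here is a rough sketch that computes the Web Mercator bounding box of a map tile and substitutes it into the template. The 256-pixel tile size and the `EPSG:3857` value for `{proj}` are assumptions about what the editor substitutes, not something we have verified against the county’s server.

```python
import math

TEMPLATE = (
    "https://webgis.sccgov.org/gis/rest/services/property/SCCProperty2/"
    "MapServer/export?bbox={bbox}&bboxSR={proj}&size={width},{height}"
    "&format=png&transparent=false&f=image"
)

HALF_WORLD = math.pi * 6378137  # half the Web Mercator extent, in meters


def tile_bbox(z, x, y):
    """Web Mercator bounds of an XYZ tile as (minx, miny, maxx, maxy)."""
    size = 2 * HALF_WORLD / 2 ** z
    minx = -HALF_WORLD + x * size
    maxy = HALF_WORLD - y * size
    return (minx, maxy - size, minx + size, maxy)


def fill_template(template, z, x, y, tile_px=256, proj="EPSG:3857"):
    """Substitute the {bbox}, {proj}, {width}, and {height} placeholders."""
    bbox = ",".join(f"{v:.2f}" for v in tile_bbox(z, x, y))
    return (
        template.replace("{bbox}", bbox)
        .replace("{proj}", proj)
        .replace("{width}", str(tile_px))
        .replace("{height}", str(tile_px))
    )
```

Calling `fill_template(TEMPLATE, z, x, y)` for a tile over the county yields a single image request that can be opened in a browser to check the layer.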

We finished half the Religious Institutions challenge by December 27 and finished the nursing home challenge on January 2 (thanks Will!). Camille and @sutter-dave joined us on January 7 to help with the POI import and introduce us to Apogee as a possible tool for future imports.

As of January 12, we finished two-thirds of the high-priority tasks, enough for the time series chart to show some movement:

Screenshot-2021-1-12 MapRoulette

We finished half of the laundromat/dry cleaning challenge by January 15. Unfortunately, around this time we discovered that a MapRoulette user unfamiliar with OSM editing had been completing tasks incorrectly; their edits had to be reverted and 12 tasks reset in the banks challenge.

On January 22, @impiaaa reran the scraper, pulling in lots of new tasks that set our completion rate back to less than 4%. On the bright side, the update brought in improvements to geocoding, due in part to the new addresses we’ve been adding as part of the POI import. Additionally, the priorities have changed so that outlying, typically poorly geocoded tasks no longer stubbornly show up any time you try to get a random task.

We refinished the pharmacies challenge on January 25 and got gas stations back up to halfway on January 31. As of February 3, we’re about 6% complete, having fully recovered from the latest update from the SDP website.

1ec5 commented 3 years ago

This import is one of the more extensive projects on the MapRoulette platform. The site has been serving us well, but certain things like gathering statistics do take a bit longer, understandable considering the large number of challenges and the sheer size of some of those challenges.

Unfortunately, MapRoulette has been experiencing performance problems and the team is considering making some changes that will adversely impact the import. osmlab/maproulette3#1536 would limit the number of challenges per project, and osmlab/maproulette3#1535 would limit the number of tasks per challenge. Taken together, these changes would force us to split the import project into several projects, possibly arbitrarily, making it more difficult for us to gauge our progress, attract and onboard new mappers, ensure equitable coverage throughout the county, and manage synchronization with the SDP database.

If these changes go into effect as planned, we may need to consider an alternative platform for the import. We don’t have great options. to-fix is unmaintained, Sophox Editor is offline, the OSMUS Tasking Manager is ill-suited to microtasking, and RapiD only integrates with ArcGIS services (which would make rescrapes impractical).

If we stick with MapRoulette, adhering to the new caps would mean splitting apart the larger challenges like Retail and Other into dozens of challenges. How would we split the challenges? If we split them by ZIP code or city, certain areas will inevitably enjoy more attention than others. But anything more arbitrary would prevent us from consolidating tasks into bulk changesets.

mvexel commented 3 years ago

The caps are still being discussed (see linked tickets above) and having community input like yours is very valuable to us. We do need to strike a balance between performance and flexibility, and are trying to determine what that right balance is.

I would hate for y'all to move away from MapRoulette because of this, if you find the platform otherwise useful. I'll have a chat with @1ec5 to learn more about the way you use MapRoulette.

1ec5 commented 3 years ago

Thanks so much for reaching out, @mvexel! MapRoulette has been key to this import project – https://github.com/codeforsanjose/OSM-SouthBay/issues/23#issuecomment-774246371 shows that there’s really no alternative that matches MapRoulette in ease of use when the data source can update dynamically. From the looks of it, any new limit to the number of tasks per challenge or the number of challenges per project would comfortably accommodate this import’s project, so we should be in good shape.

1ec5 commented 3 years ago

Lots more has happened since the last time I updated this issue:

Timeline

The laundry/dry cleaner challenge returned to 50% on February 7.

By February 15, we reached 7% overall:

2021-02-15

On February 24, we retagged all the Fry’s locations in the county, including all the locations that had filed SDPs, as shop=vacant plus disused:shop=electronics after the chain closed.

On February 25, @impiaaa overtook me on the leaderboard to claim first place:

leaderboard

On March 1, we completed the maintenance services challenge.

On March 3, the county moved back to Tier 2 (red).

On March 4, we reran the scraper incrementally. We remained at 7%:

2021-03-04

The childcare challenge reached 50% complete on March 17. As seen here on March 15, when we had reached 40% complete and 25% fixed, childcare and kindergarten facilities are much more evenly distributed throughout the county compared to before:

childcare before childcare 2021-03-15

As of today, we’ve completed 10% of the entire import:

2021-03-24

Also, @impiaaa and I submitted a joint talk proposal for State of the Map 2021 about this import. 🤞

Address bloopers

Some examples we’ve seen of SDP addresses that threw off the geocoder:

| Address in SDP | Actual address | Distance |
| --- | --- | --- |
| 2650A Walsh Ave, Santa Clara, CA 95051 | 2350 Walsh Avenue Unit A, Santa Clara, CA 95051 | 584.9 ft |
| 6477 Almaden Rd., San Jose CA 95120 | 6477 Almaden Expressway, San Jose, CA, 95120 | 0.789 mi |
| 470 Jackson Ave., San Jose, CA 95112 | 470 Jackson Street, San Jose, CA 95112 | 2.265 mi |
| 2302 MONTEREY RD, San Jose, CA 95111 | 5302 Monterey Road, San Jose, CA 95111 | 4.047 mi |
| 6477 Almaden Rd, San Jose, CA 95120 | 6477 Almaden Expressway, San Jose, CA 95120 | 7.006 mi |
| 2133 Morrill, San Jose, CA 95132 | 2133 Morrill Avenue, San Jose, CA 95132 | 9.194 mi |
| 4849 San Felipe Rd. Unit 140, San Jose, CA 95135 | 4898 San Felipe Road Unit 140, San Jose, CA 95135 | 19.51 mi |
| 4134 Fairway Dr., Sequel, CA 95073 | 4134 Fairway Drive, Soquel, CA 95073 | 148.2 mi |

Tips for mappers

Further afield

1ec5 commented 3 years ago

We have a new 3rd-place mapper:

Screenshot-2021-3-26 MapRoulette

1ec5 commented 3 years ago

Based on all the buildings and POIs we’ve been importing, I’d imagine StreetComplete is filling up with many more challenges in building- and POI-related quests these days. It must be a lot more fun to use now that not all of the quests ask you for the surface of an obviously paved street. It’ll be interesting to check back a couple months from now and see if there’s been an uptick in edits made using StreetComplete. This OSMCha filter kinda-sorta tracks such edits, though the county boundary would need to be refined quite a bit.

So far, it looks like we’ve gotten improvements to names, addresses, and opening hours from StreetComplete users. StreetComplete doesn’t ask about some things that are often missing from the SDPs we’re importing, such as cuisine (streetcomplete/StreetComplete#103), medical specialty (streetcomplete/StreetComplete#1020), and religious denomination (streetcomplete/StreetComplete#1737).

1ec5 commented 3 years ago
1ec5 commented 3 years ago

Task completion milestones:

Progress on May 28

Other notable events:

1ec5 commented 3 years ago

OSM POI coverage compared to SCCPHD and Census Bureau data by tract and ZIP code

OpenStreetMap POI coverage in Santa Clara County by tract OpenStreetMap POI coverage in Santa Clara County by ZIP code

1ec5 commented 3 years ago

We have a new 3rd-place mapper!

leaderboard

As of June 10: 15% complete overall, including 30% of high-priority tasks

Current progress:

progress

Some tidbits:

1ec5 commented 3 years ago

I submitted a poster to the State of the Map 2021 poster competition:

Mapping POIs in Santa Clara County.pdf

1ec5 commented 11 months ago

OSM POI coverage compared to SCCPHD and Census Bureau data by tract and ZIP code

I’ve uploaded some of the files I used to create this report to the osm-southbay-poi-coverage repository.

1ec5 commented 11 months ago

An updated report as of August 5, 2023:

OpenStreetMap POI coverage in Santa Clara County August 2023.pdf

POIs by census tract versus population density

POIs by census tract versus median household income

POIs by census tract versus share of nonwhite or Hispanic/Latino residents

POIs by ZIP code