A Python library for accessing SafeGraph data through SafeGraph's GraphQL API.
Please see the SafeGraph API documentation for further information on GraphQL, available datasets, query types, use cases, and FAQs.
Please file issues on this repository for bugs or feature requests specific to this Python client. For bugs or feature requests related to the SafeGraph API itself, please contact [...]
Core Places: Base information such as location name, category, and brand association for points of interest (POIs) where consumers spend time or business operations take place. Available for ~9.9MM POI including permanently closed POIs.
Geometry: POI footprints with spatial hierarchy metadata depicting when child polygons are contained by parents or when two tenants share the same polygon. Available for ~9.2MM POI (Geometry metadata not provided for closed POIs).
Patterns: Place, traffic, and demographic aggregations that answer: how often people visit, how long they stay, where they came from, where else they go, and more. Available for ~4.5MM POI in weekly and monthly versions. Historical data dating back to January 2018 is available via the API for the weekly version of Patterns only. For the monthly version of Patterns, only the most recent month is available via the API.
pip install safegraphQL
Get an API key from the SafeGraph Shop and instantiate the client.
from safegraphql import client
sgql_client = client.HTTP_Client("MY_API_KEY")
Use the sgql_client object to make requests!
By default, query functions in safegraphQL return pandas DataFrames. See the return_type parameter below for how to return a JSON response object instead.
lookup()
Query all Core Places columns for a single Placekey.
pk = 'zzw-222@8fy-fjg-b8v' # Disney World
core = sgql_client.lookup(product = 'core', placekeys = pk, columns = '*')
core
placekey | parent_placekey | location_name | safegraph_brand_ids | brands | top_category | sub_category | naics_code | latitude | longitude | street_address | city | region | postal_code | iso_country_code | phone_number | open_hours | category_tags | opened_on | closed_on | tracking_closed_since | geometry_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 222-223@65y-rxx-djv | 224-225@65y-rxx-dgk | Walmart Supercenter | ['SG_BRAND_04a8ca7bf49e7ecb4a32451676e929f0'] | [{'brand_id': 'SG_BRAND_04a8ca7bf49e7ecb4a32451676e929f0', 'brand_name': 'Walmart Supercenter Canada'}] | General Merchandise Stores, including Warehouse Clubs and Supercenters | All Other General Merchandise Stores | 452319 | 42.6947 | -73.847 | 141 Washington Avenue Ext | Albany | NY | 12205 | US | { "Mon": [["5:00", "23:00"]], "Tue": [["5:00", "23:00"]], "Wed": [["5:00", "23:00"]], "Thu": [["5:00", "23:00"]], "Fri": [["5:00", "23:00"]], "Sat": [["5:00", "23:00"]], "Sun": [["5:00", "23:00"]] } | [] | 2019-07-01 | POLYGON |
You can do the same for Geometry and Monthly Patterns.
geo = sgql_client.lookup(product = 'geometry', placekeys = pk, columns = '*')
patterns = sgql_client.lookup(product = 'monthly_patterns', placekeys = pk, columns = '*')
Query the most recent Weekly Patterns data.
watterns = sgql_client.lookup(product = 'weekly_patterns', placekeys = pk, columns = '*')
Query an arbitrary set of columns from a dataset.
# requested columns must all come from the same dataset
cols = [
'placekey',
'location_name',
'street_address',
'city',
'region',
'brands',
'top_category',
'sub_category',
'naics_code'
]
sgql_client.lookup(product = 'core', placekeys = pk, columns = cols)
placekey | location_name | brands | top_category | sub_category | naics_code | street_address | city | region | |
---|---|---|---|---|---|---|---|---|---|
0 | 222-223@65y-rxx-djv | Walmart Supercenter | [{'brand_id': 'SG_BRAND_04a8ca7bf49e7ecb4a32451676e929f0', 'brand_name': 'Walmart Supercenter Canada'}] | General Merchandise Stores, including Warehouse Clubs and Supercenters | All Other General Merchandise Stores | 452319 | 141 Washington Avenue Ext | Albany | NY |
You can perform any of the previous queries on a set of multiple Placekeys.
pks = [
'zzw-222@8fy-fjg-b8v', # Disney World
'zzw-222@5z6-3h9-tsq' # LAX
]
sgql_client.lookup(
product = 'core',
placekeys = pks,
columns = cols
)
placekey | location_name | brands | top_category | sub_category | naics_code | street_address | city | region | |
---|---|---|---|---|---|---|---|---|---|
0 | zzw-222@5z6-3h9-tsq | Los Angeles International Airport | [] | Support Activities for Air Transportation | Other Airport Operations | 488119 | 1 World Way | El Segundo | CA |
1 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | [] | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | Walt Disney World Resort | Orlando | FL |
return_type
By default, query functions in safegraphQL return pandas DataFrames. By setting return_type = 'list', you can return the JSON response object instead.
# core columns only
sgql_client.lookup(product = 'core', placekeys = pk, columns = '*', return_type = 'list')
---
[{'placekey': 'zzw-222@8fy-fjg-b8v',
'parent_placekey': None,
'location_name': 'Walt Disney World Resort',
'safegraph_brand_ids': [],
'brands': [],
'top_category': 'Amusement Parks and Arcades',
'sub_category': 'Amusement and Theme Parks',
'naics_code': 713110,
'latitude': 28.388228,
'longitude': -81.567304,
'street_address': 'Walt Disney World Resort',
'city': 'Orlando',
'region': 'FL',
'postal_code': '32830',
'iso_country_code': 'US',
'phone_number': None,
'open_hours': '{ "Mon": [["8:00", "23:00"]], "Tue": [["8:00", "23:00"]], "Wed": [["8:00", "23:00"]], "Thu": [["8:00", "23:00"]], "Fri": [["8:00", "23:00"]], "Sat": [["8:00", "23:00"]], "Sun": [["8:00", "23:00"]] }',
'category_tags': [],
'opened_on': None,
'closed_on': None,
'tracking_closed_since': '2019-07-01',
'geometry_type': 'POLYGON'}]
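In the list response above, open_hours arrives as a JSON-encoded string rather than a nested object. A minimal sketch of decoding it with the standard library follows; the record is trimmed from the output above, and the variable names are illustrative only.

```python
import json

# A trimmed copy of the record shown above (only two fields kept).
record = {
    'placekey': 'zzw-222@8fy-fjg-b8v',
    'open_hours': '{ "Mon": [["8:00", "23:00"]], "Tue": [["8:00", "23:00"]] }',
}

# open_hours is a string, so decode it before indexing by day.
# Fall back to an empty dict in case the field is None or empty.
hours = json.loads(record['open_hours']) if record['open_hours'] else {}

for day, intervals in hours.items():
    for opens, closes in intervals:
        print(f"{day}: {opens} to {closes}")
```

The fallback is a defensive assumption: other fields in the same record are None, so open_hours may be as well.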
save()
Export the most recently queried result. If the previous result was a pandas DataFrame, the saved file will be a .csv. If the result was a JSON response object, the saved file will be a .json. The default path for the exported file is results.csv or results.json, respectively.
# saved file will be results.csv
sgql_client.lookup(product = 'core', placekeys = pk, columns = '*')
sgql_client.save()
# saved file will be results.json
sgql_client.lookup(product = 'core', placekeys = pk, columns = '*', return_type = 'list')
sgql_client.save()
# saved file will be safegraph_data.csv
sgql_client.lookup(product = 'core', placekeys = pk, columns = '*')
sgql_client.save(path = 'safegraph_data.csv')
sg_merge()
Merge safegraphQL query results with sg_merge().
core = sgql_client.lookup(product = 'core', placekeys = pks, columns = ['placekey', 'location_name', 'naics_code', 'top_category', 'sub_category'])
geo = sgql_client.lookup(product = 'geometry', placekeys = pks, columns = ['placekey', 'polygon_class', 'enclosed'])
merge_set = [core, geo]
merged = sgql_client.sg_merge(datasets = merge_set)
placekey | location_name | top_category | sub_category | naics_code | polygon_class | enclosed | |
---|---|---|---|---|---|---|---|
0 | zzw-222@5z6-3h9-tsq | Los Angeles International Airport | Support Activities for Air Transportation | Other Airport Operations | 488119 | OWNED_POLYGON | False |
1 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False |
sg_merge() works for JSON response objects as well.
core = sgql_client.lookup(product = 'core', placekeys = pks, columns = ['placekey', 'location_name', 'naics_code', 'top_category', 'sub_category'], return_type = 'list')
geo = sgql_client.lookup(product = 'geometry', placekeys = pks, columns = ['placekey', 'polygon_class', 'enclosed'], return_type = 'list')
merge_set = [core, geo]
merged = sgql_client.sg_merge(datasets = merge_set)
---
[{'placekey': 'zzw-222@5z6-3h9-tsq',
'location_name': 'Los Angeles International Airport',
'top_category': 'Support Activities for Air Transportation',
'sub_category': 'Other Airport Operations',
'naics_code': 488119,
'polygon_class': 'OWNED_POLYGON',
'enclosed': False},
{'placekey': 'zzw-222@8fy-fjg-b8v',
'location_name': 'Walt Disney World Resort',
'top_category': 'Amusement Parks and Arcades',
'sub_category': 'Amusement and Theme Parks',
'naics_code': 713110,
'polygon_class': 'OWNED_POLYGON',
'enclosed': False}]
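Conceptually, the merge combines records that share a placekey. The sketch below illustrates that idea for list results in plain Python. merge_on_placekey is a hypothetical helper written for this illustration, not the library's implementation; it assumes one record per Placekey in each dataset and keeps only Placekeys present in every dataset.

```python
def merge_on_placekey(datasets):
    """Combine lists of records that share a 'placekey' (illustration only)."""
    merged = {}
    for index, records in enumerate(datasets):
        seen = set()
        for rec in records:
            pk = rec['placekey']
            seen.add(pk)
            if index == 0:
                merged[pk] = dict(rec)   # start from the first dataset
            elif pk in merged:
                merged[pk].update(rec)   # add columns from later datasets
        # Drop Placekeys that this dataset did not contain.
        for pk in list(merged):
            if pk not in seen:
                del merged[pk]
    return list(merged.values())

core = [
    {'placekey': 'zzw-222@5z6-3h9-tsq', 'location_name': 'Los Angeles International Airport'},
    {'placekey': 'zzw-222@8fy-fjg-b8v', 'location_name': 'Walt Disney World Resort'},
]
geo = [
    {'placekey': 'zzw-222@8fy-fjg-b8v', 'polygon_class': 'OWNED_POLYGON', 'enclosed': False},
    {'placekey': 'zzw-222@5z6-3h9-tsq', 'polygon_class': 'OWNED_POLYGON', 'enclosed': False},
]
merged = merge_on_placekey([core, geo])
```

The order of the first dataset is preserved, so the records do not need to arrive in the same order in every dataset.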
Use lookup() to query Weekly Patterns data for a Placekey from a particular date (YYYY-MM-DD format).
date = '2019-06-15'
sgql_client.lookup(
product = 'weekly_patterns',
placekeys = pk,
date = date,
columns = ['placekey', 'location_name', 'date_range_start', 'date_range_end', 'raw_visit_counts']
)
placekey | location_name | date_range_start | date_range_end | raw_visit_counts | |
---|---|---|---|---|---|
0 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-06-10T00:00:00-04:00 | 2019-06-17T00:00:00-04:00 | 242530 |
Pass a list of dates to query multiple Weekly Patterns releases. Note that if two dates fall within the same release (e.g. 2019-06-15 and 2019-06-16 below), the data for the relevant week will only be returned once.
# notice the dates list contains 4 elements, but only 3 rows of data are returned
dates = ['2019-06-15', '2019-06-16', '2021-05-23', '2018-10-23']
sgql_client.lookup(
product = 'weekly_patterns',
placekeys = pk,
date = dates,
columns = ['placekey', 'location_name', 'date_range_start', 'date_range_end', 'raw_visit_counts']
)
placekey | location_name | date_range_start | date_range_end | raw_visit_counts | |
---|---|---|---|---|---|
0 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2018-10-22T00:00:00-04:00 | 2018-10-29T00:00:00-04:00 | 169884 |
1 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-06-10T00:00:00-04:00 | 2019-06-17T00:00:00-04:00 | 242530 |
2 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2021-05-17T00:00:00-04:00 | 2021-05-24T00:00:00-04:00 | 323187 |
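The de-duplication follows from how dates map to releases. In the output above, every date_range_start falls on a Monday, so the sketch below assumes each release runs from one Monday to the next and maps every requested date to the Monday that starts its week. week_start is a hypothetical helper, not part of safegraphQL.

```python
from datetime import date, timedelta

def week_start(day_string):
    """Return the Monday on or before a YYYY-MM-DD date."""
    day = date.fromisoformat(day_string)
    # weekday() is 0 for Monday, so subtracting it lands on Monday.
    return day - timedelta(days=day.weekday())

dates = ['2019-06-15', '2019-06-16', '2021-05-23', '2018-10-23']

# A set collapses dates that share a week; sort for a stable order.
weeks = sorted({week_start(d) for d in dates})
for w in weeks:
    print(w.isoformat())
```

Four requested dates collapse to three distinct weeks, which matches the three rows returned.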
Pass a Python dictionary with date_range_start and date_range_end key/value pairs to query a range of Weekly Patterns releases.
dates = {'date_range_start': '2019-04-10', 'date_range_end': '2019-06-05'}
sgql_client.lookup(
product = 'weekly_patterns',
placekeys = pk,
date = dates,
columns = ['placekey', 'location_name', 'date_range_start', 'date_range_end', 'raw_visit_counts']
)
placekey | location_name | date_range_start | date_range_end | raw_visit_counts | |
---|---|---|---|---|---|
0 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-04-15T00:00:00-04:00 | 2019-04-22T00:00:00-04:00 | 249559 |
1 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-04-22T00:00:00-04:00 | 2019-04-29T00:00:00-04:00 | 248989 |
2 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-04-29T00:00:00-04:00 | 2019-05-06T00:00:00-04:00 | 263878 |
3 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-05-06T00:00:00-04:00 | 2019-05-13T00:00:00-04:00 | 247846 |
4 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-05-13T00:00:00-04:00 | 2019-05-20T00:00:00-04:00 | 223901 |
5 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-05-20T00:00:00-04:00 | 2019-05-27T00:00:00-04:00 | 212718 |
6 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-05-27T00:00:00-04:00 | 2019-06-03T00:00:00-04:00 | 236622 |
7 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | 2019-06-03T00:00:00-04:00 | 2019-06-10T00:00:00-04:00 | 239621 |
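Judging from the table above, the range appears to select releases whose start date falls between the two bounds. Under that assumption (inferred from this one example, not confirmed by the API documentation), the expected release weeks can be listed ahead of time. mondays_between is a hypothetical helper.

```python
from datetime import date, timedelta

def mondays_between(start_string, end_string):
    """List every Monday from start to end, both inclusive."""
    start = date.fromisoformat(start_string)
    end = date.fromisoformat(end_string)
    # Move forward to the first Monday on or after the start date.
    current = start + timedelta(days=(7 - start.weekday()) % 7)
    result = []
    while current <= end:
        result.append(current)
        current += timedelta(days=7)
    return result

release_starts = mondays_between('2019-04-10', '2019-06-05')
for monday in release_starts:
    print(monday.isoformat())
```

A list like this can serve as a quick check that a range query returned one row per expected week.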
You can also combine the results with Core Places and Geometry using sg_merge().
dates = {'date_range_start': '2019-04-10', 'date_range_end': '2019-06-05'}
watterns = sgql_client.lookup(
product = 'weekly_patterns',
placekeys = pk,
date = dates,
columns = ['placekey', 'location_name', 'date_range_start', 'date_range_end', 'raw_visit_counts']
)
core = sgql_client.lookup(product = 'core', placekeys = pk, columns = ['placekey', 'location_name', 'naics_code', 'top_category', 'sub_category'])
geo = sgql_client.lookup(product = 'geometry', placekeys = pk, columns = ['placekey', 'polygon_class', 'enclosed'])
merged = sgql_client.sg_merge(datasets = [core, geo, watterns])
placekey | location_name | top_category | sub_category | naics_code | polygon_class | enclosed | date_range_start | date_range_end | raw_visit_counts | |
---|---|---|---|---|---|---|---|---|---|---|
0 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-04-15T00:00:00-04:00 | 2019-04-22T00:00:00-04:00 | 249559 |
1 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-04-22T00:00:00-04:00 | 2019-04-29T00:00:00-04:00 | 248989 |
2 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-04-29T00:00:00-04:00 | 2019-05-06T00:00:00-04:00 | 263878 |
3 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-05-06T00:00:00-04:00 | 2019-05-13T00:00:00-04:00 | 247846 |
4 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-05-13T00:00:00-04:00 | 2019-05-20T00:00:00-04:00 | 223901 |
5 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-05-20T00:00:00-04:00 | 2019-05-27T00:00:00-04:00 | 212718 |
6 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-05-27T00:00:00-04:00 | 2019-06-03T00:00:00-04:00 | 236622 |
7 | zzw-222@8fy-fjg-b8v | Walt Disney World Resort | Amusement Parks and Arcades | Amusement and Theme Parks | 713110 | OWNED_POLYGON | False | 2019-06-03T00:00:00-04:00 | 2019-06-10T00:00:00-04:00 | 239621 |
lookup_by_name()
If you don't know a location's Placekey, you can look it up by name. Use this function to look up one particular location; to find multiple matching locations, use the search() function described below.
Note: When querying by location name and address, you must supply at least one of the following combinations of fields to get a result:
location_name + street_address + city + region + iso_country_code
location_name + street_address + postal_code + iso_country_code
location_name + latitude + longitude + iso_country_code
location_name = "Taco Bell"
street_address = "710 3rd St"
city = "San Francisco"
region = "CA"
iso_country_code = "US"
sgql_client.lookup_by_name(
product = 'core',
location_name = location_name,
street_address = street_address,
city = city,
region = region,
iso_country_code = iso_country_code,
columns = ['placekey', 'location_name', 'street_address', 'city', 'region', 'postal_code', 'iso_country_code', 'latitude', 'longitude']
)
placekey | location_name | latitude | longitude | street_address | city | region | postal_code | iso_country_code | |
---|---|---|---|---|---|---|---|---|---|
0 | 224-222@5vg-7gv-d7q | Taco Bell | 37.7786 | -122.393 | 710 3rd St | San Francisco | CA | 94107 | US |
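The field combinations listed in the note above can be checked locally before a request is sent. The following is a minimal sketch; has_required_fields and REQUIRED_COMBINATIONS are hypothetical names for this illustration and are not part of safegraphQL.

```python
# The three combinations from the note above, as sets of field names.
REQUIRED_COMBINATIONS = [
    {'location_name', 'street_address', 'city', 'region', 'iso_country_code'},
    {'location_name', 'street_address', 'postal_code', 'iso_country_code'},
    {'location_name', 'latitude', 'longitude', 'iso_country_code'},
]

def has_required_fields(**fields):
    """Return True if the non-None fields cover at least one combination."""
    provided = {name for name, value in fields.items() if value is not None}
    return any(combo <= provided for combo in REQUIRED_COMBINATIONS)

ok = has_required_fields(
    location_name="Taco Bell",
    street_address="710 3rd St",
    city="San Francisco",
    region="CA",
    iso_country_code="US",
)
missing = has_required_fields(location_name="Taco Bell", city="San Francisco")
print(ok, missing)
```

Extra fields beyond a complete combination do not affect the check, since it only tests for a subset.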
search()
You can search for SafeGraph POI by a variety of attributes, as described here.
Search by a single criterion, such as any convenience store POI in the SafeGraph dataset (naics_code == 445120). By default, search() returns only the first 20 results.
naics_code = 445120
search_result = sgql_client.search(product = 'core', columns = ['placekey', 'location_name', 'street_address', 'city', 'region', 'iso_country_code'], naics_code = naics_code)
placekey | location_name | street_address | city | region | iso_country_code | |
---|---|---|---|---|---|---|
0 | zzw-223@646-9rk-nqz | Cash & Dash 7 | 701 Highway 701 N | Loris | SC | US |
1 | 222-222@63r-tqr-zj9 | 7-Eleven | 8708 Liberia Ave | Manassas | VA | US |
2 | zzy-222@4hf-pq3-w6k | Londis | 18 & 22 & 26 Winster Mews, | Gamesley | Derbyshire | GB |
3 | 222-223@63v-c97-hnq | Circle K | 1608 East Ave | Akron | OH | US |
4 | zzw-222@8dj-n5s-2hq | 7-Eleven | 13150 S US Highway 41 | Gibsonton | FL | US |
5 | 224-222@66b-2d2-rhq | Depanneur 7 Jours | 6024 Avenue De Darlington | Montreal | QC | CA |
6 | 222-222@5pc-4d2-8n5 | Kwik Trip | 1549 Madison Ave | Mankato | MN | US |
7 | 22c-222@5z5-3r9-8jv | 7-Eleven | 5000 Wilshire Blvd | Los Angeles | CA | US |
8 | 223-223@5z5-qcd-wc5 | 7-Eleven | 6401 Mission Gorge Rd | San Diego | CA | US |
9 | zzw-223@5r8-2cq-nbk | Casey's General Stores | 2604 N Range Line Rd | Joplin | MO | US |
10 | zzw-223@8gn-kc9-5mk | Circle K | 101 N Gilmer Ave | Lanett | AL | US |
11 | zzw-222@5q9-b99-vcq | Circle K | 7530 Village Square Dr | Castle Pines | CO | US |
12 | zzy-225@3x7-z8z-qj9 | Circle K | 100 Twelfth Avenue South West | Slave Lake | AB | CA |
13 | 223-223@8sx-zcv-grk | Circle K | 901 Voss Ave | Odem | TX | US |
14 | 224-222@5wb-sdq-r8v | Circle K | 5301 W Canal Dr | Kennewick | WA | US |
15 | zzw-226@64h-vj9-mrk | 21st Street Deli | 222 W 21st St Ste J | Norfolk | VA | US |
16 | 22k-222@627-wdk-z9f | Victory Meat Center | 8506 Bay Pkwy | Brooklyn | NY | US |
17 | zzy-222@5pm-6rj-4n5 | Quick Mart | 129 E Hill St | Waynesboro | TN | US |
18 | 224-222@3wz-4kr-rc5 | 7-Eleven | 1704 61st Street South East | Calgary | AB | CA |
19 | 223-222@5pb-b7m-5s5 | Casey's General Stores | 907 13th St N | Humboldt | IA | US |
Search by multiple criteria, such as Sheetz locations in Pennsylvania.
brand = 'Sheetz'
region = 'PA'
search_result = sgql_client.search(product = 'core', columns = ['placekey', 'location_name', 'street_address', 'city', 'region', 'iso_country_code'], brand = brand, region = region)
search_result.head()
placekey | location_name | street_address | city | region | iso_country_code | |
---|---|---|---|---|---|---|
0 | 225-222@63p-wtm-8qf | Sheetz | 24578 Route 35 N | Mifflintown | PA | US |
1 | 224-222@63p-d8d-dgk | Sheetz | 330 Westminster Dr | Kenmar | PA | US |
2 | 223-222@63s-x95-c89 | Sheetz | 420 N Baltimore Ave | Mount Holly Springs | PA | US |
3 | 223-222@63d-3y3-3wk | Sheetz | 4701 William Penn Hwy | Murrysville | PA | US |
4 | 227-222@63p-tv5-brk | Sheetz | 8711 Woodbury Pike | East Freedom | PA | US |
search() works for Geometry, Monthly Patterns, and Weekly Patterns as well.
brand = 'Sheetz'
region = 'PA'
date = '2021-07-04'
search_result = sgql_client.search(product = 'weekly_patterns', columns = ['placekey', 'location_name', 'raw_visit_counts'], date = date, brand = brand, region = region)
placekey | location_name | raw_visit_counts | |
---|---|---|---|
0 | 225-222@63p-wtm-8qf | Sheetz | 338 |
1 | 224-222@63p-d8d-dgk | Sheetz | 619 |
2 | 223-222@63s-x95-c89 | Sheetz | 241 |
3 | 223-222@63d-3y3-3wk | Sheetz | 705 |
4 | 227-222@63p-tv5-brk | Sheetz | 564 |
Change the max_results parameter to request more than the default 20 results.
brand = 'Sheetz'
region = 'PA'
max_results = 200
search_result = sgql_client.search(product = 'core', columns = ['placekey', 'location_name', 'street_address', 'city', 'region', 'iso_country_code'], brand = brand, region = region, max_results = max_results)
placekey | location_name | street_address | city | region | iso_country_code | |
---|---|---|---|---|---|---|
0 | 225-222@63p-wtm-8qf | Sheetz | 24578 Route 35 N | Mifflintown | PA | US |
1 | 222-222@63p-bjm-xnq | Sheetz | 270 Route 61 S | Schuylkill Haven | PA | US |
2 | 228-222@63t-p3s-zzz | Sheetz | 107 Franklin St | Slippery Rock | PA | US |
3 | zzw-222@63s-xr7-49z | Sheetz | 6054 Carlisle Pike | Mechanicsburg | PA | US |
4 | zzw-222@63s-9nq-9zz | Sheetz | 3200 Cape Horn Rd | Red Lion | PA | US |
... | ||||||
195 | zzw-222@63n-xgm-zpv | Sheetz | 7775 N Route 220 Hwy | Linden | PA | US |
196 | 222-222@63s-xgf-cyv | Sheetz | 5201 Simpson Ferry Rd | Mechanicsburg | PA | US |
197 | 222-222@63p-8qd-fcq | Sheetz | 1550 State Rd | Duncannon | PA | US |
198 | 226-222@63s-xqc-ty9 | Sheetz | 1720 Harrisburg Pike | Carlisle | PA | US |
199 | 227-222@63d-77y-dgk | Sheetz | 1297 Washington Pike | Bridgeville | PA | US |
Change the after_result_number parameter if you want to skip the first few results. For example, if you have already retrieved the first 2 Sheetz results in PA, you can request only the results after them.
brand = 'Sheetz'
region = 'PA'
max_results = 200
after_result_number = 2
search_result = sgql_client.search(product = 'core', columns = ['placekey', 'location_name', 'street_address', 'city', 'region', 'iso_country_code'], brand = brand, region = region, max_results = max_results, after_result_number = after_result_number)
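Taken together, max_results and after_result_number suggest a simple paging loop. The sketch below shows that pattern against a stand-in function so that it runs without an API key. fetch_all and fake_search are hypothetical names, and the loop assumes that after_result_number = N skips exactly the first N results.

```python
def fetch_all(search_page, page_size=20):
    """Call search_page repeatedly until a short (or empty) page comes back."""
    results = []
    offset = 0
    while True:
        page = search_page(max_results=page_size, after_result_number=offset)
        results.extend(page)
        if len(page) < page_size:
            return results
        offset += len(page)

# Stand-in for a real search call: 45 fake records, sliced like a paged API.
FAKE_RESULTS = [{'placekey': f'fake-{i}'} for i in range(45)]

def fake_search(max_results, after_result_number):
    start = after_result_number
    return FAKE_RESULTS[start:start + max_results]

everything = fetch_all(fake_search, page_size=20)
print(len(everything))
```

With the real client, search_page could be a small wrapper around sgql_client.search() that fixes the product, columns, and filters and passes return_type = 'list'. That wrapper is an assumption about usage and is not taken from the library's documentation.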