Closed AmandaDoyle closed 4 years ago
instead of overwriting category
and subcategory
, should we look at them together and derive a dcp_category
?
e.g. we can have category = resturant and bars
but subcategory
as cafe, ideally this should be categorized as resturant and cafe
, but the current recoding will convert it to resturant and bars
instead of overwriting
category
andsubcategory
, should we look at them together and derive adcp_category
? e.g. we can have category =resturant and bars
butsubcategory
as cafe, ideally this should be categorized asresturant and cafe
, but the current recoding will convert it toresturant and bars
I like this idea. Would you like to draft a proposal? Or I can work on a proposal after standup
select
distinct
(CASE
WHEN betanyc_category ~* 'retail|bike|Bicycle|store|Pet Supplies|Pharmacy'
THEN 'Dry retail'
WHEN betanyc_category ~* 'service|free food|Laundromat|Pantry|Wellness|'||
'Soup Kitchen|Health|Doctor|Deliveries'
THEN 'Services'
WHEN betanyc_category IS NULL THEN NULL
ELSE 'Food and beverage'
END) as category,
betanyc_category,
betanyc_subcategory
from betanyc.latest
order by category, betanyc_category
Looking at betanyc_subcategory:
select
distinct
(CASE
WHEN betanyc_subcategory ~* 'wine|beer|tea|market|food|bakery|bar|restaurant|grocery|cafe'
OR betanyc_category ~* 'bakery|bar|grocer|restaurant|liquor|dessert|coffee|pantry|soup kitchen|food'
THEN 'Food and beverage'
WHEN betanyc_subcategory ~* 'clothing|fashion|shoe|goods|electronics|hardware|books|home|bicycle|supplies|pet'
OR betanyc_category ~* 'retail|bike|bicycle|store|pet|pharmacy'
THEN 'Dry retail'
WHEN betanyc_subcategory ~* 'bank|finance|government|repair|shipping|fitness|copies|law|flor|'
OR betanyc_category ~* 'laundromat|wellness|health|doctor|community service|shipping'
THEN 'Services'
ELSE 'Other'
END) as category,
betanyc_category,
betanyc_subcategory
from betanyc.latest
order by category, betanyc_category
WITH cat AS (SELECT
DISTINCT
(CASE
WHEN betanyc_subcategory ~* 'wine|beer|tea|market|food|bakery|bar|restaurant|grocery|cafe'
OR betanyc_category ~* 'bakery|bar|grocer|restaurant|liquor|dessert|coffee|pantry|soup kitchen|food'
THEN 'Food and beverage'
WHEN betanyc_subcategory ~* 'clothing|fashion|shoe|goods|electronics|hardware|books|home|bicycle|supplies|pet|stationar|building|dollar store|appliance|pharma'
OR betanyc_category ~* 'retail|bike|bicycle|store|pet|pharmacy|art|yarn'
THEN 'Dry retail'
WHEN betanyc_subcategory ~* 'bank|finance|government|repair|shipping|fitness|copies|law|flor'
OR betanyc_category ~* 'laundromat|wellness|health|doctor|community service|shipping'
THEN 'Services'
ELSE 'Other'
END) AS category,
betanyc_category,
betanyc_subcategory
FROM betanyc.latest
ORDER BY category, betanyc_category)
SELECT
*,
(CASE
WHEN category='Food and beverage'
THEN CASE
WHEN betanyc_subcategory ~* 'diner|meal*takaway|meal*delivery|restaurant'
OR betanyc_category ~* 'restaurant'
THEN 'Restaurant'
WHEN betanyc_subcategory ~* 'bakery|cafe|coffee|cream|tea|cake|bagel|sandwich'
OR betanyc_category ~* 'bakery|dessert|coffee'
THEN 'Bakeries cafes and desserts'
WHEN betanyc_subcategory ~* 'grocer|market|convenience|spice|butcher'
OR betanyc_category ~* 'grocer'
THEN 'Grocery and market'
WHEN betanyc_subcategory ~* 'wine|bar|beer|liquor|night*club'
OR betanyc_category ~* 'bar|liquor'
THEN 'Alcohol and bars'
WHEN betanyc_subcategory ~* 'pantry|soup kitchen|free'
THEN 'Free food'
ELSE 'Other food and beverage'
END
WHEN category='Dry retail'
THEN CASE
WHEN betanyc_subcategory ~* 'bike|bicycle'
OR betanyc_category ~* 'bike|bicycle'
THEN 'Bike'
WHEN betanyc_subcategory ~* 'hardware|garden|home|furniature|flor|appliance|dollar'
OR betanyc_category ~* 'hardware'
THEN 'Hardware and home goods'
WHEN betanyc_subcategory ~* 'pet'
OR betanyc_category ~* 'pet'
THEN 'Pet supplies'
WHEN betanyc_subcategory ~* 'pharma|health|drug'
OR betanyc_category ~* 'pharma'
THEN 'Health and beauty'
WHEN betanyc_subcategory ~* 'cloth|fashion|shoe|jewel'
THEN 'Apparel'
WHEN betanyc_subcategory ~* 'electronic|computer'
THEN 'Electronics'
WHEN betanyc_subcategory ~* 'book|stationery|gifts|art'
THEN 'Books art and gifts'
ELSE 'Other retail'
END
WHEN category='Services'
THEN CASE
WHEN betanyc_subcategory ~* 'fitness|gym|pilates|yoga|martial arts'
THEN 'Fitness'
WHEN betanyc_subcategory ~* 'doctor|dentist|health|medic'
THEN 'Healthcare'
WHEN betanyc_subcategory ~* 'financ|bank|tax|account|insurance|law'
THEN 'Finance and legal'
WHEN betanyc_subcategory ~* 'hair|beauty|salon'
THEN 'Beauty'
WHEN betanyc_subcategory ~* 'laundr'
THEN 'Laundry'
WHEN betanyc_subcategory ~* 'copies|copy|shipping|post|photo'
THEN 'Copies and shipping'
WHEN betanyc_subcategory ~* 'repair'
THEN 'Repair'
WHEN betanyc_subcategory ~* 'veter'
THEN 'Veterinary'
WHEN betanyc_subcategory ~* 'flor'
THEN 'Floral'
ELSE 'Other service'
END
ELSE 'Other'
END) as subcategory
FROM cat
ORDER BY category, subcategory, betanyc_category, betanyc_subcategory;
[x] Category Build logic to classify each record into one of three broad categories:
Dry retail
Food and beverage
Services which line up with DCP's storefront vacancy study.
[x] Sub-Category Build logic to classify each record into one of a more specific category borrowing from Foursquare
SELECT DISTINCT categoryname FROM foursquare.latest
Below captures some categories that might apply to BetaNYC data
Bakery
Bike shop
Bookstore
Café
Convenience store
Deli / Bodega
Drug store
Fitness center
Grocery
Hardware store
Medical
Restaurant or bar
Retail
Laundry services
Liquor stores
[x] Status
NULL / blank is mapped to NULL
NULL / blank is mapped to NULL
NULL / blank is mapped to NULL
NULL / blank is mapped to NULL