VRazgaitis / climbing_database

"Routes" data - Location string is inconsistent on MP #1

Open VRazgaitis opened 3 months ago

VRazgaitis commented 3 months ago

Our route dataset comes from Mountainproject.com (MP).

It looks like MP is inconsistent in how it defines regions and subregions for the location of a climbing route. For example, the longest location is: Motherlode Rock - North Face > Motherlode Rock > Central Pinnacles > Holcomb Valley Pinnacles > Big Bear City Area > Big Bear North > Big Bear Lake Area > San Bernardino Mountains > California

A shorter location is: Bob's Rock > Buena Vista > Colorado
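To make the inconsistency concrete, here is a minimal sketch that splits the two example strings on the `>` separator and counts the levels. The helper name `split_location` is hypothetical and not part of the repo; it only assumes that MP always uses `>` between levels, as in the two examples above.

```python
from typing import List


def split_location(location: str) -> List[str]:
    """Split an MP location string into its levels, most specific first.

    Assumes levels are separated by '>' as in the examples in this issue.
    """
    return [part.strip() for part in location.split(">") if part.strip()]


long_loc = (
    "Motherlode Rock - North Face > Motherlode Rock > Central Pinnacles > "
    "Holcomb Valley Pinnacles > Big Bear City Area > Big Bear North > "
    "Big Bear Lake Area > San Bernardino Mountains > California"
)
short_loc = "Bob's Rock > Buena Vista > Colorado"

print(len(split_location(long_loc)))   # 9 levels
print(len(split_location(short_loc)))  # 3 levels
```

With 9 levels in one string and 3 in the other, there is no fixed position that always means "region" or "subregion".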

Why this is a problem: the number of levels in the location string varies from route to route, so it can't really be cleaned into consistent region/subregion fields, meaning that it's not queryable

Solutions:

  1. HARD: MP gives a latitude and longitude for each route. We could build a 'classifier' (ML sidequest?) that checks each lat/long against geographic bounding boxes and sorts the route into one of our manually defined climbing regions
  2. EASY: Extract the state from the location string and use that as the location
  3. EASY: Pull climbing area CSV data manually (e.g. Yosemite, Red River Gorge, etc.), and encode the area with each data push
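A rough sketch of options 1 and 2, to help compare effort. Everything named here (`extract_state`, `RegionBox`, `classify_region`, `REGIONS`) is hypothetical, and the bounding-box coordinates are illustrative placeholders, not surveyed boundaries. Option 1 is shown as a plain rule-based box lookup rather than an ML classifier, and option 2 assumes the state is always the last level of the location string, as in the two examples above.

```python
from dataclasses import dataclass
from typing import List, Optional


def extract_state(location: str) -> Optional[str]:
    """Option 2: take the last '>'-separated level as the state.

    Assumes the state is always the final level, which holds for the
    examples in this issue but has not been checked against all of MP.
    """
    parts = [p.strip() for p in location.split(">") if p.strip()]
    return parts[-1] if parts else None


@dataclass(frozen=True)
class RegionBox:
    """A manually defined climbing region as a lat/long bounding box."""
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Placeholder boxes with approximate coordinates, for illustration only.
REGIONS: List[RegionBox] = [
    RegionBox("Yosemite", 37.5, 38.2, -119.9, -119.2),
    RegionBox("Red River Gorge", 37.6, 37.9, -83.9, -83.5),
]


def classify_region(lat: float, lon: float,
                    regions: List[RegionBox] = REGIONS) -> Optional[str]:
    """Option 1 (rule-based): return the first region whose box contains
    the point, or None if the point falls outside every box."""
    for region in regions:
        if region.contains(lat, lon):
            return region.name
    return None


print(extract_state("Bob's Rock > Buena Vista > Colorado"))  # Colorado
print(classify_region(37.74, -119.59))                       # Yosemite
print(classify_region(0.0, 0.0))                             # None
```

The box lookup returns the first match, so overlapping regions would need an explicit priority order. Routes that fall outside every box come back as `None` and could fall back to the state from option 2.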