newmanrs / cloudburst-graph

Creating a beer and hop graph from Cloudburst Brewing Seattle's beer descriptions with visuals in Neo4j and Neo4j Bloom
MIT License

Investigate sources of hop varietal information #4

Open newmanrs opened 3 years ago

newmanrs commented 3 years ago

Hops are currently just a list, but it would be nice to populate the database with more information about each varietal (year and place developed, whether it was found in the wild or bred from another varietal, etc.).

newmanrs commented 3 years ago

Maybe scrape the Yakima Chief website: https://shop.yakimachief.com

Or their catalog PDF: https://shop.yakimachief.com/media/wysiwyg/Yakima_Chief_Hops_Varieties.pdf. It mostly covers the hops they grow, but the descriptions also mention daughter/sister relationships that indicate which varietals a hop was bred from. A horrible regular expression might be able to extract those relationships.
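A minimal sketch of what such a regular expression could look like. The phrasing it matches ("daughter of X", "sister of X") and the sample description are assumptions for illustration, not text taken from the catalog, and `extract_relations` is a hypothetical helper name.

```python
import re

# Assumed phrasing: "<daughter|sister> of <Capitalized Name>".
# The real catalog wording may differ and need more alternatives.
RELATION_RE = re.compile(
    r"\b(?P<relation>[Dd]aughter|[Ss]ister)\s+of\s+"
    r"(?P<parent>[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"
)


def extract_relations(hop_name, description):
    """Return (hop, relation, other_hop) tuples found in a description."""
    return [
        (hop_name, m.group("relation").lower(), m.group("parent"))
        for m in RELATION_RE.finditer(description)
    ]


# Made-up description, only to show the shape of the result.
sample = "Bright citrus aroma. A daughter of Example Hop with piney notes."
relations = extract_relations("Hop A", sample)
```

The name capture stops at the first lowercase word or punctuation mark, so a capitalized word directly after the name would be swallowed; descriptions would need spot-checking.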

newmanrs commented 3 years ago

655ac5222ff60a3d033acc00827877b4e4587ed7 makes a JSON object out of the Yakima Chief hop PDF. It extracts hop names, proprietary ID (if provided), tasting notes, country of origin, and styles.

Still need to load it into the database. The Jupyter notebook needs some work if I want to get the alpha acids or other chemistry data for each hop: the PDF's formatting is quirky but mostly standard, and it is missing on some pages, which would require a rat's nest of if statements.
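A rough sketch of the load step, assuming the JSON records use keys like `name`, `id`, `notes`, `country`, and `styles`, and that hops are stored as `Hop` nodes keyed by `name`. Those key, label, and property names are guesses, not necessarily what the commit or the graph actually uses.

```python
# Parameterized Cypher: MERGE keeps re-runs from duplicating hop nodes.
LOAD_HOPS_CYPHER = """
UNWIND $hops AS hop
MERGE (h:Hop {name: hop.name})
SET h.proprietary_id = hop.proprietary_id,
    h.country = hop.country,
    h.notes = hop.notes,
    h.styles = hop.styles
"""


def build_hop_params(records):
    """Normalize scraped records into the parameter list for the query."""
    params = []
    for rec in records:
        params.append({
            "name": rec["name"].strip(),
            "proprietary_id": rec.get("id"),  # None when not provided
            "country": rec.get("country"),
            "notes": list(rec.get("notes", [])),
            "styles": list(rec.get("styles", [])),
        })
    return params


# Made-up record, only to show the shape of the parameters.
hop_params = build_hop_params([{"name": " Hop A ", "country": "USA"}])

# With the official neo4j Python driver this would be run roughly as:
#   with driver.session() as session:
#       session.run(LOAD_HOPS_CYPHER, hops=hop_params)
```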

At this point I'll note to self that I may want to see what other sources I can scrape and merge, since this only covers what Yakima Chief sells. For instance, Zappa is proprietary to a competitor, CLS Farms, and is sold at Yakima Valley Farms. Other brewing supply sites have descriptions too. There seems to be an industry standard, used by most sites/vendors, for at least some of the aromatics and oils.
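One possible way to merge several scraped sources, sketched under the assumption that each source yields a list of dicts with a `name` key. The vendor labels, field names, and the "first non-empty value wins" rule are all illustrative choices.

```python
import re


def normalize_name(name):
    """Lowercase and collapse punctuation so 'Hop-A' and 'hop a' match."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def merge_sources(*sources):
    """Merge (source_name, records) pairs, keyed by normalized hop name.

    Earlier sources take priority: a field is only filled in when the
    merged entry does not already have a non-empty value for it.
    """
    merged = {}
    for source_name, records in sources:
        for rec in records:
            key = normalize_name(rec["name"])
            entry = merged.setdefault(key, {"name": rec["name"], "sources": []})
            entry["sources"].append(source_name)
            for field, value in rec.items():
                if field != "name" and value and not entry.get(field):
                    entry[field] = value
    return merged


# Made-up records from two hypothetical vendors.
merged = merge_sources(
    ("vendor_a", [{"name": "Hop A", "country": "USA"}]),
    ("vendor_b", [{"name": "hop-a", "notes": ["citrus"]},
                  {"name": "Hop B", "country": ""}]),
)
```

Keeping the list of `sources` on each entry makes it easy to see later which hops only one vendor describes.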