sul-dlss / DeveloperPlaybook

A place to organize style guides, best practices, tools, and techniques for Stanford University's Digital Library Systems & Services group

Best practices for pipelines (Scala/Akka/Kafka/Spark) #70

Closed dazza-codes closed 6 years ago

dazza-codes commented 7 years ago

@Maatary is driving work on LD4P to leverage big-data pipelines for metadata wrangling. This includes working with Scala, Kafka, and Spark (and their dependencies, like ZooKeeper). We will need to document some developer and dev-ops best practices for these, including:

Potential contributors: @Maatary @eefahy

atz commented 7 years ago

I think a lot of what we will be able to contribute here is "adopted conventions" in accordance with existing best-practice principles.

My understanding, within the scope of the one project advancing this stack, of the tools selected in some of the areas enumerated:

dazza-codes commented 7 years ago

Best Practices in Scala Programming: Guidelines for Ensuring a Professional Scala Code Base, by Joshua Backfield (O'Reilly) - http://shop.oreilly.com/product/0636920051336.do

dazza-codes commented 7 years ago

@cmh2166 - tagging you to raise awareness of this as a deliverable for the LD4P pipeline sprint, with regard to the adoption and documentation of developer best practices.