treasureapp / backend-scala


Implement generic time series data model #3

Open grahamcrowell opened 7 years ago

grahamcrowell commented 7 years ago

analytical model

Subject

Attribute

State

Event

Process

Time Granularity


grahamcrowell commented 7 years ago

rationale

generic/abstract model

Metrics basically take a time series, perform some calculation, and return the value of the metric for a given date.
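A minimal sketch of that idea, assuming a metric is a function from a series and a date to an optional value. The names `Observation`, `Metric`, and `TrailingMean`, and the integer `yyyyMMdd` form of `dateId`, are all hypothetical and not taken from the repository.

```scala
object MetricExample {
  // Hypothetical observation type: one value per date, dateId as yyyyMMdd.
  final case class Observation(dateId: Int, value: Double)

  trait Metric {
    // Value of the metric as of `dateId`, or None if it cannot be computed.
    def valueAt(series: Seq[Observation], dateId: Int): Option[Double]
  }

  // Example metric: mean of the last `window` observations dated on or before `dateId`.
  final class TrailingMean(window: Int) extends Metric {
    require(window > 0, "window must be positive")

    def valueAt(series: Seq[Observation], dateId: Int): Option[Double] = {
      val recent = series
        .filter(_.dateId <= dateId)
        .sortBy(_.dateId)
        .takeRight(window)
      // Not enough history yet: report no value rather than a partial mean.
      if (recent.size < window) None
      else Some(recent.map(_.value).sum / window)
    }
  }

  val prices = Seq(
    Observation(20170103, 10.0),
    Observation(20170104, 12.0),
    Observation(20170105, 14.0)
  )
  val mean2 = new TrailingMean(2)
}
```

Here `MetricExample.mean2.valueAt(MetricExample.prices, 20170105)` averages the last two prices, while asking for 20170103 yields `None` because only one observation exists by then.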

Concrete examples of model components

Subjects

grahamcrowell commented 7 years ago

Process as spark dataset

Denormalize price data into Spark Dataset[State]

case class State(objectLabel: String, processLabel: String, dateId: Int, value: Double) // field types are assumed

Start with strings; determine whether performance blocks progress.
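One possible reading of this step, sketched with plain Scala collections so it runs without a cluster. The field types on `State`, the `PriceRow` shape, and the process labels "open", "close", and "volume" are assumptions for illustration only.

```scala
object DenormalizeExample {
  // Field types are assumptions; labels are plain strings, as suggested above.
  final case class State(objectLabel: String, processLabel: String, dateId: Int, value: Double)

  // Hypothetical wide price row: one record per symbol and date.
  final case class PriceRow(symbol: String, dateId: Int, open: Double, close: Double, volume: Double)

  // Flatten each wide row into one State per measured process.
  def denormalize(rows: Seq[PriceRow]): Seq[State] =
    rows.flatMap { r =>
      Seq(
        State(r.symbol, "open", r.dateId, r.open),
        State(r.symbol, "close", r.dateId, r.close),
        State(r.symbol, "volume", r.dateId, r.volume)
      )
    }

  val rows = Seq(PriceRow("ABC", 20170103, 10.0, 10.5, 1000.0))
  val states = denormalize(rows)
}
```

With a `SparkSession` named `spark` in scope, `import spark.implicits._` followed by `DenormalizeExample.states.toDS()` should give a `Dataset[State]`, since Spark derives encoders for case classes of primitive and string fields.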

grahamcrowell commented 7 years ago

analytical data structure

single time series

system of time synchronized processes

Map symbol to asset
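A rough sketch of what "time synchronized" could mean here, assuming it means keeping only the dates on which every series has a value. The `Series` alias and the `synchronize` helper are hypothetical names, and an inner join on dates is just one alignment policy among several.

```scala
object SyncExample {
  // Hypothetical representation: a series is a map from dateId (yyyyMMdd) to value.
  type Series = Map[Int, Double]

  // Keep only the dates present in every series, in ascending order,
  // and return one row of values per date (columns follow the input order).
  def synchronize(series: Seq[Series]): Seq[(Int, Seq[Double])] =
    if (series.isEmpty) Seq.empty
    else {
      val commonDates = series.map(_.keySet).reduce(_ intersect _).toSeq.sorted
      commonDates.map(d => d -> series.map(s => s(d)))
    }

  val abc: Series = Map(20170103 -> 10.0, 20170104 -> 11.0, 20170105 -> 12.0)
  val xyz: Series = Map(20170104 -> 50.0, 20170105 -> 51.0)
  val aligned = synchronize(Seq(abc, xyz))
}
```

In this example 20170103 is dropped because `xyz` has no value for it, so `aligned` holds two rows, one per shared date.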

grahamcrowell commented 7 years ago

@grahamcrowell this is basically the same as Spark's machine learning pipeline; see the Spark MLlib pipeline docs.

grahamcrowell commented 7 years ago

Set up AWS for Hadoop: http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-gs-prerequisites.html