USGS-VIZLAB / internal-analytics

Creative Commons Zero v1.0 Universal

Re-think how to save GA data #224

Open ldecicco-USGS opened 5 years ago

ldecicco-USGS commented 5 years ago

We had to bump up the memory on the Jenkins machine to get the app to work. It might make sense to split this into two jobs: one that gets the data and saves it to S3, and another that builds the viz (which only needs 1 year of data). That said, with the increased memory and compression turned on, I think we're fine for a while.

But...we should probably be thinking a bit more about how we save and use the data. We are certainly considering many applications that need more than 1 year of data.
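
A minimal sketch of what that split could look like, assuming the pull job writes one compressed file per calendar year so the viz job can read only the year it needs while longer histories stay available to other applications. It is written in Python for illustration only; the column names, the `ga_<year>.csv.gz` file naming, and both function names are hypothetical, and a local directory stands in for the S3 bucket.

```python
import csv
import gzip
from collections import defaultdict
from pathlib import Path

FIELDS = ["date", "page", "sessions"]  # hypothetical columns


def save_by_year(rows, out_dir):
    """Data-pull job: write one gzip-compressed CSV per calendar year."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["date"][:4]].append(row)  # assumes ISO dates (YYYY-MM-DD)
    for year, year_rows in by_year.items():
        with gzip.open(out_dir / f"ga_{year}.csv.gz", "wt", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(year_rows)
    return sorted(by_year)


def load_years(out_dir, years):
    """Viz job: read only the requested years, not the full history."""
    rows = []
    for year in years:
        path = Path(out_dir) / f"ga_{year}.csv.gz"
        with gzip.open(path, "rt", newline="") as fh:
            rows.extend(csv.DictReader(fh))  # values come back as strings
    return rows
```

With this layout the viz job would call something like `load_years(data_dir, ["2018"])`, while an application that needs a longer window passes more years.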

ldecicco-USGS commented 5 years ago

Maybe we can use scipiper to do the data pulling, and vizlab for the viz.

wdwatkins commented 5 years ago

Yeah, I think separating the data pull and viz build is probably good. We are planning to add some metrics soon, which may be a good time to rethink how the data is pulled as well (perhaps make it more modular?). The new metrics will have to be pulled for the entire time period: we could re-pull everything, or pull just the new metrics and join them onto the existing data.
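
A rough sketch of the "pull only the new metrics and join" option, in Python for illustration only. It assumes the existing rows and the new pull share `date` and `page` as keys; those key names, the `bounce_rate` metric in the usage note, and the function name are all hypothetical.

```python
def join_new_metrics(existing, new_metrics, keys=("date", "page")):
    """Left-join new metric columns onto existing rows by key."""
    lookup = {tuple(row[k] for k in keys): row for row in new_metrics}
    # Every metric column that appears anywhere in the new pull.
    new_cols = sorted({c for row in new_metrics for c in row if c not in keys})
    joined = []
    for row in existing:
        match = lookup.get(tuple(row[k] for k in keys), {})
        merged = dict(row)  # copy so the stored rows are left untouched
        for col in new_cols:
            merged[col] = match.get(col)  # None where the new pull has no row
        joined.append(merged)
    return joined
```

For example, joining a pull that only has `bounce_rate` for some dates leaves `None` for the rest, which makes gaps in the backfill easy to spot before saving.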

wdwatkins commented 5 years ago

Re: system resources, we should probably consider dockerizing the data pull if we split it out.