We have created several data tables in the mart and in the S3 bucket that hold metadata alongside various performance metrics. One metadata field, county, currently reports the numerical county code instead of the full county name. The code alone is hard to interpret in the data set, and our visualizations need the full county name. A list that maps county codes to county names already exists in seeds/counties.csv. We just need to use this seed file to add the full county name, alongside the county code, to all mart and S3 tables. Updating all tables in analytics_prd should suffice for users as well as for our internal visualization.
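As a rough illustration of the county lookup described above, the sketch below joins a stand-in for the seed table onto a stand-in for one mart table using an in-memory SQLite database. The table names (`counties`, `station_metrics`), the column names, and the sample rows are all assumptions made for the example; the actual layout of seeds/counties.csv and of the mart models may differ. A left join is used here so that rows with a county code missing from the seed are kept rather than silently dropped.

```python
import sqlite3

# Hypothetical schemas and placeholder data: the real seed and mart
# tables may use different names and columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE counties (county_code INTEGER PRIMARY KEY, county_name TEXT);
    CREATE TABLE station_metrics (
        station_id INTEGER, county INTEGER, daily_volume INTEGER
    );
    INSERT INTO counties VALUES (1, 'County A'), (19, 'County B');
    INSERT INTO station_metrics VALUES
        (1001, 1, 5200), (1002, 19, 8100), (1003, 99, 300);
    """
)

# Keep the numeric code and add the full name next to it.
# LEFT JOIN preserves metric rows whose code is not in the seed.
ENRICH_SQL = """
    SELECT m.station_id,
           m.county AS county_code,
           c.county_name,
           m.daily_volume
    FROM station_metrics AS m
    LEFT JOIN counties AS c ON c.county_code = m.county
    ORDER BY m.station_id
"""

rows = conn.execute(ENRICH_SQL).fetchall()
for row in rows:
    print(row)
```

Rows that come back with an empty `county_name` would point to codes that are absent from the seed file and may be worth checking before the change is rolled out to analytics_prd.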
We also recently noticed that performing the inner join between the metadata and the performance metrics in Logstash is time consuming and causes memory failures. It is therefore also necessary to report the latitude and longitude of each station/detector in all mart-level tables (where applicable). The goal is to have ready-to-ingest data in the mart, so that we can read it and push it directly to Elasticsearch instead of processing it within Logstash.
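To make the "ready-to-ingest" idea concrete, here is a minimal sketch of how an already denormalized mart row could be turned into an Elasticsearch bulk request body without any join at ingest time. The function `to_bulk_ndjson`, the index name `station_metrics`, and the field names are hypothetical; only the bulk format itself (an action line followed by a document line, newline delimited) and the `{"lat": ..., "lon": ...}` geo point shape come from Elasticsearch. Whether Logstash or another client does the push is left open.

```python
import json


def to_bulk_ndjson(rows, index_name):
    """Build a newline-delimited bulk body from denormalized rows."""
    lines = []
    for row in rows:
        doc = dict(row)  # copy so the caller's row is not modified
        lat = doc.pop("latitude", None)
        lon = doc.pop("longitude", None)
        # Only emit a geo point when both coordinates are present
        # (the issue notes lat/long applies "where applicable").
        if lat is not None and lon is not None:
            doc["location"] = {"lat": lat, "lon": lon}
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


# Placeholder rows standing in for a mart table that already carries
# county name and coordinates.
sample_rows = [
    {"station_id": 1001, "county_code": 1, "county_name": "County A",
     "latitude": 37.5, "longitude": -122.1, "daily_volume": 5200},
    {"station_id": 1002, "county_code": 19, "county_name": "County B",
     "daily_volume": 8100},
]

body = to_bulk_ndjson(sample_rows, "station_metrics")
print(body)
```

Because each row already contains the metadata, the ingest step reduces to serialization, which is the main point of moving the join into the mart.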