worked at Avant, on Core Data Engineering
the company split into two companies; for the half that lost the tech infra, he had to rebuild it
that is what he did; it was a long project with 2 phases for him: data engineering, then data platform
in the beginning they had nothing; had to get data pipelines off the ground, writing data pipelines
ran on AWS: Spark on EMR, Airflow
data warehouse called Dremio
Looker for BI
first part of the experience was working with stakeholders, doing BI engineering, developing the pipelines for the data warehouse
after that was completed and had a certain amount of maturity,
shifted to platform work; had all of these pipelines going
are they performing well? cost efficient? are they accurate?
moved towards cloud infra, performance engineering, AWS work using Terraform, DB optimization, Spark optimization, reliability engineering, using tools like Datadog to understand the performance of the systems
driving efficiency and cost savings; through line to it would be
focus:
Python dev
Python tooling
uses SQL; they use PySpark, a combination of Python & SQL
has used a lot of SQL
actively looking?
he left Avant in the middle of the summer; they did massive layoffs
he connected with an old colleague who works in IT consulting; working on a contract basis, doing data engineering for him
that's been good, but it's contract; very much looking for something full time
compensation: