Closed jkarpen closed 1 year ago
Meeting scheduled with Zhenyu Zhu for 10/2 @ 3:30 to start the process of getting access to data.
We had our discovery call on the Caltrans network on 10/2:
I shared ODI's understanding of the network (which was more of a conceptual diagram):
This is broadly correct, but Caltrans was able to put together a more detailed network diagram, which includes some proposed pieces which do not exist yet:
The current ETL jobs within Caltrans rely on a not-very-well-documented "data intake layer", with "lots and lots of Perl scripts". Caltrans staff are currently trying to get a better understanding of how data flows through the intake layer; today it is a bit opaque.
The Caltrans proposal is to forward the "Raw" 30-second data from the "data intake layer" through a "Data relay server", and then on to an ODI S3 bucket. This would happen in parallel to the existing data pipelines. All servers would be owned and managed by Caltrans.
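To make the proposed relay step a bit more concrete, here is a minimal sketch of what forwarding raw batches to a bucket could look like. Everything in it is an assumption for illustration: the key layout, the `raw-30s` prefix, the `.dat` extension, and the `relay`/`object_key` names are hypothetical, and nothing here reflects how Caltrans would actually implement the relay server. The upload is passed in as a callable so the sketch runs without any AWS dependency.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Tuple


def object_key(prefix: str, received_at: datetime) -> str:
    """Build a date-partitioned object key for one raw batch.

    The layout (prefix/YYYY/MM/DD/raw_<timestamp>.dat) is an assumption,
    not an agreed-upon convention.
    """
    ts = received_at.astimezone(timezone.utc)
    return f"{prefix}/{ts:%Y/%m/%d}/raw_{ts:%Y%m%dT%H%M%SZ}.dat"


def relay(
    batches: Iterable[Tuple[datetime, bytes]],
    upload: Callable[[str, bytes], None],
    prefix: str = "raw-30s",  # hypothetical prefix
) -> List[str]:
    """Forward each (timestamp, payload) batch unchanged and return the keys written."""
    keys = []
    for received_at, payload in batches:
        key = object_key(prefix, received_at)
        upload(key, payload)  # payload is passed through as-is ("raw")
        keys.append(key)
    return keys


# Example with an in-memory dict standing in for the bucket.
bucket = {}
t = datetime(2023, 10, 2, 15, 30, 0, tzinfo=timezone.utc)
written = relay([(t, b"example-payload")], bucket.__setitem__)
print(written)
```

In a real deployment the `upload` callable might wrap something like boto3's `put_object(Bucket=..., Key=..., Body=...)`, but the bucket name, credentials, and batching cadence would all need to be settled with Caltrans.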
A basic sequence of events would be:
Closing as network discovery is complete (at least for the purposes of this meeting). More specific issues will be opened for follow-up work.
@ian-r-rose will drive discovery around Caltrans network architecture; creating this issue to track that work.
10/13/23 - Next step: capturing some items from the notes doc here.