hestiaAI / hestialabs-experiences

HestiaLabs Data Experiences & Digipower Academy
https://digipower.academy

Help Lua with sensemaking of Uber data #1188

Closed pdehaye closed 1 year ago

pdehaye commented 1 year ago

Lua is becoming a bottleneck for sensemaking of the Uber data: he is overwhelmed by the complexity of the file formats, the analysis needs, and so on. For now we have worked around this by suggesting that he treat time, kilometres and money as separate facets, to be merged later.

It is clear that Francois could come in to support him. In fact he has already suggested this, proposing to add some form of Timeline Viewer in Jupyter to help (I think) make sense of different files along a single shared timeline. Note that this need has been discussed before (issue link?), and that in this particular case we always want, first and foremost, to compare events or periods on the same timeline in relation to file-specific Turing machine states (i.e. the Online/Offline file defines these differently from DriverStatus).
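To make the comparison concrete, here is a minimal sketch (not Francois' actual proposal) of lining up two files' state changes on one timeline. It uses only the Python standard library. The state labels, timestamps and the `to_intervals` / `align` helpers are all hypothetical; the real files will have their own columns and state names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:
    source: str      # which file the interval comes from
    state: str       # file-specific state label
    start: datetime
    end: datetime


def to_intervals(source, events):
    """Turn (timestamp, state) events into intervals.

    Each state lasts until the next event; the final event only closes
    the previous interval and opens none of its own.
    """
    events = sorted(events)
    return [
        Interval(source, state, t0, t1)
        for (t0, state), (t1, _) in zip(events, events[1:])
    ]


def align(*tracks):
    """Cut the shared timeline at every boundary of every track and
    report, for each segment, the state each source is in."""
    cuts = sorted({t for track in tracks for iv in track for t in (iv.start, iv.end)})
    rows = []
    for lo, hi in zip(cuts, cuts[1:]):
        states = {}
        for track in tracks:
            for iv in track:
                if iv.start <= lo and hi <= iv.end:
                    states[iv.source] = iv.state
        rows.append((lo, hi, states))
    return rows


def at(hour, minute):
    # Hypothetical day, used only for the made-up sample data below.
    return datetime(2022, 3, 1, hour, minute)


# Made-up sample data: the two files disagree on what the states are.
online = to_intervals("OnlineOffline", [
    (at(8, 0), "Online"), (at(9, 0), "Offline"), (at(10, 0), "Online"),
])
status = to_intervals("DriverStatus", [
    (at(8, 0), "Open"), (at(8, 30), "OnTrip"), (at(9, 15), "Open"), (at(10, 0), "Open"),
])

rows = align(online, status)
for lo, hi, states in rows:
    print(lo.time(), hi.time(), states)
```

Each printed row is one segment of the shared timeline with the state reported by each file, which makes disagreements between files (for instance one file saying Offline while the other still reports a trip) easy to spot.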

I suggest that we take up Francois' suggestion (he can already start working on it on his own), and that we also consider asking Valentin and Heloise to bind that work to Fibery, i.e. use Fibery as a CMS for configured Timeline views. That would expand the range of realistic domain-expert contributors (to myself and Jessica, first and foremost), considering that they might be closest to lawyers or drivers.

Additionally, this Fibery spec of the files could also serve as a spec for ingesting data into the experiences pipeline, with more fluidity than Hugo's tool.
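As a rough illustration of what "spec-driven" ingestion could look like, the sketch below reads a CSV according to a small column spec and groups the parsed values by facet (time, distance, money). Everything here is an assumption for illustration: the spec layout, the file name `trips.csv`, the column names and the `ingest` helper are hypothetical, and nothing is fetched from Fibery; the spec is just a Python list standing in for an export of such a table.

```python
import csv
import io
from datetime import datetime

# Hypothetical column spec, standing in for rows exported from a spec table.
COLUMN_SPEC = [
    {"file": "trips.csv", "column": "begin_time", "facet": "time", "parse": "datetime"},
    {"file": "trips.csv", "column": "distance_km", "facet": "distance", "parse": "float"},
    {"file": "trips.csv", "column": "fare", "facet": "money", "parse": "float"},
]

# Maps the spec's "parse" label to an actual parsing function.
PARSERS = {"datetime": datetime.fromisoformat, "float": float, "str": str}


def ingest(file_name, text, spec=COLUMN_SPEC):
    """Parse CSV text, keeping only the columns the spec lists for this
    file, converted to the declared type and grouped by facet."""
    columns = [c for c in spec if c["file"] == file_name]
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {}
        for c in columns:
            value = PARSERS[c["parse"]](row[c["column"]])
            record.setdefault(c["facet"], {})[c["column"]] = value
        records.append(record)
    return records


# Made-up sample file content; the "note" column is not in the spec and is skipped.
sample = (
    "begin_time,distance_km,fare,note\n"
    "2022-03-01T08:30:00,4.2,11.5,x\n"
    "2022-03-01T09:10:00,7.0,18.0,y\n"
)

records = ingest("trips.csv", sample)
print(records[0])
```

The point of the design is that adding or reclassifying a column means editing a row in the spec, not the ingestion code, which is what would let domain experts contribute without touching the pipeline.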

Does that make sense?

pdehaye commented 1 year ago

I have made a very rough attempt in Fibery: https://hestiaai.fibery.io/Uber_driver_onboarding/New-whiteboard-220

pdehaye commented 1 year ago

This table (and the tables connected to it) could also provide more clarity to Lua. It could be constructed collaboratively, which would be efficient, and reused for the data pipelines. https://hestiaai.fibery.io/fibery/space/Data_tooling/database/Data_columns

pdehaye commented 1 year ago

It feels like it is the arity and the different facets that are overwhelmingly difficult.