pixiedust / pixiedust

Python Helper library for Jupyter Notebooks
https://pixiedust.github.io/pixiedust/
Apache License 2.0

As a data scientist, I want my published PixieApp (via PixieGateway) to refresh automatically when new data comes in, so that it is interactive and always up to date. #580

Open danieljaeckibm opened 6 years ago

danieljaeckibm commented 6 years ago

Expected behavior

It would be ideal if published PixieApps (either via Kubernetes on IBM Data Science Experience or on localhost) could react to scheduled jobs (for instance, when new data is appended to a database table) or even to streaming data flowing in continuously.

Actual behavior

So far I have not been able to make this happen for a published PixieApp. I am aware that it is possible to stream data within a Notebook (e.g. the Twitter sentiment analysis demo), but to make it more accessible to the end user, I want to be able to publish this kind of app as well.

Thanks in advance - PixieDust rocks!!

DTAIEB commented 6 years ago

@danieljaeckibm: Are you referring to PixieApps, charts, or both?

- For charts, it's rather simple: we'll add scheduling options to refresh the data periodically.
- For PixieApps: I've been thinking about addressing this use case for a while. One approach I'm leaning towards is to provide a mechanism for periodically re-running the warmup code of the PixieApp. The idea is that the warmup code loads the static data used by each PixieApp instance driven by user requests. The tricky part is to reload the warmup code without interrupting current user sessions.
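To make the "re-run the warmup code without interrupting sessions" idea concrete, here is a minimal sketch in plain Python. It is not PixieDust or PixieGateway API: `WarmupCache`, `warmup`, and `interval_seconds` are hypothetical names used only for illustration. It assumes the warmup code can be wrapped in a function that returns a fresh data object, so a background thread can build the new snapshot and swap it in while requests already in flight keep the snapshot they were given.

```python
import threading
from typing import Any, Callable, Optional


class WarmupCache:
    """Hypothetical helper (not part of PixieDust): holds the result of a
    warmup function and periodically replaces it with a fresh one."""

    def __init__(self, warmup: Callable[[], Any], interval_seconds: float) -> None:
        self._warmup = warmup
        self._interval = interval_seconds
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None
        self._data = warmup()  # initial load before any request is served

    def get(self) -> Any:
        """Return the current snapshot; callers keep it for their whole request."""
        with self._lock:
            return self._data

    def refresh(self) -> None:
        """Re-run the warmup function, then swap the reference."""
        fresh = self._warmup()  # slow work happens outside the lock
        with self._lock:
            self._data = fresh  # old snapshot stays valid for whoever holds it

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            try:
                self.refresh()
            except Exception:
                # Assumption: on a failed reload, keep serving the old snapshot.
                pass

    def start(self) -> None:
        """Start the periodic refresh in a daemon thread."""
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        """Stop the periodic refresh and wait for the thread to exit."""
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

The design choice here is that readers never see a half-loaded state: the new data is built completely before the reference is swapped, and the warmup function must return a new object each time rather than mutate the old one in place.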

Of course, the assumption is that the data changes infrequently enough that a pull model with scheduled refresh would suffice. I'm curious to know what others think and whether we also need a push model with event notification when data changes.
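For comparison, a push model with event notification could look roughly like the following sketch. Again, this is plain Python and not an existing PixieDust API: `ChangeNotifier`, `subscribe`, `unsubscribe`, and `publish` are hypothetical names. It assumes that whatever writes the new data (a scheduled job, a stream consumer) is able to call `publish` when something changes.

```python
import threading
from typing import Any, Callable, Dict


class ChangeNotifier:
    """Hypothetical in-process publish/subscribe hub for data-change events."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._subscribers: Dict[int, Callable[[Any], None]] = {}
        self._next_token = 0

    def subscribe(self, callback: Callable[[Any], None]) -> int:
        """Register a callback and return a token used to unsubscribe later."""
        with self._lock:
            token = self._next_token
            self._next_token += 1
            self._subscribers[token] = callback
            return token

    def unsubscribe(self, token: int) -> None:
        """Remove a callback, for example when a user session ends."""
        with self._lock:
            self._subscribers.pop(token, None)

    def publish(self, event: Any) -> None:
        """Deliver the event to every current subscriber."""
        with self._lock:
            callbacks = list(self._subscribers.values())
        # Call outside the lock so a callback may subscribe or unsubscribe.
        for callback in callbacks:
            callback(event)
```

In this sketch each user session would subscribe when it starts and unsubscribe when it ends. With the pull model, by contrast, sessions do nothing and simply see fresher data after the next scheduled refresh.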