Open Tom-Willemsen opened 2 years ago
Is the key difficulty here that we cannot use a context manager in a notebook in a convenient manner?
> I think we should discuss alternative approaches for running the data streaming infrastructure outside a notebook environment, for example running the stream listener as a standalone background python task, and having the data streaming plot be a matplotlib plot outside of a notebook environment.
Is there anything in the current implementation that prevents this? That is, is it either-or?
> Is the key difficulty here that we cannot use a context manager in a notebook in a convenient manner?
I think a context manager would only help if the code inside the context manager was blocking? If it's an asyncio call then it wouldn't help, as we still wouldn't have a way to close the old threads/asyncio tasks when the cell gets re-run.
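To make the re-run problem concrete, here is a minimal sketch outside Jupyter. `run_cell` and `listener` are hypothetical stand-ins for a notebook cell and the stream listener, not names from the actual code. Calling `run_cell` twice mimics re-running the cell: nothing stops the first task, so both listeners keep running side by side.

```python
import asyncio


async def listener(name, log):
    # Hypothetical stand-in for a stream listener: records its name forever.
    while True:
        log.append(name)
        await asyncio.sleep(0.01)


def run_cell(log, run_id):
    # Roughly what a non-blocking notebook cell does: schedule a task and return.
    return asyncio.create_task(
        listener(f"run-{run_id}", log), name=f"listener-{run_id}"
    )


async def main():
    log = []
    first = run_cell(log, 1)
    await asyncio.sleep(0.03)
    second = run_cell(log, 2)  # the "cell" is re-run; the first task is untouched
    await asyncio.sleep(0.03)
    still_running = [t.get_name() for t in (first, second) if not t.done()]
    # Clean up explicitly; in a notebook nothing would do this for us.
    for task in (first, second):
        task.cancel()
    await asyncio.gather(first, second, return_exceptions=True)
    return still_running, log


still_running, log = asyncio.run(main())
print(still_running)
```

The old listener only stops here because `main` still holds a reference to it and cancels it by hand, which is exactly the bookkeeping a re-run cell does not get for free.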
Running blocking code in the notebook I feel is not the right approach - even if the issues with plot interactivity could be fixed, there are other issues when trying to re-run cells containing blocking code (need to explicitly break the interpreter, wait for it to timeout, then re-run the cell).
> Is there anything in the current implementation that prevents this? That is, is it either-or?
I think there's probably not anything specific in the implementation that prevents this, beyond the need to produce, test and document an "alternative" approach (if that's what we decide we want).
I'm not currently convinced that the overhead of maintaining both solutions would be worth it, but happy to have my opinion changed on this - what do you see as the advantage of the current solution which we couldn't reproduce in some alternative (e.g. standalone) solution? I guess maybe scientist familiarity with the notebook environment?
> If it's an asyncio call then it wouldn't help as we still wouldn't have a way to close the old threads/asyncio tasks when the cell gets re-run.

Wouldn't the context manager's __exit__ take care of that?
> what do you see as the advantage of the current solution
I do not know enough about the current state... is there an implemented solution, apart from just something that shows how this is possible in a notebook?
> which we couldn't reproduce in some alternative (e.g. standalone) solution?
Not having to write a custom application. But I do not have enough information to tell whether this is really simpler with Jupyter plus, e.g., Voila.
One thing that currently only works in a notebook is the instrument view, so if live streaming into an instrument view is a must-have (I don't know if it is, maybe it's not the most useful visualization to have), then we still have to do things in a notebook, or at least Voila.
> Wouldn't the context manager's __exit__ take care of that?
I'm not sure how this helps. If the task is blocking, then the context manager's __exit__ will never be called, as the task gets forcibly terminated by Jupyter if it's still running after a timeout, I believe. If the task is non-blocking, then the __exit__ would be called immediately?
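For the non-blocking case, a minimal sketch of what I mean, using plain asyncio outside Jupyter. `StreamSession` and `listen` are hypothetical names for illustration only. The body of the `with` block just schedules the task, so `__exit__` runs before the listener has even started.

```python
import asyncio

events = []


class StreamSession:
    """Hypothetical context manager meant to own a background listener."""

    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append("exit")
        return False


async def listen():
    # Hypothetical stand-in for the stream listener.
    events.append("listener started")
    await asyncio.sleep(0.01)
    events.append("listener finished")


async def main():
    with StreamSession():
        task = asyncio.create_task(listen())  # schedules the task, returns at once
        events.append("body done")
    # __exit__ has already run at this point; the listener has not started yet.
    await task


asyncio.run(main())
print(events)
```

So in this form `__exit__` cannot be the place where the listener is shut down, because it fires at the end of the cell body rather than when the user is done with the stream.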
There may be some hook in Jupyter/IPython where we can listen for a "stop" event, but I haven't found one yet...
> I do not know enough about the current state... is there an implemented solution, apart from just something that shows how this is possible in a notebook?
There is some code that displays data-streaming specific widgets in a notebook environment, for example. Parts or all of that might need to be rewritten if we decided to use a different solution. The plotting code should in principle be runnable outside a notebook, but is likely to need tweaking as it's only ever been tested in notebooks. But other than that, I'd say most of the underlying code is independent of running in a notebook or not.
Partially related to this discussion, the consensus is that the current requirements for data streaming are too fuzzy and maybe too ambitious. This appears to be stopping us from making actual progress. Therefore:
We are running into an increasing number of issues/complications with running the data streaming code within a notebook environment, for example:
There are also potential complications with having data streaming in a notebook from a data analysis perspective, where we would want to re-stream reduced data for consumption by analysis programs. We feel that having this in a notebook is error-prone if a user changes the notebook mid-stream.
While these issues may all be fixable, it feels like we are not using notebooks "as intended" here and therefore are exposing ourselves to more bugs/complications than necessary.
I think we should discuss alternative approaches for running the data streaming infrastructure outside a notebook environment, for example running the stream listener as a standalone background python task, and having the data streaming plot be a matplotlib plot outside of a notebook environment.
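As a rough sketch of the standalone shape, assuming a plain thread and queue rather than anything from the existing code: `stream_listener` is a hypothetical stand-in that emits a counter, and the comment in the consumer loop marks where a matplotlib plot update would go. Only the standard library is used.

```python
import queue
import threading
import time


def stream_listener(out, stop):
    # Hypothetical stand-in for the real stream listener: emits a counter.
    n = 0
    while not stop.is_set():
        out.put(n)
        n += 1
        time.sleep(0.005)


def run(max_messages=5):
    messages = queue.Queue()
    stop = threading.Event()
    worker = threading.Thread(
        target=stream_listener, args=(messages, stop), daemon=True
    )
    worker.start()
    received = []
    try:
        while len(received) < max_messages:
            received.append(messages.get(timeout=1.0))
            # A plot update (e.g. redrawing a matplotlib figure) would go here.
    finally:
        # The process owns the listener, so shutdown is explicit and reliable.
        stop.set()
        worker.join(timeout=1.0)
    return received, worker.is_alive()


received, alive = run()
print(received, alive)
```

The point of the sketch is the `finally` block: in a standalone process there is one obvious place to stop the listener, which is what we are missing when a notebook cell gets re-run.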