simonsobs / sisock

Sisock ('saɪsɒk): streaming of Simons Obs. data over websockets for quicklook

Make OCS agents for sisock servers #6

Closed ahincks closed 4 years ago

ahincks commented 6 years ago

Sisock servers should be started/stopped/(monitored?) by OCS agents. We should add this functionality.

@BrianJKoopman, this might be a good project for me to do as it will get me up-to-speed on the OCS. But if it's something for some reason you'd like/need sooner rather than later and want to implement, let me know and you can go ahead.

BrianJKoopman commented 6 years ago

@ahincks I'll put it on my list, but there are some other things I need to prioritize first. If I start working on it I'll let you know (and keep a working branch pushed).

BrianJKoopman commented 5 years ago

I'm not sure this should be the plan anymore, since DataNodeServers are being run in Docker containers. Unless it would be preferred that containers are started/stopped by an OCS agent?

There are modes of running Docker (and other tools, such as Kubernetes, perhaps the most popular of these, originally developed at Google) that work to maintain a configured state for a set of containers. Perhaps that should be explored?
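
To make "ensure a configured state" concrete, here is a minimal Python sketch of the idea: compare the desired number of instances of each service against what is actually running, and work out what to start or stop. This is only an illustration of the concept, not how Docker or Kubernetes implement it. The function `reconcile` and the service names are hypothetical, and nothing here talks to a real container runtime.

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, int],
              running: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """Return the (action, service, count) steps that move `running` toward `desired`.

    Both arguments map a service name to a number of instances.
    Hypothetical helper for illustration only.
    """
    actions = []
    # Visit every service mentioned on either side, in a stable order.
    for name in sorted(set(desired) | set(running)):
        want = desired.get(name, 0)
        have = running.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    return actions


# Hypothetical state: one data server has died and a stale one is still up.
desired_state = {"data-node-server": 2, "hub": 1}
running_state = {"data-node-server": 1, "hub": 1, "stale-server": 1}

for step in reconcile(desired_state, running_state):
    print(step)
```

An orchestrator would run something like this check repeatedly, so a crashed container gets replaced without anyone having to notice and restart it by hand.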

ahincks commented 5 years ago

My thinking was that it would be good to have the DataNodeServers integrated into the OCS so that we can start/stop/monitor them with that one system, as opposed to having another way of switching them on and off.

At the same time, I think we should avoid having lots of layers; or, to put it better, we should aim for a system that is as robust as possible, since live monitoring is fairly critical to the experiment. Right now I'm wondering whether the layer that Docker adds will make things more robust in the long run, or whether it's one more thing to be maintained that might fail or cause incompatibilities or disruptions down the road. (Nothing against Docker in principle, but recall that I'm a noob!)

Does all the OCS stuff just run in the normal environment, or is it also wrapped by something like Docker? Or is sisock a particular case that benefits hugely from Docker?

BrianJKoopman commented 5 years ago

Is there a need to turn off a DataNodeServer once it's running? In my use so far with the diode calibration at Penn, the sisock end is started and left running all the time, while OCS Agents are commonly stopped and restarted during testing (and then left to run during long data acquisition). The hub already monitors the DataNodeServers; should OCS also be monitoring them? (Not necessarily disagreeing with you on either of these points, just raising the questions for discussion.)

I agree that having lots of layers to run should be avoided (and so I don't think OCS should be controlling Docker containers, but not only for that reason). In terms of robustness, I think Docker and the associated tools (like I mentioned, Kubernetes, or Docker Swarm) allow for configurations which are robust. For example, if the hardware for a running system fails, they can spin up equivalent containers on a different computer for continued operation. Certainly there's something to be said about adapting to using the containers (for one, I haven't quite figured out how to debug them well yet) and the learning curve that comes with them, but I do think the system adds more than it inhibits.

All the OCS stuff does just run in the normal environment. The topic of daemonizing Agents has come up recently, with the proposed plan being to somehow plug them into systemd's services. However, the thought of Docker-izing things, particularly Agents, has crossed my mind. There are aspects of some Agents, like the USB communication with the Lakeshore 240s, that have given me pause though. If everything an Agent ever communicated with was just done via the network, then I think it would work well.

BrianJKoopman commented 4 years ago

This issue is pretty stale at this point. Data servers are started/stopped by docker-compose. If other situations need to be handled, a new issue should be opened, or this one should be reopened. Closing for now.