Open ibnesayeed opened 4 years ago
Thank you for the well thought out proposal. I completely agree this would be an awesome feature.
Go, unfortunately, makes this a bit difficult because forking isn't possible to do safely. docker container
can do this because it's actually just firing off an HTTP request. To do this in ipfs, we'd need to:
I would confess that I do not have enough experience around system calls in Go. After your comment I looked at the literature around this issue and found that the situation is not very encouraging. There were proposals for this in the language itself, but people seemed to be reluctant in baking it in the language due to issues when it comes to multi-threading. I have seen an external package for this, but that one does not seem to be cross-platform. That said, allow me to think out loud below:
- Determine the correct location of the daemon (error prone but doable).
Can't we rely on os.Args[0]
for this?
- Re-exec the daemon.
Alternatively, a companion wrapper binary (e.g., ipfs-daemon-runner
) can be shipped for this particular use-case, but this will be no better than having an execution branch in the main binary to handle detached mode. While daemonizing self is tricky, doing so on another instance of self should be alright, especially, if it is done before acquiring any resources (such as, a port) that may cause a conflict in the subsequent call.
- Poll the API until ready (or grep stdout?).
I think, inspecting STDOUT
would be good enough in this case, as it will be a behavior local to the process, which can adapt to any changes in the string (this is not a luxury an external tool will have).
Can't we rely on os.Args[0] for this?
You can look at os.Args[0]
and the PATH
to try to guess which binary was invoked, but it's not foolproof. On linux, it's best to look at /proc/self/exe
.
Take a look at https://godoc.org/github.com/docker/docker/pkg/reexec.
Alternatively, a companion wrapper binary (e.g., ipfs-daemon-runner) can be shipped for this particular use-case, but this will be no better than having an execution branch in the main binary to handle detached mode.
Yeah, building this into the main ipfs
binary is probably simpler.
I think, inspecting STDOUT would be good enough in this case, as it will be a behavior local to the process, which can adapt to any changes in the string (this is not a luxury an external tool will have).
:+1:
You could run ipfs as a user service with systemd. Since ipfs got notify-support with 0.5.0, the systemctl --user start ipfs.service
call will return when the daemon is successfully started (or failed to start).
@RubenKelevra thanks for this info. However, integrating services with an init system is not always an option or preference. For example, when running in a container, setting systemd
up inside will be a big hammer and a repeated overhead.
@ibnesayeed There's no need to setup a container. You can setup ipfs as a system daemon, this way systemd offers a lot of additional hardening options to capsulate the process from the rest of the system.
Take a look at the hardened service file.
@RubenKelevra containers are not always used as needs, they have their own advantages to offer in complex deployment where IPFS is just one of many components. If I have a cluster of nodes where I deploy all my services like web servers, reverse proxies, databases, caches, and whatnot using containers, I would perhaps have hard time setting up IPFS daemon on the host machine directly outside of the overlay network of containers. While IPFS may run fabulously when setup on the host with a powerful init system in place, saying that there is no need to run it in containers is perhaps a little too limiting. Testing is another use-case where running instances in throw away containers is wonderful which ensures that no state information is shared across consecutive runs, unless explicitly done so.
Sure, containers are nice. I was just referring to the capsulation a container do, to protect the rest of the system. There's no need to use a container and setup systemd inside of it, to get these advantages.
Which container solution is affected by the inability to detach, since docker seems to be fine if I understand that correctly?
@RubenKelevra: Which container solution is affected by the inability to detach...
This is a feature request, not a bug report. I did not say that running IPFS in containers is causing any issues as such. If you read the first post above, I have described the need or advantage of having this ability with the help of two real examples where I wished I could detach it. It is possible that what I was trying to achieve in the two examples noted above could have been done more elegantly, but offloading this ability to the daemon itself would save everyone from adding some boilerplate code in many applications in different languages for the same objective.
I see.
If the detaching isn't easy to achieve with Go, we might want to avoid doing all kinds of boiler plating by implementing a command which just returns when the API is reachable, while being able to specify a timeout for this.
So maybe something like this would be easier and less error prone to implement:
ipfs daemon &
ipfs check-api --timeout "2m"
We should have a --detach
flag. It's not particularly hard and has been requested several times at this point.
@RubenKelevra what you suggested is going to solve the repetition of the boilerplate code, but the situation I described in the second example where I had to give up on the STDIO
access of the subprocess from NodeJS will continue to exist. In addition to that, the check-api
sub-command needs to accept more arguments such as port number and host, if those were customized in the original daemon command. Moreover, someone has to decide a timeout, which will be yet another arbitrary number and we will have no cleanup ability after timeout occurs. On the other hand, if this ability is built in the daemon itself, then a timeout configuration may ask the daemon to kill itself if it fails to boot up within that limit.
That said, what I understood from the earlier conversation between @Stebalien and me and from the literature online, it is doable here and not too difficult. Situations where forking and detaching could be an issue in Golang may not apply to IPFS daemon and platforms where locating the binary in question might be unreliable are not the primary ones where we expect many IPFS servers to be running. If we do come across a situation where it is unreliable, we may fix it by improving the documentation and ask the user to use absolute path of the binary instead.
I propose
-d/--detach
flag toipfs daemon
command to automatically fork and detach the process after it is ready to serve. The parent process should take care of the initialization and any reporting on theSTDOUT
before detaching the child, if successful, and report any issues if unsuccessful. This has been discussed briefly in #5983.This is often desired to wait for the daemon to be ready before executing any dependent processes. Current workaround is to run the process in background (i.e., using
&
suffix in *nix environment) and observe the process using an external utility, which is neither ideal nor robust enough, and requires adding boilerplate code in many applications to do the same thing.For example, in IPWB we do the following:
This mostly works, but had some quirks:
curl
This mostly works for us because we are running this whole process in an isolated container where external factors causing failure or collisions are minimal, but others may not have the same luxury.
@Stebalien suggests a workaround in #5983:
However, this approach has some drawbacks. Relying on such string matching is a dangerous proposition as it is brittle. For example, if in a future release, IPFS daemon decides that the phrase can be improved, which is not an unlikely scenario, then application relying on this will start to break. Unixy way of checking status of a process is status codes, not returned messages.
Here is another situation that I encountered recently when creating the IPFS Setup Action to be used in GitHub Workflows CI:
This time, I did not want to wait forever, so I added a hard limit on how many maximum tries it would make to probe the IPFS daemon status. However, when spawning the process in NodeJS, I had to set
stdio
toignore
, otherwise the parent process will continue to wait even after the child is detached. As a side-effect, now I cannot readSTDOUT
of the daemon, which, if I could, I would have set some output variables (such as reported connection information and URIs) that could be useful in the following steps of the workflow pipeline.@Stebalien suggested
--wait
or--fork
flags for this, which are equally as appropriate, but I have seen-d/--detach
to be more commonly used for this purpose (e.g., indocker container run
).