A way to correlate pre- and post-hook for the same API invocation

When implementing powerstrip-logfiles I realized I need to be able to correlate the pre- and post-hook callbacks for a single API invocation. Here's what I would like to do:

powerstrip-logfiles addresses the need to centrally collect in-container log files. To that end, I would like to use bind mount tricks to create a single hierarchy indexed by container ID, so that the log files to be collected for each container are available in one central place, in a directory named after the container ID. Most if not all log file collection agents add the name of the collected file as metadata, so having the container ID in the path is very useful. Here's my idealized directory hierarchy, expanded for an Nginx container to illustrate where the container log files would be:

/var/log/container-logfiles/containers/08db6f63bbca
                                      /8cfc6162a939
                                                   /var
                                                       /log
                                                           /nginx
                                                                 /access.log
                                                                 /error.log
                                      /a05c81291881

In order to accomplish this I need to be able to create a directory named after the container ID. The problem here is that the container ID will only be known after the call to the Create API has returned. However, at this point, I need to have the directory created already, otherwise I can't bind mount it into the container as part of the Create API call.

In other words, I have a pre-hook that figures out which files are log files (indicated currently by the user in the LOGS environment variable). Then I create a directory, and under this directory I bind mount the log files. Because I have to do this during the pre-hook, I don't know the ID of the container. So I create a directory with a random name.
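The pre-hook step described above could look roughly like the following. This is only a sketch, not the actual powerstrip-logfiles code (which is JavaScript). It assumes the Powerstrip adapter payload carries a `ClientRequest` with `Method`, `Request` and a JSON string `Body`, and that the adapter answers with a `ModifiedClientRequest`. It also assumes `LOGS` is a comma-separated list of in-container directories, that binds can be added under `HostConfig.Binds` in the create body, and that the random directories live in a hypothetical `staging` subdirectory.

```python
import json
import os
import tempfile

# Assumed base directory, taken from the hierarchy shown in the issue.
BASE_DIR = "/var/log/container-logfiles"


def pre_hook(payload, base_dir=BASE_DIR):
    """Handle a pre-hook for container create; returns (response, staging_dir)."""
    request = payload["ClientRequest"]
    body = json.loads(request["Body"])

    # Assumption: LOGS is a comma-separated list of in-container directories.
    env = {k: v for k, _, v in (e.partition("=") for e in body.get("Env") or [])}
    log_dirs = [p for p in env.get("LOGS", "").split(",") if p]

    # The container ID is not known yet, so the directory gets a random name.
    staging_root = os.path.join(base_dir, "staging")  # hypothetical location
    os.makedirs(staging_root, exist_ok=True)
    staging_dir = tempfile.mkdtemp(dir=staging_root)

    host_config = body.get("HostConfig") or {}
    binds = host_config.get("Binds") or []
    for log_dir in log_dirs:
        # Mirror the in-container path underneath the staging directory.
        host_dir = os.path.join(staging_dir, log_dir.lstrip("/"))
        os.makedirs(host_dir, exist_ok=True)
        binds.append(f"{host_dir}:{log_dir}")
    host_config["Binds"] = binds
    body["HostConfig"] = host_config

    response = {
        "PowerstripProtocolVersion": 1,
        "ModifiedClientRequest": {
            "Method": request["Method"],
            "Request": request["Request"],
            "Body": json.dumps(body),
        },
    }
    return response, staging_dir
```

Note that `staging_dir` is returned to the caller here only for illustration. In a real adapter that value is exactly the piece of state that is lost before the post-hook runs.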

As part of the post-hook, I would like to create a symlink to the directory I created pre-hook, and the symlink should be named after the container ID. Of course, as I am in the post-hook now, I can get the container ID. However, how do I figure out what directory I created in the pre-hook? The callbacks in Powerstrip are stateless...
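For illustration, the symlink step itself is small once both pieces of information are available. The sketch below assumes the post-hook payload has a `ServerResponse` whose `Body` is the JSON returned by the Create API (with an `Id` field), and that the link is named after the first 12 characters of the ID, as in the hierarchy above. The function name and the `staging_dir` parameter are hypothetical; `staging_dir` is precisely the value the stateless post-hook has no way of knowing.

```python
import json
import os


def link_container_dir(server_response, staging_dir, containers_dir):
    """Create <containers_dir>/<short id> -> staging_dir and return the link path."""
    # Assumption: a successful create response body is JSON with an "Id" field.
    container_id = json.loads(server_response["Body"])["Id"]
    os.makedirs(containers_dir, exist_ok=True)
    # Assumption: the 12-character short ID is used as the directory name.
    link_path = os.path.join(containers_dir, container_id[:12])
    os.symlink(staging_dir, link_path)
    return link_path
```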

I hacked together a solution that shoves the directory path I need to symlink through to the post-hook by synthesizing an environment variable that I then try to read in the post-hook. But currently this doesn't work, because the modified client request, to which I am adding my environment variable, is not passed to the post-hook. The post-hook only sees the unmodified client request.
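A small simulation of that attempt, to make the failure concrete. The variable name `POWERSTRIP_LOGFILES_DIR` and the example path are made up for this sketch; the point is only that the marker lives in the modified request body, while the post-hook is handed the original one.

```python
import json

MARKER = "POWERSTRIP_LOGFILES_DIR"  # hypothetical variable name


def add_marker(create_body, staging_dir):
    """Pre-hook side: return the create body with the marker appended to Env."""
    body = json.loads(create_body)
    body["Env"] = (body.get("Env") or []) + [f"{MARKER}={staging_dir}"]
    return json.dumps(body)


def read_marker(create_body):
    """Post-hook side: return the marker value, or None if it is absent."""
    for entry in json.loads(create_body).get("Env") or []:
        key, _, value = entry.partition("=")
        if key == MARKER:
            return value
    return None


original = json.dumps({"Image": "nginx", "Env": ["LOGS=/var/log/nginx"]})
modified = add_marker(original, "/var/log/container-logfiles/staging/tmpab12")

# The post-hook only sees the original request, so the marker is missing.
print(read_marker(original))  # None
# It would only be found if the modified request were passed along.
print(read_marker(modified))  # /var/log/container-logfiles/staging/tmpab12
```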

I worked around this by patching Powerstrip (see the diff at https://github.com/ClusterHQ/powerstrip/compare/master...raychaser:master) and then doing this as part of my code (line 95ff of https://github.com/raychaser/powerstrip-logfiles/blob/modifiedclientrequest/response.js). I discussed this with @lukemarsden and he suggested I open this issue. To make a long story short: I would like to have the ability to correlate pre- and post-hooks. Here are two possible approaches:

1. Pass the modified client request to the post-hook, along with the original client request and the server response. Technically, this is pretty simple (see above), but it changes the protocol. This approach works for my use case, but sneaking stuff through the modified client request is probably not the world's most elegant approach. Personally, I can't think off the top of my head of other cases in which the post-hook would need access to the modified client request, but I will say that not having it there was a bit of a surprise to me initially, FWIW.

2. Have Powerstrip pass a unique request ID to the pre- and post-hook for each API invocation. The implementer can then maintain state per request, for example in a simple map indexed by the unique request ID. This also changes the protocol, but it is equally easy to implement in Powerstrip, and it seems like a pretty good solution for my use case anyway (cleaner, actually).
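To make the second approach concrete, here is a minimal sketch of what an adapter could do if Powerstrip sent such an ID. The `RequestId` field does not exist in the protocol today; it and all function names below are hypothetical.

```python
import threading


class HookState:
    """Per-request state shared between a pre-hook and its post-hook."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_request = {}

    def remember(self, request_id, value):
        with self._lock:
            self._by_request[request_id] = value

    def take(self, request_id):
        # pop() so that state for finished requests does not pile up.
        with self._lock:
            return self._by_request.pop(request_id, None)


state = HookState()


def on_pre_hook(payload, staging_dir):
    # "RequestId" is a hypothetical field that Powerstrip would have to add.
    state.remember(payload["RequestId"], staging_dir)


def on_post_hook(payload):
    # Returns the directory created in the matching pre-hook, or None.
    return state.take(payload["RequestId"])
```

One caveat with this design: if a post-hook never fires for a request (for example because the call fails before reaching it), the entry stays in the map, so a real adapter would probably also want some form of expiry.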

As a way to work around this without changing Powerstrip, could you inspect the container (API call to /containers/:id/json) in the post-hook to extract the env var you set in the pre-hook?

(You can bind-mount the docker socket in when you start the adapter.)
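A rough sketch of that workaround, using only the Python standard library. It assumes the Docker socket is mounted at its default path `/var/run/docker.sock` inside the adapter and that the inspect JSON lists environment entries as `KEY=value` strings under `Config.Env`. The class and function names are made up for this example.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self._socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self._socket_path)
        self.sock = sock


def inspect_container(container_id, socket_path="/var/run/docker.sock"):
    """Call GET /containers/<id>/json on Docker directly, not via the proxy."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", f"/containers/{container_id}/json")
        response = conn.getresponse()
        if response.status != 200:
            raise RuntimeError(f"inspect failed with HTTP {response.status}")
        return json.loads(response.read())
    finally:
        conn.close()


def env_from_inspect(inspect_json, name):
    """Return the value of environment variable `name`, or None if unset."""
    for entry in (inspect_json.get("Config") or {}).get("Env") or []:
        key, _, value = entry.partition("=")
        if key == name:
            return value
    return None
```

In the post-hook, the adapter would take the container ID from the create response, call `inspect_container` with it, and use `env_from_inspect` to read back the variable set in the pre-hook.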

@progrium what are your thoughts on this use case?

On 14 Mar 2015 21:32, "raychaser" wrote:

[quoted text of the original issue, identical to the description above]

I managed to implement it using an inspect call back to Docker. I was worried about reentrancy initially, but then I realized that I am calling Docker directly, not the proxy. In short, it is now working the way I envisioned. Check it out: https://github.com/raychaser/powerstrip-logfiles

I think allowing for pre- and post-hook correlation is still potentially useful, but I managed to do what I wanted to do for now.

Awesome. Sounds like this will do for now for your use-case, but we can take the correlation requirement into account when designing the official Docker API extension mechanism.

cc @progrium @binocarlos @shykes @icecrime @aluzzardi

Sweet. Thanks for the idea, Luke. Closing this now.

ClusterHQ / powerstrip

A way to correlate pre- and post-hook for the same API invocation #67