Closed: huyz closed this 1 year ago
Hi @huyz,
From the looks of it, the code that handles logging out was committed all the way back in 2014 and hasn't changed since. While I couldn't find a justification in the introducing commit or the related issue, I presume the reason we log out is to remove the authentication information from the Docker config file.
Failure to do so is a potential security risk, since the credentials are stored on disk after logging in through `docker login`, which is what the plugin does by default.
Now regarding this problem, I understand this is an issue when running multiple builds in parallel, since the first one to finish its post-processing step will log out for all the other post-processors still running.
A cleaner approach would be to log out only when all the builds have finished (regardless of success or failure), but this is non-trivial with the current architecture.
Given the security risk of leaving the credentials in place, I would suggest introducing a config option to not invoke `docker logout` at the end of a publish.
This would fix the issue for builds like yours, since `logout` would not be invoked, and you would then be able to publish in parallel.
For other cases, the plugin can continue behaving as it does now.
What do you think? Let me know if you want us to pick up this development; otherwise I'll wait for your reroll.
Thanks for first reporting the problem and suggesting this fix, we appreciate the involvement!
Actually, rather than adding a `logout` option, I just realized that `login` is optional and defaults to `false` (I should never have set `login` to `true`; this is what happens when you copy and paste other people's code without reading the docs 😄). If the user is going to handle the `logout` on their own, out of band, then it makes sense for them to handle the `login` the same way. So the workaround is just to leave `login` at the default `false`.
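For reference, a hedged sketch of what that workaround looks like in a Packer HCL template (this assumes, as the thread establishes, that `login` defaults to `false` and that `docker login` has been run out of band before `packer build`):

```hcl
post-processor "docker-push" {
  # login defaults to false, so this line is redundant; it is kept here
  # only to make the workaround explicit. Run `docker login` yourself
  # before `packer build`, and `docker logout` afterwards if desired.
  login = false
}
```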
So the quickest win would be to add a warning to that effect in the documentation, which I can file as a separate PR.
Since this PR was never intended to implement a proper logout (one that runs only once, when all post-processors are done), which I figured could be difficult to implement, I'll close this PR.
Btw, here are a couple of interesting notes, for anyone who wants to pick up from here:
1) The code comments from 2014 (https://github.com/hashicorp/packer-plugin-docker/blob/e2ec64f91ace29c4a29c26888390f5238a539043/builder/docker/driver.go#L35-L37) talk about "lock the driver", but that comment could be misleading or even just incorrect (maybe it used to be correct). If you read my logs at https://github.com/hashicorp/packer-plugin-docker/issues/141#issue-1565916773, you can see that the second docker-push seems able to log in, and in fact push, while the first docker-push has logged in and pushed.
2) Btw, there may be another solution at the technical level worth considering. Each iteration of the post-processor already creates a temporary Docker config directory: https://github.com/hashicorp/packer-plugin-docker/blob/02122f256bdf61a3f4904d9a27c7367169f653e3/post-processor/docker-push/post-processor.go#L77-L78
If I understand https://docs.docker.com/engine/reference/commandline/login/#configure-the-credentials-store correctly, it may be possible to set `credsStore` to something other than a centralized, external credentials store. Setting it to a dummy value might somehow force the credentials to be written to file in that isolated Docker `config.json`, which each docker-push instance maintains separately. Of course, that's less secure because the credentials are in plaintext (plus, there's no guarantee that a `docker logout` would be performed 100% of the time).
To summarize: for my use case, handling `docker login` on my own is just fine, and code changes aren't worth it.
Regarding the locking statement, there is indeed a mutex that is locked on login and released on logout, so while the comment is accurate with respect to how the plugin itself behaves, it is probably outdated.
My assumption is that this model worked as advertised in older versions of Packer, when plugins did not exist and the code for handling builders/provisioners/post-processors was embedded in Packer itself, but it no longer works now that plugins are separate entities booted by Packer.
With how things work nowadays, each build gets its own instance of the plugin, which is why the lock/unlock approach does not prevent logging in/out during parallel builds: each instance has its own memory space, and therefore its own mutex.
This does suggest we should revisit that piece of code, though, since it may have been relevant at some point but is not anymore.
Again, thanks for the investigative work here @huyz; we'll see what we can do about this. Glad you found a way to make it work for your use case, and thanks for making the implications of using `login = true` clearer.
@lbajolet-hashicorp Ah very interesting. Thanks for the explanation.
It seems the only thing to do here is to remove the `docker logout`. That way, `docker-push` works regardless of the number of post-processors run in parallel. I don't think the `docker logout` is needed, but perhaps I'm not aware of negative consequences of skipping it. If a user really wanted to log out at the end of the `packer build`, they could just do so manually. So maybe a documentation change is warranted?

Closes https://github.com/hashicorp/packer-plugin-docker/issues/141