Lucretius / terraform-provider-drone

A terraform provider for drone.io
MIT License

Configure default TLS config with root CAs #6

alonsodomin closed 4 years ago

alonsodomin commented 4 years ago

I'm getting some weird transport socket errors when using the provider from Terraform Cloud. Since our server RPC is exposed publicly over HTTPS using Let's Encrypt certificates, the kind of error we see makes me think the TLS communication needs to be configured with the OS root CAs.

The code in here is mostly taken from the actual implementation of the Drone CLI, which is able to communicate with the server from Terraform Cloud.

Lucretius commented 4 years ago

Interesting - just to be sure I understand the issue - you are trying to run your build in Terraform Cloud and the provider itself is unable to be configured? Or do you have a defined Drone provider resource that is unable to get built? I have worked with Terraform Cloud before and I know they've recently exposed their IPv4/IPv6 sets that you can whitelist for your own internal networks. If SSL certificates are an issue we may want to have the option to pass in a specific CA certificate and have the provider use that, I'd have to think about the approach.

alonsodomin commented 4 years ago

Sorry, I didn't explain myself well. In Terraform Cloud we have a bunch of workspaces, one for each of our environments, and one of them configures the internal tooling we use for development and similar. These environments are built on Kubernetes, and the one that holds the tooling installs Drone into kube and exposes it via an ingress using TLS certificates automatically issued by Let's Encrypt.

So, once Drone is installed and running, it is reachable under, let's say, https://drone.company.com. This works fine and performs the OAuth authentication step with GitHub as it should and all that. Now we have a separate phase in which we want to configure it, like setting up secrets. For that, right now and since the provider wasn't available in the Terraform Registry, what we do is use a Terraform null_resource that runs a script that downloads the Drone CLI and invokes it locally. Quite a dirty hack but it works, meaning that the HTTPS URL receives the request from the CLI, does its thing, and returns a 200 after each operation.

I wanted to get rid of that hack and use a proper provider implementation, like this one. When configuring the provider in Terraform, it seems to download fine, but when the resource makes the call to apply the change on the server side, the response we get is an rpc error = Unavailable.

To be honest, we haven't fully pinned down whether the issue is with the way the client is used inside the provider or with Terraform Cloud not being able to fetch it (I think the former is more likely). But when going through the code I decided to compare it to the way drone/drone-cli uses the client, and I found that difference regarding the loading of the root CAs.

Now, I also believe that if the error were about the certificates, we should see an SSL handshake error instead of an Unavailable, but I'm ruling out the possibility that our firewall is rejecting the request, because when we use the dirty hack with the same URL, it all works.

For reference, the code I introduced to load the root CAs from the local machine follows this code, which is the one used by the Drone CLI: https://github.com/drone/drone-cli/blob/bed84a32ff0f565b4bd6176e1d87c9b33ec83ed0/drone/internal/util.go#L18
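
To make the idea concrete, here is a minimal sketch of loading the OS root CAs into the HTTP client's TLS config. It is not a copy of the linked file or of this PR's diff: the function names newTLSConfig and newHTTPClient and the skipVerify parameter are hypothetical, and only the Go standard library is used. Note that Go's crypto/tls already falls back to the host's root CA set when RootCAs is nil, so whether setting the pool explicitly is what changes the behavior here is an open question.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"net/http"
)

// newTLSConfig (hypothetical name) returns a TLS config whose trusted roots
// are the certificate authorities installed on the local machine.
func newTLSConfig(skipVerify bool) (*tls.Config, error) {
	// SystemCertPool returns a copy of the operating system's root CA pool.
	pool, err := x509.SystemCertPool()
	if err != nil {
		return nil, err
	}
	return &tls.Config{
		RootCAs:            pool,
		InsecureSkipVerify: skipVerify, // assumption: exposed as an opt-in flag
	}, nil
}

// newHTTPClient (hypothetical name) wires the TLS config into an HTTP client
// that also honours the usual proxy environment variables.
func newHTTPClient(skipVerify bool) (*http.Client, error) {
	cfg, err := newTLSConfig(skipVerify)
	if err != nil {
		return nil, err
	}
	return &http.Client{
		Transport: &http.Transport{
			Proxy:           http.ProxyFromEnvironment,
			TLSClientConfig: cfg,
		},
	}, nil
}
```

The resulting *http.Client would then be the one handed to the Drone API client, presumably after wrapping it with the token-based authentication the provider already sets up.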

Lucretius commented 4 years ago

Interesting, thanks for the context. I think since this provider is in its early days and unlikely to be used by anyone yet, it's probably fine to deploy this to see if it works for you. I do wonder why the Drone client assumes that the trusted root certs of the machine it runs on contain the CA for the Drone server - this won't necessarily be the case, and in your case we're hoping that the Terraform Cloud instance trusts the cert (it probably will, since it's Let's Encrypt). If the client lets us pass in a more clearly defined and specific TLS config, that would probably allow a generic solution for anyone who has SSL issues and runs their client on shared infrastructure, as is the case with Terraform Cloud. For example, pass in a specific cert or reference a specific file containing said cert. What do you think?
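
As a rough illustration of that idea, the sketch below adds a caller-supplied CA certificate (PEM bytes, for instance read from a file named in the provider configuration) on top of the system roots. This is only a sketch under assumptions: the function name tlsConfigWithCA is hypothetical, no such provider option exists in this thread, and only the Go standard library is used.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
)

// tlsConfigWithCA (hypothetical name) trusts the system roots plus the
// certificates found in caPEM, which must hold PEM-encoded CA certificates.
func tlsConfigWithCA(caPEM []byte) (*tls.Config, error) {
	// Start from the system pool; fall back to an empty pool if it is
	// unavailable, so the explicitly supplied CA is still usable.
	pool, err := x509.SystemCertPool()
	if err != nil || pool == nil {
		pool = x509.NewCertPool()
	}
	// AppendCertsFromPEM reports whether at least one certificate was parsed.
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no valid PEM certificates found in CA bundle")
	}
	return &tls.Config{RootCAs: pool}, nil
}
```

The PEM bytes could come from os.ReadFile on a path, or directly from a string attribute, depending on which shape the provider schema ends up taking.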

alonsodomin commented 4 years ago

It does sound like a good approach to me. My suspicion is that the changes I'm adding just load the root CA certificates and enable the underlying HTTP client to resolve the certificate chain.

Lucretius commented 4 years ago

Let's try your change first to see if it resolves your issue and unblocks your use case - I can worry about passing certs directly later.

alonsodomin commented 4 years ago

It took me a little while to be able to test this, but I wanted to let you know that the latest change has worked.

Lucretius commented 4 years ago

Great, thanks for submitting this PR and getting the provider functional on the registry!