GoogleCloudPlatform / pubsub2inbox

Pubsub2Inbox is a versatile, multi-purpose tool to handle Pub/Sub messages and turn them into email, API calls, GCS objects, files or almost anything.
Apache License 2.0

Provide an option to disable SSL cert verification when connecting to remote server. #78

Open qingvincentyin opened 1 week ago

qingvincentyin commented 1 week ago

I followed:

My JFrog Artifactory listens on a private IP only. It is an internal installation in an enterprise and is not meant to be exposed on a public IP. As such, my JFrog's SSL cert (or, more accurately, the cert for its Nginx proxy) is a self-signed cert (or, equivalently for this purpose, a cert signed by my enterprise's internal/private Root CA). It is not signed by a well-known Root CA. This is a legitimate use of a self-signed cert.

I got this error:

  Traceback (most recent call last):
    File "/workspace/main.py", line 768, in process_message_pipeline
      processor_variables = processor_instance.process()
    File "/workspace/processors/docker.py", line 95, in process
      source_registry = Registry(hostname=hostname,
    File "/workspace/_vendor/python_docker/registry.py", line 25, in __init__
      self.detect_authentication()
    File "/workspace/_vendor/python_docker/registry.py", line 29, in detect_authentication
      response = requests.get(f"{self.hostname}/v2/")
    File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/api.py", line 73, in get
      return request("get", url, params=params, **kwargs)
    File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/api.py", line 59, in request
      return session.request(method=method, url=url, **kwargs)
    File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
      resp = self.send(prep, **send_kwargs)
    File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
      r = adapter.send(request, **kwargs)
    File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/adapters.py", line 517, in send
      raise SSLError(e, request=request)
  requests.exceptions.SSLError: HTTPSConnectionPool(host='10.10.10.10', port=443): 
      Max retries exceeded with url: /v2/ 
      (Caused by SSLError(SSLCertVerificationError(1, 
          '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1007)')))

The cause is this line:

Now, requests.get() in general accepts an optional keyword argument, verify, such as:

requests.get("https://example.com", verify=False)

But your code doesn't pass verify; therefore, it gives the user no way to skip SSL cert verification.

I recommend modifying your code to accept an optional verify=True|False setting and pass it through to requests.get(), so that the user has a choice.

P.S. Alternatively, if GCP Cloud Functions provides a way for a user to install an SSL cert, it would solve the problem, too. But I cannot find such a way.
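
On the P.S.: the verify argument of requests can also be a path to a CA bundle file, so shipping the internal Root CA's PEM file alongside the function source might be an alternative to switching verification off. A minimal sketch of resolving such a setting follows; resolve_verify and its tls_verify and ca_bundle arguments are hypothetical names for illustration, not options that pubsub2inbox is known to expose.

```python
import os


def resolve_verify(tls_verify=True, ca_bundle=None):
    """Build a value for the `verify` argument of requests calls.

    Hypothetical helper for illustration; not part of pubsub2inbox.
    """
    if not tls_verify:
        # Verification switched off entirely (the least safe choice).
        return False
    if ca_bundle:
        # Verify against the CAs in this PEM file, e.g. an internal Root CA.
        if not os.path.isfile(ca_bundle):
            raise FileNotFoundError(f"CA bundle not found: {ca_bundle}")
        return ca_bundle
    # Default: verify against the standard CA store.
    return True
```

The result would then be passed along as requests.get(url, verify=resolve_verify(...)). Keeping verification on with a private CA bundle still protects against a man-in-the-middle, which verify=False does not.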

rosmo commented 5 days ago

Could you give the code here a try: https://github.com/GoogleCloudPlatform/pubsub2inbox/pull/79

I added two flags: tls_verify for source repository and destination_tls_verify for target repository.

qingvincentyin commented 4 days ago

I'm very impressed with your fast help! Beyond my expectations, thank you!

Consider this line

I modified the surrounding Terraform so that the config YAML content now looks like this (notice the 2 lines of tls_verify: false):

# Synchronize changes from Artifactory Docker registry into Google Cloud
# Artifact Registry
pipeline:
  - type: processor.genericjson
  - type: output.logger
    config:
      message: |
        Webhook details: {{ data }}

  - type: processor.docker
    runIf: "{% if data.event_type == 'pushed' %}1{% endif %}"
    config:
      mode: image.copy
      hostname: "{{ data.jpd_origin }}"
      username: "******"
      password: "******"
      image: "{{ data.data.repo_key }}/{{ data.data.image_name }}"
      tag: "{{ data.data.tag }}"
      destination_hostname: "https://us-east4-docker.pkg.dev"
      destination_image: "my-********-prj/pubsub2inbox/{{ data.data.image_name }}"
      tls_verify: false

  - type: processor.docker
    runIf: "{% if data.event_type == 'deleted' %}1{% endif %}"
    config:
      mode: image.deleteversion
      hostname: "https://us-east4-docker.pkg.dev"
      image: "my-********-prj/pubsub2inbox/{{ data.data.image_name }}"
      tag: "{{ data.data.tag }}"
      tls_verify: false

Then, when your Cloud Function was triggered by JFrog's webhook, I got this error:

Traceback (most recent call last):
  File "/workspace/main.py", line 768, in process_message_pipeline
    processor_variables = processor_instance.process()
  File "/workspace/processors/docker.py", line 99, in process
    source_tls_verify = self._jinja_expand_bool(self['tls_verify'],
TypeError: 'DockerProcessor' object is not subscriptable
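
For reference, this is the TypeError Python raises when an object whose class defines no __getitem__ is indexed with square brackets. A minimal reproduction follows. The Processor class here is a toy stand-in, and the idea that the YAML options live in a config dict on the object is my assumption, not something I confirmed in the source.

```python
class Processor:
    """Toy stand-in for a processor class; not the real DockerProcessor."""

    def __init__(self, config):
        # Assumption: options from the YAML `config:` block end up in a dict.
        self.config = config

    def broken(self):
        # Indexing the object itself fails: the class defines no __getitem__.
        return self["tls_verify"]

    def fixed(self):
        # Indexing the dict attribute works; default to verifying when unset.
        return self.config.get("tls_verify", True)


p = Processor({"tls_verify": False})
print(p.fixed())
try:
    p.broken()
except TypeError as exc:
    print(exc)
```

If that assumption holds, the fix may be as small as reading the option from the dict instead of from the object.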

At this point, I need your help. It would take me too long to read and understand your Python program's logic in order to debug this. It'll be a lot easier (perhaps even obvious) for you to figure it out, I think.

Better yet, if you can add the necessary Terraform change (in /pubsub2inbox/examples/artifactory-to-artifact-registry/) to your PR so that it is a complete, end-to-end working code, it'll make life a lot easier for me and all future users.

P.S. I ran terraform destroy followed by terraform apply to ensure my installation is clean. I further verified in GCP Console that the deployed Cloud Function is indeed the new version from your PR:

[screenshot]

rosmo commented 4 days ago

Looks like I made a small mistake there (I have a bit of a hard time testing as I don't have the test environment around anymore). I pushed a fix for it; would you be able to retry with that? If it still doesn't work, I'll try to find some time to set up a testing environment.

qingvincentyin commented 4 days ago

I now get this error:

Traceback (most recent call last):
  File "/workspace/main.py", line 768, in process_message_pipeline
    processor_variables = processor_instance.process()
  File "/workspace/processors/docker.py", line 107, in process
    source_registry = Registry(hostname=hostname,
  File "/workspace/_vendor/python_docker/registry.py", line 26, in __init__
    self.detect_authentication()
  File "/workspace/_vendor/python_docker/registry.py", line 32, in detect_authentication
    response = requests.get(f"{self.hostname}/v2/")
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.10/site-packages/requests/adapters.py", line 595, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='10.10.10.10', port=443):
    Max retries exceeded with url: /v2/
    (Caused by SSLError(SSLCertVerificationError(1,
        '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1007)')))

Comparing that with my original stack trace (before your PR), I see that the 2 stack traces differ in the line number of the bottom (most recent) call. The original was adapters.py, line 517, while the newest is adapters.py, line 595. However, on closer inspection, that might just be a small version difference in the library code (adapters.py). The 2 stack traces are essentially identical.

That implies that your PR didn't change the behaviour of this call:

On that thought, I took a closer look at your PR. Your PR tries to modify the session object obtained here:

However, I think that object is only used by and therefore only affects this call:

It isn't used by and therefore doesn't affect this call in our stack trace:

If I'm right, the solution I recommend is one of the following:

  1. Either: You rewrite the detect_authentication() method to send the HTTP request through the session: response = self.session.get(...)
  2. Or: You pass verify to the existing call: response = requests.get(..., verify=...)
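
To illustrate option 1, here is a minimal sketch. It is not your actual Registry class: RegistryClient and the way it takes tls_verify are stand-ins I made up. The point is that when every request goes through one session, a single verify setting also covers the /v2/ probe.

```python
import requests


class RegistryClient:
    """Hypothetical stand-in for the vendored Registry class."""

    def __init__(self, hostname, tls_verify=True):
        self.hostname = hostname
        self.session = requests.Session()
        # `verify` may be True, False, or a path to a CA bundle file.
        self.session.verify = tls_verify

    def detect_authentication(self):
        # Uses self.session, so the session-level verify setting applies.
        # A module-level requests.get() here would ignore self.session.
        return self.session.get(f"{self.hostname}/v2/")
```

Note that with verify=False, urllib3 emits an InsecureRequestWarning for each HTTPS request, so the logs will show that verification is off.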

rosmo commented 2 hours ago

I set up a test environment and did some fixes. Could you try the latest code from the PR?