boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

Listing all buckets does not work on GitLab CI #3337

Closed etienne-monier closed 1 year ago

etienne-monier commented 2 years ago

Describe the bug

I want to write functional tests for my S3-dependent app. To that end, I run a local MinIO Docker image and define a pytest fixture that creates a bucket if it is missing. This works on my laptop, but not on GitLab CI.

Expected Behavior

I expect my pytest fixture to check whether the desired bucket exists.
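
As a rough illustration of that expected behavior, here is a minimal sketch of a "create if missing" helper. It assumes `client` behaves like a boto3 S3 client (something built with `boto3.client("s3", ...)`), which provides `list_buckets()` and `create_bucket(Bucket=...)`. The helper name `ensure_bucket` is hypothetical and is not taken from the reporter's fixture.

```python
def ensure_bucket(client, name):
    """Create bucket `name` unless the endpoint already lists it.

    `client` is assumed to behave like a boto3 S3 client, i.e. to offer
    list_buckets() and create_bucket(Bucket=...).
    Returns True if the bucket was created, False if it already existed.
    """
    # list_buckets() returns a dict whose "Buckets" entry is a list of
    # {"Name": ..., "CreationDate": ...} items.
    listed = client.list_buckets().get("Buckets", [])
    existing = {bucket["Name"] for bucket in listed}
    if name in existing:
        return False
    client.create_bucket(Bucket=name)
    return True
```

Taking the client as a parameter keeps the helper easy to exercise with a stub in unit tests. It does not change how the request reaches the endpoint, so it would likely fail in the same environment as the original test.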

Current Behavior

Here are the gitlab CI logs:

Running with gitlab-runner 13.9.0 (2ebc4dc4)
  on #1 ErsT9zHm
Preparing the "docker" executor 00:03
Using Docker executor with image python:3.9-bullseye ...
Starting service minio/minio:latest ...
Pulling docker image minio/minio:latest ...
Using docker image sha256:32e2cbe1ecdedc708cb7e94964078a46183b1dd91d02cfae7e1215dbaaf47226 for minio/minio:latest with digest minio/minio@sha256:d065effe83ff4e8158c10f98721e6d3db70cc9cea8335761845ece61f16ccabc ...
Waiting for services to be up and running...
Authenticating with credentials from $DOCKER_AUTH_CONFIG
Pulling docker image python:3.9-bullseye ...
Using docker image sha256:bc0c5fcd8e13cdfdc1fa5c96926fd242e25a958ef3a7fcb6c18b2db7423ac849 for python:3.9-bullseye with digest python:3.9-bullseye@sha256:b94cc22fa2a9d2814491a29572b57897ce90fafa62f6c8cc6b85d2aa76069f46 ...
Preparing environment 00:01
Running on runner-erst9zhm-project-6956-concurrent-0 via tu-p02.sis.xxx.fr...
Getting source from Git repository 00:01
Fetching changes with git depth set to 50...
Initialized empty Git repository in test-bug-boto3-on-ci/.git/
Created fresh repository.
Checking out b7474d7e as master...
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:10
Using docker image sha256:bc0c5fcd8e13cdfdc1fa5c96926fd242e25a958ef3a7fcb6c18b2db7423ac849 for python:3.9-bullseye with digest python:3.9-bullseye@sha256:b94cc22fa2a9d2814491a29572b57897ce90fafa62f6c8cc6b85d2aa76069f46 ...
$ export no_proxy=$no_proxy,gitea
$ export NO_PROXY=$no_proxy
$ pip install boto3 pytest
Collecting boto3
  Downloading boto3-1.24.27-py3-none-any.whl (132 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 132.5/132.5 KB 850.4 kB/s eta 0:00:00
Requirement already satisfied: pytest in /usr/local/lib/python3.9/site-packages (7.1.1)
Collecting botocore<1.28.0,>=1.27.27
  Downloading botocore-1.27.27-py3-none-any.whl (9.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.0/9.0 MB 6.6 MB/s eta 0:00:00
Collecting s3transfer<0.7.0,>=0.6.0
  Downloading s3transfer-0.6.0-py3-none-any.whl (79 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.6/79.6 KB 10.7 MB/s eta 0:00:00
Collecting jmespath<2.0.0,>=0.7.1
  Downloading jmespath-1.0.1-py3-none-any.whl (20 kB)
Requirement already satisfied: attrs>=19.2.0 in /usr/local/lib/python3.9/site-packages (from pytest) (21.4.0)
Requirement already satisfied: iniconfig in /usr/local/lib/python3.9/site-packages (from pytest) (1.1.1)
Requirement already satisfied: packaging in /usr/local/lib/python3.9/site-packages (from pytest) (20.9)
Requirement already satisfied: tomli>=1.0.0 in /usr/local/lib/python3.9/site-packages (from pytest) (2.0.1)
Requirement already satisfied: py>=1.8.2 in /usr/local/lib/python3.9/site-packages (from pytest) (1.11.0)
Requirement already satisfied: pluggy<2.0,>=0.12 in /usr/local/lib/python3.9/site-packages (from pytest) (1.0.0)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/local/lib/python3.9/site-packages (from botocore<1.28.0,>=1.27.27->boto3) (1.26.9)
Collecting python-dateutil<3.0.0,>=2.1
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 KB 9.9 MB/s eta 0:00:00
Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.9/site-packages (from packaging->pytest) (3.0.8)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.28.0,>=1.27.27->boto3) (1.16.0)
Installing collected packages: python-dateutil, jmespath, botocore, s3transfer, boto3
Successfully installed boto3-1.24.27 botocore-1.27.27 jmespath-1.0.1 python-dateutil-2.8.2 s3transfer-0.6.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
$ pytest -v
============================= test session starts ==============================
platform linux -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: test-bug-boto3-on-ci
collecting ... collected 1 item
tests/test_s3.py::test_s3 FAILED                                         [100%]
=================================== FAILURES ===================================
___________________________________ test_s3 ____________________________________
    def test_s3():
        """Create a bucket in local minio instance"""
        # Get the resource
        s3 = boto3.resource(
            "s3",
            aws_access_key_id="minioadmin",
            aws_secret_access_key="minioadmin",
            endpoint_url=os.getenv("TEST_ENDPOINT"),
        )

        # Create bucket
>       if s3.Bucket(BUCKET_NAME) not in s3.buckets.all():
tests/test_s3.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/local/lib/python3.9/site-packages/boto3/resources/collection.py:81: in __iter__
    for page in self.pages():
/usr/local/lib/python3.9/site-packages/boto3/resources/collection.py:166: in pages
    pages = [getattr(client, self._py_operation_name)(**params)]
/usr/local/lib/python3.9/site-packages/botocore/client.py:508: in _api_call
    return self._make_api_call(operation_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/client.py:898: in _make_api_call
    http, parsed_response = self._make_request(
/usr/local/lib/python3.9/site-packages/botocore/client.py:921: in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
/usr/local/lib/python3.9/site-packages/botocore/endpoint.py:119: in make_request
    return self._send_request(request_dict, operation_model)
/usr/local/lib/python3.9/site-packages/botocore/endpoint.py:202: in _send_request
    while self._needs_retry(
/usr/local/lib/python3.9/site-packages/botocore/endpoint.py:354: in _needs_retry
    responses = self._event_emitter.emit(
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:412: in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:256: in emit
    return self._emit(event_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:239: in _emit
    response = handler(**kwargs)
/usr/local/lib/python3.9/site-packages/botocore/utils.py:1579: in redirect_from_error
    new_region = self.get_bucket_region(bucket, response)
/usr/local/lib/python3.9/site-packages/botocore/utils.py:1638: in get_bucket_region
    response = self._client.head_bucket(Bucket=bucket)
/usr/local/lib/python3.9/site-packages/botocore/client.py:508: in _api_call
    return self._make_api_call(operation_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/client.py:878: in _make_api_call
    request_dict = self._convert_to_request_dict(
/usr/local/lib/python3.9/site-packages/botocore/client.py:936: in _convert_to_request_dict
    api_params = self._emit_api_params(
/usr/local/lib/python3.9/site-packages/botocore/client.py:969: in _emit_api_params
    self.meta.events.emit(
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:412: in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:256: in emit
    return self._emit(event_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:239: in _emit
    response = handler(**kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
params = {'Bucket': None}
kwargs = {'context': {'auth_type': None, 'client_config': <botocore.config.Config object at 0x7f8a162cb3a0>, 'client_region': '...reaming_input': False}, 'event_name': 'before-parameter-build.s3.HeadBucket', 'model': OperationModel(name=HeadBucket)}
bucket = None
    def validate_bucket_name(params, **kwargs):
        if 'Bucket' not in params:
            return
        bucket = params['Bucket']
>       if not VALID_BUCKET.search(bucket) and not VALID_S3_ARN.search(bucket):
E       TypeError: expected string or bytes-like object
/usr/local/lib/python3.9/site-packages/botocore/handlers.py:270: TypeError
=========================== short test summary info ============================
FAILED tests/test_s3.py::test_s3 - TypeError: expected string or bytes-like o...
============================== 1 failed in 0.65s ===============================
Cleaning up file based variables 00:01
ERROR: Job failed: exit code 1

Reproduction Steps

Here is the full minimal example.

Required packages

boto3 and pytest must be installed in the Python Docker image.

File contents

File tests/test_s3.py

#!/usr/bin/env python
# pylint: disable=redefined-outer-name

import boto3

import os

BUCKET_NAME = "mybucket"
"""The test bucket name"""

def test_s3():
    """Create a bucket in local minio instance"""
    # Get the resource
    s3 = boto3.resource(
        "s3",
        aws_access_key_id="minioadmin",
        aws_secret_access_key="minioadmin",
        endpoint_url=os.getenv("TEST_ENDPOINT"),
    )

    # Create bucket
    if s3.Bucket(BUCKET_NAME) not in s3.buckets.all():
        s3.create_bucket(Bucket=BUCKET_NAME)

    assert True

File Makefile

minio-run:
    @docker run -p 9000:9000 \
                -p 9001:9001 \
                -tid \
                --rm \
                --name minio \
                minio/minio server /data --console-address ":9001"

minio-rm:
    @docker rm -f minio

minio-restart: minio-rm minio-run

test:
    TEST_ENDPOINT=http://localhost:9000 pytest -v

File .gitlab-ci.yml

stages:
  - test

pytest-functional:
  stage: test
  image: python:3.9-bullseye
  services:
    - name: minio/minio
      command: ["server", "/minio"]
      alias: minio
  variables:
    TEST_ENDPOINT: http://minio:9000
  script:
    - export no_proxy=$no_proxy,gitea
    - export NO_PROXY=$no_proxy
    - pip install boto3 pytest
    - pytest -v

To test locally

Run make minio-restart to start minio service and run make test to launch test.

To test on gitlab CI

This requires a GitLab CI runner. To test it, create a repository with the above files and push it to GitLab, then look at the pipeline logs to see the error.

Possible Solution

No idea :(

Additional Information/Context

To launch a GitLab instance with a runner, I found this docker-compose file. To get the root user password, run sudo docker exec -it gitlab grep 'Password:' /etc/gitlab/initial_root_password (cf. this doc)

Yes, that's complicated.

SDK version used

boto3==1.24.27

Environment details (OS name and version, etc.)

Xubuntu 20.04 and GitLab Community Edition 13.9.7

nateprewitt commented 2 years ago

Hi @etienne-monier,

Looking at your pytest output, you're passing in a BUCKET_NAME of None which isn't supported. That's what's causing the error. Is it actually a hardcoded value in your pipeline?

Look at the params on the first line below, the value for Bucket is what you're passing in immediately after your # Create bucket comment. You need to ensure it's a valid type before creating the bucket.

params = {'Bucket': None}
kwargs = {'context': {'auth_type': None, 'client_config': <botocore.config.Config object at 0x7f8a162cb3a0>, 'client_region': '...reaming_input': False}, 'event_name': 'before-parameter-build.s3.HeadBucket', 'model': OperationModel(name=HeadBucket)}
bucket = None
    def validate_bucket_name(params, **kwargs):
        if 'Bucket' not in params:
            return
        bucket = params['Bucket']
>       if not VALID_BUCKET.search(bucket) and not VALID_S3_ARN.search(bucket):
E       TypeError: expected string or bytes-like object
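
A guard along the following lines is one way to "ensure it's a valid type" before any request is made. It is only a sketch: the function name `require_bucket_name` is hypothetical, and the pattern is a simplified reading of the S3 naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit), not botocore's own validation.

```python
import re

# Simplified S3 bucket-name rule (an assumption, not botocore's regex):
# 3-63 chars, lowercase letters, digits, dots, hyphens,
# first and last character must be a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def require_bucket_name(value):
    """Return `value` unchanged if it looks like a usable bucket name."""
    if not isinstance(value, str):
        raise TypeError(f"bucket name must be str, got {type(value).__name__}")
    if not _BUCKET_RE.match(value):
        raise ValueError(f"invalid bucket name: {value!r}")
    return value
```

For example, `s3.Bucket(require_bucket_name(BUCKET_NAME))` would raise a readable `TypeError` if the name were ever `None`.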

etienne-monier commented 2 years ago

I saw this strange error. This is hardcoded in tests/test_s3.py, which is cat-ed above.

nateprewitt commented 2 years ago

To make sure we're looking at the correct failure point, can you change the code to the following and report the stack trace for the failure?

     # Create bucket
+   current_bucket = s3.Bucket(BUCKET_NAME)
+   all_buckets = s3.buckets.all()  # This calls list_buckets under the hood
-    if s3.Bucket(BUCKET_NAME) not in s3.buckets.all():
+   if current_bucket not in all_buckets:
         s3.create_bucket(Bucket=BUCKET_NAME)

The only other case I can see us hitting this code path is if the configured test endpoint is returning an invalid response, causing the list buckets response to attempt creating bucket instances without a name (None).

etienne-monier commented 2 years ago

Here are the logs for the modified code:

$ pytest -v
============================= test session starts ==============================
platform linux -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /test-bug-boto3-on-ci
collecting ... collected 1 item
tests/test_s3.py::test_s3 FAILED                                         [100%]
=================================== FAILURES ===================================
___________________________________ test_s3 ____________________________________
    def test_s3():
        """Create a bucket in local minio instance"""
        # Get the resource
        s3 = boto3.resource(
            "s3",
            aws_access_key_id="minioadmin",
            aws_secret_access_key="minioadmin",
            endpoint_url=os.getenv("TEST_ENDPOINT"),
        )

        # Create bucket
        current_bucket = s3.Bucket(BUCKET_NAME)
        all_buckets = s3.buckets.all()  # This calls list_buckets under the hood
>       if current_bucket not in all_buckets:
tests/test_s3.py:24: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/local/lib/python3.9/site-packages/boto3/resources/collection.py:81: in __iter__
    for page in self.pages():
/usr/local/lib/python3.9/site-packages/boto3/resources/collection.py:166: in pages
    pages = [getattr(client, self._py_operation_name)(**params)]
/usr/local/lib/python3.9/site-packages/botocore/client.py:508: in _api_call
    return self._make_api_call(operation_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/client.py:898: in _make_api_call
    http, parsed_response = self._make_request(
/usr/local/lib/python3.9/site-packages/botocore/client.py:921: in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
/usr/local/lib/python3.9/site-packages/botocore/endpoint.py:119: in make_request
    return self._send_request(request_dict, operation_model)
/usr/local/lib/python3.9/site-packages/botocore/endpoint.py:202: in _send_request
    while self._needs_retry(
/usr/local/lib/python3.9/site-packages/botocore/endpoint.py:354: in _needs_retry
    responses = self._event_emitter.emit(
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:412: in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:256: in emit
    return self._emit(event_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:239: in _emit
    response = handler(**kwargs)
/usr/local/lib/python3.9/site-packages/botocore/utils.py:1579: in redirect_from_error
    new_region = self.get_bucket_region(bucket, response)
/usr/local/lib/python3.9/site-packages/botocore/utils.py:1638: in get_bucket_region
    response = self._client.head_bucket(Bucket=bucket)
/usr/local/lib/python3.9/site-packages/botocore/client.py:508: in _api_call
    return self._make_api_call(operation_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/client.py:878: in _make_api_call
    request_dict = self._convert_to_request_dict(
/usr/local/lib/python3.9/site-packages/botocore/client.py:936: in _convert_to_request_dict
    api_params = self._emit_api_params(
/usr/local/lib/python3.9/site-packages/botocore/client.py:969: in _emit_api_params
    self.meta.events.emit(
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:412: in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:256: in emit
    return self._emit(event_name, kwargs)
/usr/local/lib/python3.9/site-packages/botocore/hooks.py:239: in _emit
    response = handler(**kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
params = {'Bucket': None}
kwargs = {'context': {'auth_type': None, 'client_config': <botocore.config.Config object at 0x7f7045d5a670>, 'client_region': '...reaming_input': False}, 'event_name': 'before-parameter-build.s3.HeadBucket', 'model': OperationModel(name=HeadBucket)}
bucket = None
    def validate_bucket_name(params, **kwargs):
        if 'Bucket' not in params:
            return
        bucket = params['Bucket']
>       if not VALID_BUCKET.search(bucket) and not VALID_S3_ARN.search(bucket):
E       TypeError: expected string or bytes-like object
/usr/local/lib/python3.9/site-packages/botocore/handlers.py:270: TypeError
=========================== short test summary info ============================
FAILED tests/test_s3.py::test_s3 - TypeError: expected string or bytes-like o...
============================== 1 failed in 0.66s ===============================
Cleaning up file based variables 00:01
ERROR: Job failed: exit code 1

nateprewitt commented 2 years ago

Thanks for checking, so this is the response from the Minio server. It's not returning valid contents for our list buckets call that is invoked by .all(). When parsing the response, we appear to be encountering a bucket without a name which can't exist.

I'm not sure there's much we can do in this case, you may need to reach out to Minio to understand their response format and determine where they may be deviating from S3.
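
To check whether a raw reply is a list-buckets document at all, something like the sketch below could be run against a captured response body. It assumes the standard S3 XML namespace and the `ListAllMyBucketsResult` root element. The function name `bucket_names` is hypothetical, and this is not how botocore parses responses.

```python
import xml.etree.ElementTree as ET

# XML namespace used by S3 responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def bucket_names(body):
    """Return the bucket names in a ListBuckets XML body.

    Raises ValueError if the body is not XML or is some other document,
    for example an HTML error page returned by something in between.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        raise ValueError(f"body is not well-formed XML: {exc}") from exc
    if root.tag != S3_NS + "ListAllMyBucketsResult":
        raise ValueError(f"unexpected root element: {root.tag}")
    path = f"{S3_NS}Buckets/{S3_NS}Bucket/{S3_NS}Name"
    return [element.text for element in root.findall(path)]
```

A `ValueError` here would suggest that the reply did not come from an S3-compatible server, which narrows the search to whatever sits between the client and the endpoint.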

etienne-monier commented 2 years ago

OK, after some digging, the problem comes from my company proxy.

It seems that boto3 does not pick up the no_proxy environment variable, so I must disable proxies explicitly in the client (or resource) configuration.

The solution is to change the test module to:

#!/usr/bin/env python
# pylint: disable=redefined-outer-name

import boto3
from botocore.config import Config

import os

BUCKET_NAME = "mybucket"
"""The test bucket name"""

def test_s3():
    """Create a bucket in local minio instance"""

    extra_config = {}

    if os.getenv("BOTO3_EMPY_PROXY") is not None:
        extra_config["config"] = Config(proxies={})

    # Get the resource
    s3 = boto3.resource(
        "s3",
        aws_access_key_id="admin",
        aws_secret_access_key="password",
        endpoint_url=os.getenv("TEST_ENDPOINT"),
        **extra_config,
    )

    # Create bucket
    current_bucket = s3.Bucket(BUCKET_NAME)
    all_buckets = s3.buckets.all()  # This calls list_buckets under the hood
    if current_bucket not in all_buckets:
        s3.create_bucket(Bucket=BUCKET_NAME)

    assert True

I don't know whether this behavior is intended. Let me know.

aBurmeseDev commented 1 year ago

Hi @etienne-monier - Just checking in here to see if you've found a workaround or solution to this. Please let us know if there's anything else we can assist you with.

etienne-monier commented 1 year ago

The workaround was given in my previous comment: explicitly tell boto3 not to use any proxy. This is needed because boto3 does not seem to honor the NO_PROXY environment variable.
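
For anyone debugging a similar setup, the sketch below shows how the Python standard library would route a given endpoint under the current proxy variables. It is a diagnostic aid only: it uses `urllib.request.getproxies_environment` and `proxy_bypass_environment`, and botocore's own proxy resolution may differ. The helper name `proxy_for` is hypothetical. Note that the job script above appends only `gitea` to `no_proxy`, not the `minio` service alias, which may be worth checking.

```python
from urllib.parse import urlsplit
from urllib.request import getproxies_environment, proxy_bypass_environment


def proxy_for(endpoint_url, proxies=None):
    """Return the proxy URL the stdlib would use for endpoint_url, or None.

    `proxies` maps scheme -> proxy URL, with the no_proxy list under "no";
    by default it is read from the *_proxy environment variables.
    """
    if proxies is None:
        proxies = getproxies_environment()
    parts = urlsplit(endpoint_url)
    # True when the host matches an entry of the no_proxy list.
    if proxy_bypass_environment(parts.netloc, proxies):
        return None
    return proxies.get(parts.scheme)


if __name__ == "__main__":
    import os

    # TEST_ENDPOINT is the variable used by the test module in this thread.
    endpoint = os.getenv("TEST_ENDPOINT", "http://minio:9000")
    print(endpoint, "->", proxy_for(endpoint) or "direct connection")
```

Running this in the CI job before pytest would show whether requests to the MinIO endpoint are expected to go through the company proxy.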

tim-finnigan commented 1 year ago

Here is the boto3 documentation on proxies for more reference: https://boto3.amazonaws.com/v1/documentation/api/1.18.55/guide/configuration.html#using-proxies

We are now converting guidance issues to GitHub Discussions so I will convert this issue. Please let us know if you had any additional feedback on this.