getmoto / moto

A library that allows you to easily mock out tests based on AWS infrastructure.
http://docs.getmoto.org/en/latest/
Apache License 2.0

Dynamo import table tests failing on 5.0.9 #7786

Open mweinelt opened 3 weeks ago

mweinelt commented 3 weeks ago

Hi!

We are seeing a number of tests in tests/test_dynamodb/test_dynamodb_import_table.py reliably fail on moto 5.0.9 with boto3 1.34.129 on Python 3.11.9 and 3.12.4.

They were introduced in https://github.com/getmoto/moto/commit/06d0b2a04bff119a967a154077683f70c4309988, but I have not tried 5.0.7 or 5.0.8.

FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_import_from_empty_s3_bucket - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_import_table_single_file_with_multiple_items - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_import_table_multiple_files - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_some_successfull_files_and_some_with_unknown_data - assert 0 == 1
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_only_process_file_with_prefix - AssertionError: assert 'FAILED' == 'COMPLETED'
FAILED tests/test_dynamodb/test_dynamodb_import_table.py::test_process_gzipped_file - AssertionError: assert 'FAILED' == 'COMPLETED'

Please see the complete test stacktraces below:

_______________________ test_import_from_empty_s3_bucket _______________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

table_name = 'moto_test_183402'

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_import_from_empty_s3_bucket(table_name=None):
        client = boto3.client("dynamodb", region_name="us-east-1")
        s3 = boto3.client("s3", region_name="us-east-1")

        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]

        s3.create_bucket(Bucket=s3_bucket_name)

        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]

        import_details = wait_for_import(client, import_description)

>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:108: AssertionError
______________ test_import_table_single_file_with_multiple_items _______________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_import_table_single_file_with_multiple_items():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")

        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]

        s3.create_bucket(Bucket=s3_bucket_name)

        data = ""
        for i in range(5):
            data += (
                json.dumps({"Item": {"pk": {"S": f"msg{i}"}, "data": {"S": f"{uuid4()}"}}})
                + "\n"
            )
        for i in range(10, 15):
            data += json.dumps(
                {"Item": {"pk": {"S": f"msg{i}"}, "data": {"S": f"{uuid4()}"}}}
            )
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=data,
            Key=filename1,
        )

        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]

        import_details = wait_for_import(client, import_description)

>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:169: AssertionError
_______________________ test_import_table_multiple_files _______________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_import_table_multiple_files():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")

        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]

        s3.create_bucket(Bucket=s3_bucket_name)

        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file1),
            Key=filename1,
        )

        items_file2 = {"Item": {"pk": {"S": "msg2"}, "data": {"S": f"{uuid4()}"}}}
        filename2 = "completely_random_filename_without_extension"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file2),
            Key=filename2,
        )

        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]

        import_details = wait_for_import(client, import_description)

>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:244: AssertionError
____________ test_some_successfull_files_and_some_with_unknown_data ____________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_some_successfull_files_and_some_with_unknown_data():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")

        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]

        s3.create_bucket(Bucket=s3_bucket_name)

        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file1),
            Key=filename1,
        )

        items_file2 = {"pk": {"S": "msg2"}, "data": {"S": f"{uuid4()}"}}
        filename2 = "invaliddata"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file2),
            Key=filename2,
        )

        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]

        import_details = wait_for_import(client, import_description)

        assert import_details["ImportStatus"] == "FAILED"
>       assert import_details["ErrorCount"] == 1
E       assert 0 == 1

tests/test_dynamodb/test_dynamodb_import_table.py:310: AssertionError
______________________ test_only_process_file_with_prefix ______________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_only_process_file_with_prefix():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")

        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]

        s3.create_bucket(Bucket=s3_bucket_name)

        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "yesdata.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file1),
            Key=filename1,
        )

        items_file2 = {"Item": {"pk": {"S": "msg2"}, "data": {"S": f"{uuid4()}"}}}
        filename2 = "nodata.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=json.dumps(items_file2),
            Key=filename2,
        )

        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name, "S3KeyPrefix": "yes"},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="NONE",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]

        import_details = wait_for_import(client, import_description)

>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:373: AssertionError
__________________________ test_process_gzipped_file ___________________________
[gw13] linux -- Python 3.12.4 /nix/store/2cvwgfc1cd1c2ii4b08r99m4wrdqqqdj-python3-3.12.4/bin/python3.12

    @pytest.mark.aws_verified
    @dynamodb_aws_verified(create_table=False)
    def test_process_gzipped_file():
        client = boto3.client("dynamodb", region_name="us-east-2")
        s3 = boto3.client("s3", region_name="us-east-1")

        s3_bucket_name = f"inttest{uuid4()}"
        table_name = "moto_test_" + str(uuid4())[0:6]

        s3.create_bucket(Bucket=s3_bucket_name)

        items_file1 = {"Item": {"pk": {"S": "msg1"}, "data": {"S": f"{uuid4()}"}}}
        filename1 = "data.json"
        s3.put_object(
            Bucket=s3_bucket_name,
            Body=gzip.compress(json.dumps(items_file1).encode("utf-8")),
            Key=filename1,
        )

        import_description = client.import_table(
            S3BucketSource={"S3Bucket": s3_bucket_name},
            InputFormat="DYNAMODB_JSON",
            InputCompressionType="GZIP",
            TableCreationParameters={
                "TableName": table_name,
                "AttributeDefinitions": [
                    {"AttributeName": "pk", "AttributeType": "S"},
                ],
                "KeySchema": [
                    {"AttributeName": "pk", "KeyType": "HASH"},
                ],
                "BillingMode": "PAY_PER_REQUEST",
            },
        )["ImportTableDescription"]

        import_details = wait_for_import(client, import_description)

>       assert import_details["ImportStatus"] == "COMPLETED"
E       AssertionError: assert 'FAILED' == 'COMPLETED'
E         
E         - COMPLETED
E         + FAILED

tests/test_dynamodb/test_dynamodb_import_table.py:429: AssertionError

bblommers commented 3 weeks ago

Hi @mweinelt, what command do you use to run the tests?

The gw13 part of the traceback looks like it comes from the xdist plugin, which suggests the tests are being run in parallel, and that isn't necessarily supported.
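To confirm whether a run is actually going through xdist, one option is to look at the PYTEST_XDIST_WORKER environment variable, which pytest-xdist sets in each worker process (for example to "gw13"). The helper names below are hypothetical and not part of moto or pytest-xdist; this is only a minimal sketch.

```python
import os


def xdist_worker_id(environ=None):
    """Return the xdist worker id (e.g. "gw13"), or None outside an xdist worker.

    `environ` is a hypothetical override used only to make the helper easy to test;
    by default the real process environment is inspected.
    """
    env = os.environ if environ is None else environ
    return env.get("PYTEST_XDIST_WORKER")


def describe_run(environ=None):
    """Summarise whether the current process looks like an xdist worker."""
    worker = xdist_worker_id(environ)
    if worker is None:
        return "serial run (no xdist worker detected)"
    return f"parallel run on xdist worker {worker}"
```

Calling describe_run() from a fixture or a failing test would show whether the gw prefix in the traceback really corresponds to a parallel worker.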

mweinelt commented 3 weeks ago

Yes, we are running them massively in parallel, with up to 40 cores on a few of my machines. But I've also seen it with 6 cores on an 8700K.

python3 -m pytest -m "not network and not requires_docker" --dist loadscope --numprocesses=0

mweinelt commented 3 weeks ago

The only way to explicitly serialize a group of tests is through loadgroup (https://github.com/pytest-dev/pytest-xdist/issues/385#issuecomment-1304877301). But that would mean moving away from loadscope, which is probably undesirable.
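For reference, the loadgroup approach mentioned above would look roughly like the sketch below. It assumes pytest-xdist's xdist_group mark together with --dist loadgroup, which sends all tests sharing a group name to the same worker. The group name and the test bodies are placeholders, not the actual moto tests.

```python
import pytest


# Hypothetical group name; with `--dist loadgroup`, tests that share an
# xdist_group name are scheduled on the same worker and so do not run
# concurrently with each other.
@pytest.mark.xdist_group(name="dynamodb_import")
def test_import_example_one():
    assert True  # placeholder body


@pytest.mark.xdist_group(name="dynamodb_import")
def test_import_example_two():
    assert True  # placeholder body
```

This would be run with something like python3 -m pytest --dist loadgroup, which is why it conflicts with the existing --dist loadscope setting.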

bpandola commented 3 weeks ago

I had these exact same tests fail for me recently on my local machine. I destroyed and recreated my virtual env (Python 3.11.9) for moto (using make init), and all DynamoDB tests are passing again. I was not running with xdist. Are you maybe caching your dependencies, on CI or locally? One or more of them might be out of date or out of sync.
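A quick way to compare environments is to print the installed versions of the relevant packages using the standard library's importlib.metadata. The package list below is an assumption based on the versions mentioned in this thread, and the function name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("moto", "boto3", "botocore")):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print one line per package so two environments can be compared side by side.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in both the failing environment and a freshly created one should make a stale or mismatched dependency easier to spot.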