aws-cloudformation / cloudformation-coverage-roadmap

The AWS CloudFormation Public Coverage Roadmap
https://aws.amazon.com/cloudformation/
Creative Commons Attribution Share Alike 4.0 International

Glue Iceberg Table: Table is broken after any update #1919

Open padaszewski opened 8 months ago

padaszewski commented 8 months ago

Name of the resource

AWS::Glue::Table

Resource Name

No response

Issue Description

Hi there! When I update anything on my Iceberg table, the update breaks the table and the table format disappears. In effect, it is no longer an Iceberg table and no operations on it are possible.

Expected Behavior

When I update the table, the update does not remove the table input and I can work with the iceberg table as I should.

Observed Behavior

Before the update (after initial deployment):

image

After any update:

image

Notice that the table format property is missing. The table management property is also gone.

Athena before update: Screenshot 2024-02-2 at 14 23 06

Athena after update: image image

Test Cases

Simple CDK Stack to reproduce this behavior (uncomment one column to update, or do any other update):

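import * as cdk from 'aws-cdk-lib';
import { Aws, RemovalPolicy } from 'aws-cdk-lib';
import { CfnDatabase, CfnTable } from 'aws-cdk-lib/aws-glue';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

export class CdkTestingStack extends cdk.Stack {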
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const myTestDatabase = new CfnDatabase(this, 'myTestDatabase', {
      catalogId: Aws.ACCOUNT_ID,
      databaseInput: {
        name: 'mytestdatabase'
      }
    })

    const myLocationBucket = new Bucket(this, 'myLocationBucket', {
      removalPolicy: RemovalPolicy.DESTROY,
      autoDeleteObjects: true
    })

    const myTestTable = new CfnTable(this, 'myTestTable', {
      databaseName: 'mytestdatabase',
      catalogId: Aws.ACCOUNT_ID,
      tableInput: {
        name: 'mytesttable',
        storageDescriptor: {
          columns: [
            {
              name: 'name',
              type: 'string'
            },
            // {
            //   name: 'ts',
            //   type: 'timestamp'
            // }
          ],
          location: `s3://${myLocationBucket.bucketName}/mytesttable/`,
        },
        tableType: 'EXTERNAL_TABLE',
      },
      openTableFormatInput: {
        icebergInput: {
          metadataOperation: 'CREATE',
          version: '2'
        }
      }
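    })

    // databaseName above is a plain string, so CloudFormation cannot infer
    // that the table must be created after the database; declare it explicitly.
    myTestTable.addDependency(myTestDatabase)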
  }
}

Other Details

No response

padaszewski commented 8 months ago

@sfgarcia @oleksiiburov @dmschauer Tagging you, as you were active on other Iceberg issues. Hope you don't mind. Maybe you have a workaround other than creating the table with an Athena query.

dmschauer commented 8 months ago

@padaszewski My workaround would indeed be to use a custom resource with the Athena API (issuing queries via awswrangler in a Lambda function). A custom implementation for creating and deleting the table is straightforward, and I have already implemented such a custom resource. Covering schema changes to the existing table via this custom resource could also be implemented, but it's more complex: it would work by comparing the existing columns and types to the newly supplied columns and types and issuing the corresponding ALTER TABLE statements. But I see you're looking for a solution that avoids Athena, so I think that won't help here.
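The schema comparison described in this comment could be sketched roughly as follows. This is not dmschauer's implementation: the `Column` interface and the `planAlterStatements` function are hypothetical names made up for illustration, and the sketch only builds the statement strings. It assumes the custom resource would then submit each statement to Athena, and it compares types as plain strings, so `String` and `string` would count as a type change.

```typescript
// Hypothetical column shape, mirroring the name/type pairs used in the CDK stack.
interface Column {
  name: string;
  type: string;
}

// Compare the existing schema with the desired one and return the Athena
// ALTER TABLE statements that would bring the table in line.
// Assumption: `table` is already a safe, qualified identifier (no quoting is done here).
function planAlterStatements(
  table: string,
  existing: Column[],
  desired: Column[],
): string[] {
  const existingTypes = new Map<string, string>(
    existing.map((c) => [c.name, c.type] as [string, string]),
  );
  const desiredNames = new Set<string>(desired.map((c) => c.name));
  const statements: string[] = [];

  // Columns that exist today but are absent from the desired schema are dropped.
  for (const col of existing) {
    if (!desiredNames.has(col.name)) {
      statements.push(`ALTER TABLE ${table} DROP COLUMN ${col.name}`);
    }
  }

  // New columns are added in a single statement.
  const added = desired.filter((c) => !existingTypes.has(c.name));
  if (added.length > 0) {
    const defs = added.map((c) => `${c.name} ${c.type}`).join(', ');
    statements.push(`ALTER TABLE ${table} ADD COLUMNS (${defs})`);
  }

  // Columns present on both sides with a different type get a CHANGE COLUMN.
  for (const col of desired) {
    const oldType = existingTypes.get(col.name);
    if (oldType !== undefined && oldType !== col.type) {
      statements.push(
        `ALTER TABLE ${table} CHANGE COLUMN ${col.name} ${col.name} ${col.type}`,
      );
    }
  }

  return statements;
}

// Usage: the update from the test case (adding the 'ts' column).
const planned = planAlterStatements(
  'mytestdatabase.mytesttable',
  [{ name: 'name', type: 'string' }],
  [
    { name: 'name', type: 'string' },
    { name: 'ts', type: 'timestamp' },
  ],
);
console.log(planned);
// → [ 'ALTER TABLE mytestdatabase.mytesttable ADD COLUMNS (ts timestamp)' ]
```

Whether Athena accepts a given type change on an Iceberg table depends on Iceberg's schema evolution rules, so a real custom resource would need to handle failed statements rather than assume every planned change succeeds.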

padaszewski commented 8 months ago

Thx @dmschauer for the reply. If AWS doesn't ship this along with the Iceberg table partitioning feature request, then there is currently no other way to achieve this than running Athena queries from a custom resource on deployment. Iceberg tables are critical for our use case, and it's sad that such a great thing is not well supported via IaC.

sfgarcia commented 7 months ago

Hi @padaszewski. I would also like AWS to fully support managing Iceberg tables (create/update) through IaC. On my team we don't manage our Iceberg tables as IaC (we create and update them with Athena queries) because of this limitation.

padaszewski commented 7 months ago

Hi @sfgarcia, thx for the reply. We decided to do the same, but with CustomResources as IaC.
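For readers following the same route, the create step of such a custom resource mostly comes down to building an Athena `CREATE TABLE` statement with `'table_type'='ICEBERG'`. The sketch below is an illustration only, not the code used by anyone in this thread: `IcebergTableSpec`, `ColumnDef`, and `buildCreateTableStatement` are hypothetical names, the S3 location is a placeholder, and the resulting string is assumed to be submitted to Athena by the custom resource handler.

```typescript
// Hypothetical input shapes for illustration.
interface ColumnDef {
  name: string;
  type: string;
}

interface IcebergTableSpec {
  database: string;
  table: string;
  columns: ColumnDef[];
  location: string; // S3 prefix for the table data and metadata
  partitionedBy?: string[]; // column names or Iceberg transforms such as 'day(ts)'
}

// Build the Athena DDL for creating an Iceberg table.
// Assumption: all identifiers and the location are trusted input (no escaping is done).
function buildCreateTableStatement(spec: IcebergTableSpec): string {
  const cols = spec.columns.map((c) => `${c.name} ${c.type}`).join(', ');
  const partition =
    spec.partitionedBy && spec.partitionedBy.length > 0
      ? ` PARTITIONED BY (${spec.partitionedBy.join(', ')})`
      : '';
  return (
    `CREATE TABLE ${spec.database}.${spec.table} (${cols})` +
    `${partition} LOCATION '${spec.location}'` +
    ` TBLPROPERTIES ('table_type'='ICEBERG')`
  );
}

// Usage with the table from the test case; the bucket name is a placeholder.
const ddl = buildCreateTableStatement({
  database: 'mytestdatabase',
  table: 'mytesttable',
  columns: [
    { name: 'name', type: 'string' },
    { name: 'ts', type: 'timestamp' },
  ],
  location: 's3://my-location-bucket/mytesttable/',
  partitionedBy: ['day(ts)'],
});
console.log(ddl);
```

Because the table is then owned by Athena rather than by an `AWS::Glue::Table` resource, later stack updates no longer pass through the Glue resource update that breaks the table in this issue.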

svdgraaf commented 5 months ago

Just a +1 here, this is still an issue. In addition, when creating a resource with a reference to a schema version, the columns do not appear to be loaded into the metadata file.

jhosmanfriasbravo commented 5 months ago

hey! +1 👀 👀 👀

blaxx commented 5 months ago

Same here, would love to be able to create/update partitioned Iceberg tables using the CDK.

cyberst commented 5 months ago

I would love to be able to create/update partitioned Iceberg tables using the CloudFormation/CDK too.

mehdimld commented 5 months ago

+1

ijtarano commented 4 months ago

+1 big concern for Cepsa's team...

emiliogarcia-cps commented 4 months ago

+1

jmartinez-cps commented 4 months ago

+1

armaseg commented 4 months ago

+1

FAGUILERAM2022 commented 4 months ago

+1

aitormagan commented 4 months ago

+1

JesusAndres2 commented 4 months ago

+1

etjess commented 4 months ago

+1

romancepsa commented 3 months ago

+1

Rizxcviii commented 3 months ago

+1

raycomh commented 2 months ago

+1

Smotrov commented 1 month ago

+1