Affected Resource: aws_glue_catalog_table
Defining only schema_version_id in the schema_reference block of an aws_glue_catalog_table resource, as described in the HashiCorp documentation, causes an error in our pipeline.
According to the documentation, it should be possible to set only schema_version_id, instead of schema_id and schema_version_number, in the schema_reference block.
Because of this bug, the pipeline cannot run successfully.
Expected Behavior
The correct schema and schema version should be referenced successfully by the Glue table, and the pipeline should run to completion.
Actual Behavior
The pipeline fails.
Relevant Error/Panic Output Snippet
Terraform reports the following error:
│ Error: Missing required argument
│
│ on ../../modules/glue/table/main.tf line 101, in resource "aws_glue_catalog_table" "test_table_linking_issue":
│ 101: schema_reference {
│
│ The argument "schema_version_number" is required, but no definition was
│ found.
╵
When both schema_version_id and schema_version_number are provided, the AWS API rejects the request with this error instead:
Error: updating Glue Catalog Table (771887822597:testing_data_lake:test_table_linking_issue): InvalidInputException: No other input parameters can be specified when fetching by SchemaVersionId.
│ {
│ RespMetadata: {
│ StatusCode: 400,
│ RequestID: "064a8971-b702-4d59-a005-cf765486f5f0"
│ },
│ Message_: "No other input parameters can be specified when fetching by SchemaVersionId."
│ }
│
│ with module.main.module.glue.module.table.aws_glue_catalog_table.test_table_linking_issue,
│ on ../../modules/glue/table/main.tf line 85, in resource "aws_glue_catalog_table" "test_table_linking_issue":
│ 85: resource "aws_glue_catalog_table" "test_table_linking_issue" {
More generally, the link between AWS Glue schemas and AWS Glue tables is incomplete when both are managed with Terraform.
Example problems:
When an AWS Glue schema changes and a new schema version is created, a related table does not automatically update to use the latest schema version. There is already an open issue for this: https://github.com/hashicorp/terraform-provider-aws/issues/25774
A table that references a schema created manually in the AWS console is displayed differently under the "Edit Schema" button in the AWS Glue console than a table whose schema and table were both created with Terraform.
The data validation functionality of Kinesis Firehose delivery streams does not work. I believe this is because the Glue table does not reference the schema correctly, so Firehose cannot recognise the patterns defined in the schema and cannot validate data.
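For comparison, the alternative path mentioned above (schema_id plus schema_version_number) can be sketched as follows. This is only an illustration under assumptions: the registry, schema, and table names and the Avro definition are hypothetical, and I have not verified that reading the version number from the latest_schema_version attribute of aws_glue_schema avoids the stale-version behaviour described in the linked issue.

```hcl
# Hypothetical registry and schema; names and the Avro definition are placeholders.
resource "aws_glue_registry" "example" {
  registry_name = "example-registry"
}

resource "aws_glue_schema" "example" {
  schema_name   = "example-schema"
  registry_arn  = aws_glue_registry.example.arn
  data_format   = "AVRO"
  compatibility = "NONE"

  schema_definition = jsonencode({
    type   = "record"
    name   = "example"
    fields = [{ name = "id", type = "string" }]
  })
}

resource "aws_glue_catalog_table" "example" {
  name          = "example_table"
  database_name = "testing_data_lake" # database name taken from the error log above

  storage_descriptor {
    schema_reference {
      # Reference the schema by ARN rather than by schema_version_id.
      schema_id {
        schema_arn = aws_glue_schema.example.arn
      }

      # Assumption: taking the number from the schema resource keeps the table
      # pointed at the latest version. Because the attribute is computed, the
      # table may still lag behind until the state has been refreshed.
      schema_version_number = aws_glue_schema.example.latest_schema_version
    }
  }
}
```

This form satisfies the provider's requirement for schema_version_number, but it does not address the original problem that schema_version_id cannot be used on its own.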
[Screenshots comparing the link between schema and table as shown in the AWS Glue console: one for a table created in Terraform, one for a table created in the AWS Glue console]
I highly recommend taking a closer look at this issue, as I believe the missing functionality described above is caused by a weak or missing link between schemas and tables.
Terraform Core Version
1.5.7
AWS Provider Version
hashicorp/aws 5.21.0, hashicorp/archive 2.4.0
Affected Resource(s)
aws_glue_catalog_table
Terraform Configuration Files
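The original configuration is not included here. The following is a minimal sketch of a configuration that should reproduce the first error. The resource and database names are taken from the error output above. The variable and the placement inside storage_descriptor are assumptions, not the actual module code.

```hcl
# Hypothetical input; a real value would be the UUID of an existing
# Glue schema version.
variable "schema_version_id" {
  type        = string
  description = "ID of the Glue schema version the table should reference"
}

resource "aws_glue_catalog_table" "test_table_linking_issue" {
  name          = "test_table_linking_issue"
  database_name = "testing_data_lake"

  storage_descriptor {
    schema_reference {
      # Only the version ID is set, which the documentation describes as an
      # alternative to schema_id.
      schema_version_id = var.schema_version_id

      # schema_version_number is intentionally omitted. With provider 5.21.0,
      # Terraform then reports:
      #   The argument "schema_version_number" is required, but no definition was found.
      #
      # Adding a version number here as well (for example
      # schema_version_number = 1, a placeholder value) passes validation but
      # is rejected by AWS with:
      #   No other input parameters can be specified when fetching by SchemaVersionId.
    }
  }
}
```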
Steps to Reproduce
Debug Output
No response
Panic Output
No response
Important Factoids
No response
References
This issue may be related to the same general Terraform problem: https://github.com/hashicorp/terraform-provider-aws/issues/25774
Would you like to implement a fix?
None