awslabs / aws-athena-query-federation

The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own data sources and code.
Apache License 2.0

[BUG] glue parse error with connector dynamodb #664

Closed: womblep closed this issue 2 years ago

womblep commented 2 years ago

Describe the bug

A Glue table with a column of type decimal or double causes a parse error in the DynamoDB connector, and Athena falls back to the implied format. Changing the column type to string parses correctly, and Athena gets all the columns.

java.lang.RuntimeException: Parse error, expected < but found null
    at com.amazonaws.athena.connector.lambda.metadata.glue.GlueFieldLexer.lexComplex(GlueFieldLexer.java:89) ~[task/:?]
    at com.amazonaws.athena.connector.lambda.metadata.glue.GlueFieldLexer.lex(GlueFieldLexer.java:68) ~[task/:?]
    at com.amazonaws.athena.connectors.dynamodb.DynamoDBMetadataHandler.convertField(DynamoDBMetadataHandler.java:475) ~[task/:?]
    at com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler.doGetTable(GlueMetadataHandler.java:361) ~[task/:?]
    at com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler.doGetTable(GlueMetadataHandler.java:308) ~[task/:?]
    at com.amazonaws.athena.connectors.dynamodb.DynamoDBMetadataHandler.doGetTable(DynamoDBMetadataHandler.java:232) [task/:?]
    at com.amazonaws.athena.connector.lambda.handlers.MetadataHandler.doHandleRequest(MetadataHandler.java:250) [task/:?]
    at com.amazonaws.athena.connector.lambda.handlers.CompositeHandler.handleRequest(CompositeHandler.java:132) [task/:?]
    at com.amazonaws.athena.connector.lambda.handlers.CompositeHandler.handleRequest(CompositeHandler.java:100) [task/:?]
    at lambdainternal.EventHandlerLoader$2.call(EventHandlerLoader.java:903) [LambdaSandboxJava-byol.jar:?]
    at lambdainternal.AWSLambda.startRuntime(AWSLambda.java:349) [LambdaSandboxJava-byol.jar:?]
    at lambdainternal.AWSLambda.(AWSLambda.java:70) [LambdaSandboxJava-byol.jar:?]
    at java.lang.Class.forName0(Native Method) ~[?:1.8.0_312]
    at java.lang.Class.forName(Class.java:348) [?:1.8.0_312]
    at lambdainternal.LambdaRTEntry.main(LambdaRTEntry.java:150) [LambdaJavaRTEntry-byol.jar:?]
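
For readers unfamiliar with this kind of failure, here is a minimal sketch of how a message like "expected < but found null" can arise. It is a toy lexer, not the SDK's GlueFieldLexer: the class name, the primitive table, and the output type names are all hypothetical. It assumes that any type name missing from the primitive table is treated as a complex type such as array<...>, so a bare decimal falls through to the complex path and no < is found.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Toy illustration only. This is NOT the SDK's GlueFieldLexer;
// every name and mapping below is hypothetical.
final class ToyGlueTypeLexer {
    // Assumed table of primitive type names this toy lexer understands.
    // Note that "decimal" and "double" are deliberately absent.
    private static final Map<String, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put("string", "Utf8");
        PRIMITIVES.put("int", "Int32");
        PRIMITIVES.put("bigint", "Int64");
        PRIMITIVES.put("boolean", "Bool");
    }

    // Resolve a type string: primitives first, otherwise assume a complex type.
    static String lex(String glueType) {
        String primitive = PRIMITIVES.get(glueType.toLowerCase(Locale.ROOT));
        if (primitive != null) {
            return primitive;
        }
        return lexComplex(glueType);
    }

    // Complex types must look like outer<inner>; only array<...> is handled here.
    private static String lexComplex(String glueType) {
        int open = glueType.indexOf('<');
        if (open < 0) {
            // An unknown scalar name such as "decimal" ends up here.
            throw new RuntimeException("Parse error, expected < but found null");
        }
        if (!glueType.endsWith(">")) {
            throw new RuntimeException("Parse error, expected > at end of " + glueType);
        }
        String outer = glueType.substring(0, open);
        String inner = glueType.substring(open + 1, glueType.length() - 1);
        if (!outer.equalsIgnoreCase("array")) {
            throw new RuntimeException("Unsupported complex type: " + outer);
        }
        return "List<" + lex(inner) + ">";
    }

    public static void main(String[] args) {
        System.out.println(lex("string"));
        System.out.println(lex("array<int>"));
        try {
            lex("decimal");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch, string resolves through the primitive table, while decimal raises the same message as the trace above. Whether the real connector fails for this exact reason is an assumption.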

Connector Details (please complete the following information):

glue table properties

{
  "StorageDescriptor": {
    "cols": {
      "FieldSchema": [
        { "name": "date", "type": "string", "comment": "" },
        { "name": "cli", "type": "string", "comment": "" },
        { "name": "action", "type": "string", "comment": "" },
        { "name": "dnis", "type": "string", "comment": "" },
        { "name": "time", "type": "string", "comment": "" },
        { "name": "interaction_id", "type": "string", "comment": "" },
        { "name": "timestamp", "type": "decimal", "comment": "" },
        { "name": "counter", "type": "string", "comment": "" },
        { "name": "menu", "type": "string", "comment": "" },
        { "name": "option", "type": "string", "comment": "" },
        { "name": "fred", "type": "string", "comment": "" },
        { "name": "flow", "type": "string", "comment": "" }
      ]
    },
    "location": "arn:aws:dynamodb:ap-southeast-2:xxxxxxxxxxxx:table/tablename",
    "inputFormat": "",
    "outputFormat": "",
    "compressed": "false",
    "numBuckets": "-1",
    "SerDeInfo": {
      "name": "",
      "serializationLib": "",
      "parameters": {}
    },
    "bucketCols": [],
    "sortCols": [],
    "parameters": {
      "rangeKey": "timestamp",
      "sizeKey": "238",
      "hashKey": "interaction_id",
      "UPDATED_BY_CRAWLER": "webexcc-crawler",
      "CrawlerSchemaSerializerVersion": "1.0",
      "recordCount": "3",
      "averageRecordSize": "79",
      "CrawlerSchemaDeserializerVersion": "1.0",
      "compressionType": "none",
      "classification": "dynamodb",
      "typeOfData": "table"
    },
    "SkewedInfo": {},
    "storedAsSubDirectories": "false"
  },
  "parameters": {
    "rangeKey": "timestamp",
    "sizeKey": "238",
    "hashKey": "interaction_id",
    "UPDATED_BY_CRAWLER": "webexcc-crawler",
    "CrawlerSchemaSerializerVersion": "1.0",
    "recordCount": "3",
    "averageRecordSize": "79",
    "CrawlerSchemaDeserializerVersion": "1.0",
    "compressionType": "none",
    "classification": "dynamodb",
    "typeOfData": "table"
  }
}

henrymai commented 2 years ago

This is fixed with this PR: https://github.com/awslabs/aws-athena-query-federation/pull/730

However, note that we don't plan on cutting a new release anytime soon, so please build and deploy this yourself if you need it right now.

henrymai commented 2 years ago

@womblep

We finally cut a release (2022.30.2) with this fix. Try it out and let us know if you have any problems.