awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

ANTLR tool ParseCancellationException error when load glue catalog #85

Open DevToKKi opened 3 years ago

DevToKKi commented 3 years ago

I'm trying to load a Glue catalog table in a local Glue dev environment using the following code:

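%pyspark
import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext

# Reuse the running SparkContext (or create one) and wrap it in a GlueContext.
glueContext = GlueContext(SparkContext.getOrCreate())

# Load the catalog table as a DynamicFrame; this is the call that raises the error.
productDyF = glueContext.create_dynamic_frame.from_catalog(
    database="sample_db",
    table_name="product",
    transformation_ctx="productDyF",
)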

I get the following error:

Py4JJavaError: An error occurred while calling o260.getCatalogSource.
: org.antlr.v4.runtime.misc.ParseCancellationException: line 1:20 token recognition error at: '?'
    at com.amazonaws.services.glue.schema.io.ThrowingErrorListener.syntaxError(ThrowingErrorListener.java:15)
    at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
    at org.antlr.v4.runtime.Lexer.notifyListeners(Lexer.java:364)
    at org.antlr.v4.runtime.Lexer.nextToken(Lexer.java:144)
    at org.antlr.v4.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:169)
    at org.antlr.v4.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:152)
    at org.antlr.v4.runtime.BufferedTokenStream.consume(BufferedTokenStream.java:136)
    at org.antlr.v4.runtime.Parser.consume(Parser.java:571)
    at org.antlr.v4.runtime.Parser.match(Parser.java:203)
    at com.amazonaws.services.glue.schema.io.grammar.HiveSchemaParser.colTypeList(HiveSchemaParser.java:613)
    at com.amazonaws.services.glue.schema.io.grammar.HiveSchemaParser.structType(HiveSchemaParser.java:414)
    at com.amazonaws.services.glue.schema.io.grammar.HiveSchemaParser.dataType(HiveSchemaParser.java:136)
    at com.amazonaws.services.glue.schema.io.HiveFormatDeserializer.deserializeDataType(HiveFormatDeserializer.java:52)
    at com.amazonaws.services.glue.schema.io.HiveFormatDeserializer.deserializeDataTypeFromString(HiveFormatDeserializer.java:63)
    at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$$anonfun$getFieldsFromColumns$1.apply(DataCatalogWrapper.scala:241)
    at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$$anonfun$getFieldsFromColumns$1.apply(DataCatalogWrapper.scala:240)
    at scala.collection.immutable.List.map(List.scala:288)
    at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$class.getFieldsFromColumns(DataCatalogWrapper.scala:240)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.getFieldsFromColumns(DataCatalogWrapper.scala:94)
    at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$class.getSchema(DataCatalogWrapper.scala:245)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.getSchema(DataCatalogWrapper.scala:94)
    at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$class.catalogPartitionFromGluePartition(DataCatalogWrapper.scala:512)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.catalogPartitionFromGluePartition(DataCatalogWrapper.scala:94)
    at com.amazonaws.services.glue.util.DataCatalogWrapper$$anonfun$13.apply(DataCatalogWrapper.scala:205)
    at com.amazonaws.services.glue.util.DataCatalogWrapper$$anonfun$13.apply(DataCatalogWrapper.scala:204)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.getPartitions(DataCatalogWrapper.scala:204)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.go$1(DataCatalogWrapper.scala:213)
    at com.amazonaws.services.glue.util.DataCatalogWrapper.getAllPartitions(DataCatalogWrapper.scala:220)
    at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:256)
    at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:152)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:745)

(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError('An error occurred while calling o260.getCatalogSource.\n', JavaObject id=o279), <traceback object at 0x7fe0fbc97508>)

maxmargolin commented 1 year ago

Any resolution? 😅 @DevToKKi
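
No confirmed fix is recorded in this thread, but the stack trace narrows down where to look. The exception is raised in HiveSchemaParser while DataCatalogWrapper.getPartitions is deserializing a column type string, and ANTLR's "line 1:20" means line 1, character offset 20 of that string. One possible way to find the offending entry is to scan the column types stored for the table and its partitions. The sketch below is a diagnostic aid, not a fix. The helper names and the set of "expected" characters are assumptions made for illustration; the real Glue grammar may accept a different set. The boto3 calls (get_table and the get_partitions paginator) need AWS credentials with read access to the catalog.

```python
import re

# Assumption: characters we expect in a Hive-style type string such as
# "struct<id:int,tags:array<string>>" or "decimal(10,2)". The actual Glue
# grammar may accept more or fewer characters than this.
_EXPECTED = re.compile(r"[A-Za-z0-9_<>,:() ]")


def first_unexpected_char(type_string):
    """Return (offset, char) of the first unexpected character, or None."""
    for offset, ch in enumerate(type_string):
        if not _EXPECTED.fullmatch(ch):
            return offset, ch
    return None


def suspect_columns(columns):
    """Yield (column name, offset, char) for columns with a suspicious type.

    `columns` is a list of {"Name": ..., "Type": ...} dicts, the shape used
    inside a Glue StorageDescriptor.
    """
    for col in columns:
        hit = first_unexpected_char(col.get("Type", ""))
        if hit is not None:
            yield col["Name"], hit[0], hit[1]


def scan_catalog_table(database, table_name):
    """Print suspicious column types for a table and all of its partitions."""
    import boto3  # imported lazily so the helpers above work without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    for name, offset, ch in suspect_columns(table["StorageDescriptor"]["Columns"]):
        print(f"table column {name}: {ch!r} at offset {offset}")

    # Partitions carry their own column list, which is what the trace is parsing.
    paginator = glue.get_paginator("get_partitions")
    for page in paginator.paginate(DatabaseName=database, TableName=table_name):
        for part in page["Partitions"]:
            cols = part.get("StorageDescriptor", {}).get("Columns", [])
            for name, offset, ch in suspect_columns(cols):
                print(f"partition {part['Values']} column {name}: {ch!r} at offset {offset}")
```

With the names from the original report, this would be run as scan_catalog_table("sample_db", "product"). If it reports a '?' at offset 20 in some partition's column type, that partition's schema is a likely candidate for the string the parser rejects.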