Trying to update the Glue catalog via a local Glue dev environment, using example code from https://docs.aws.amazon.com/glue/latest/dg/update-from-job.html. Getting the following error:

ANTLR Tool version 4.3 used for code generation does not match the current runtime version 4.7.2
20/07/25 19:09:21 INFO DataSink: Table input_mixed_types already exists in database glue_lab with catalogId of
20/07/25 19:09:21 INFO DataSink: Failed to retrieve created table input_mixed_types in database glue_lab after job run with catalogId
20/07/25 19:09:21 INFO DataSink: org.antlr.v4.runtime.misc.ParseCancellationException: line 1:0 no viable alternative at input 'long'
org.antlr.v4.runtime.misc.ParseCancellationException: line 1:0 no viable alternative at input 'long'
at com.amazonaws.services.glue.schema.io.ThrowingErrorListener.syntaxError(ThrowingErrorListener.java:15)
at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
at org.antlr.v4.runtime.DefaultErrorStrategy.reportNoViableAlternative(DefaultErrorStrategy.java:310)
at org.antlr.v4.runtime.DefaultErrorStrategy.reportError(DefaultErrorStrategy.java:136)
at com.amazonaws.services.glue.schema.io.grammar.HiveSchemaParser.dataType(HiveSchemaParser.java:186)
at com.amazonaws.services.glue.schema.io.HiveFormatDeserializer.deserializeDataType(HiveFormatDeserializer.java:52)
at com.amazonaws.services.glue.schema.io.HiveFormatDeserializer.deserializeDataTypeFromString(HiveFormatDeserializer.java:63)
at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$$anonfun$getFieldsFromColumns$1.apply(DataCatalogWrapper.scala:241)
at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$$anonfun$getFieldsFromColumns$1.apply(DataCatalogWrapper.scala:240)
at scala.collection.immutable.List.map(List.scala:278)
at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$class.getFieldsFromColumns(DataCatalogWrapper.scala:240)
at com.amazonaws.services.glue.util.DataCatalogWrapper.getFieldsFromColumns(DataCatalogWrapper.scala:94)
at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$class.getSchema(DataCatalogWrapper.scala:245)
at com.amazonaws.services.glue.util.DataCatalogWrapper.getSchema(DataCatalogWrapper.scala:94)
at com.amazonaws.services.glue.util.DataCatalogWrapperUtils$class.catalogTableFromGlueTable(DataCatalogWrapper.scala:482)
at com.amazonaws.services.glue.util.DataCatalogWrapper.catalogTableFromGlueTable(DataCatalogWrapper.scala:94)
at com.amazonaws.services.glue.util.DataCatalogWrapper$$anonfun$1.apply(DataCatalogWrapper.scala:102)
at com.amazonaws.services.glue.util.DataCatalogWrapper$$anonfun$1.apply(DataCatalogWrapper.scala:97)
at scala.util.Try$.apply(Try.scala:191)
at com.amazonaws.services.glue.util.DataCatalogWrapper.getTable(DataCatalogWrapper.scala:97)
at com.amazonaws.services.glue.DataSink$$anonfun$1.apply$mcV$sp(DataSink.scala:412)
at com.amazonaws.services.glue.DataSink$$anonfun$1.apply(DataSink.scala:412)
at com.amazonaws.services.glue.DataSink$$anonfun$1.apply(DataSink.scala:412)
at scala.util.Try$.apply(Try.scala:191)
at com.amazonaws.services.glue.DataSink$.getCatalogTableWithSinkElseCreateTable(DataSink.scala:411)
at com.amazonaws.services.glue.DataSink$.forwardPotentialDynamicFrameToCatalog(DataSink.scala:207)
at com.amazonaws.services.glue.DataSink$.forwardPotentialDynamicFrameToCatalog(DataSink.scala:167)
at com.amazonaws.services.glue.sinks.HadoopDataSink$$anonfun$writeDynamicFrame$1.apply(HadoopDataSink.scala:237)
at com.amazonaws.services.glue.sinks.HadoopDataSink$$anonfun$writeDynamicFrame$1.apply(HadoopDataSink.scala:141)
at com.amazonaws.services.glue.util.FileSchemeWrapper$$anonfun$executeWithQualifiedScheme$1.apply(FileSchemeWrapper.scala:63)
at com.amazonaws.services.glue.util.FileSchemeWrapper$$anonfun$executeWithQualifiedScheme$1.apply(FileSchemeWrapper.scala:63)
at com.amazonaws.services.glue.util.FileSchemeWrapper.executeWith(FileSchemeWrapper.scala:57)
at com.amazonaws.services.glue.util.FileSchemeWrapper.executeWithQualifiedScheme(FileSchemeWrapper.scala:63)
at com.amazonaws.services.glue.sinks.HadoopDataSink.writeDynamicFrame(HadoopDataSink.scala:140)
at com.amazonaws.services.glue.DataSink.pyWriteDynamicFrame(DataSink.scala:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Traceback (most recent call last):
File "/Users/paulburridge/Projects/Glue/glue_script_ingestion.py", line 115, in <module>
sink.writeFrame(subset_df)
File "/Users/paulburridge/Projects/Glue/aws-glue-libs/PyGlue.zip/awsglue/data_sink.py", line 31, in writeFrame
File "/Users/paulburridge/Projects/Glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/Users/paulburridge/Projects/Glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "/Users/paulburridge/Projects/Glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o48.pyWriteDynamicFrame.
: scala.MatchError: (null,false) (of class scala.Tuple2)
at com.amazonaws.services.glue.DataSink$.forwardPotentialDynamicFrameToCatalog(DataSink.scala:207)
at com.amazonaws.services.glue.DataSink$.forwardPotentialDynamicFrameToCatalog(DataSink.scala:167)
at com.amazonaws.services.glue.sinks.HadoopDataSink$$anonfun$writeDynamicFrame$1.apply(HadoopDataSink.scala:237)
at com.amazonaws.services.glue.sinks.HadoopDataSink$$anonfun$writeDynamicFrame$1.apply(HadoopDataSink.scala:141)
at com.amazonaws.services.glue.util.FileSchemeWrapper$$anonfun$executeWithQualifiedScheme$1.apply(FileSchemeWrapper.scala:63)
at com.amazonaws.services.glue.util.FileSchemeWrapper$$anonfun$executeWithQualifiedScheme$1.apply(FileSchemeWrapper.scala:63)
at com.amazonaws.services.glue.util.FileSchemeWrapper.executeWith(FileSchemeWrapper.scala:57)
at com.amazonaws.services.glue.util.FileSchemeWrapper.executeWithQualifiedScheme(FileSchemeWrapper.scala:63)
at com.amazonaws.services.glue.sinks.HadoopDataSink.writeDynamicFrame(HadoopDataSink.scala:140)
at com.amazonaws.services.glue.DataSink.pyWriteDynamicFrame(DataSink.scala:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
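The failing call is sink.writeFrame(subset_df) at line 115 of the script; it follows the catalog-update sink pattern from the documentation page linked above. Roughly (the S3 path below is a placeholder; glue_lab and input_mixed_types are the database and table from the log, and subset_df is the DynamicFrame named in the traceback):

```python
# Sketch of the catalog-update sink pattern from the linked docs page.
# The S3 path is a placeholder; the database/table names come from the log above.
sink = glueContext.getSink(
    connection_type="s3",
    path="s3://example-bucket/input_mixed_types/",
    enableUpdateCatalog=True,
    updateBehavior="UPDATE_IN_DATABASE",
)
sink.setCatalogInfo(catalogDatabase="glue_lab", catalogTableName="input_mixed_types")
sink.setFormat("glueparquet")
sink.writeFrame(subset_df)
```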
I think this is a known bug (I ran into it myself). If you make sure not to use the type long in your schema, it works. I reported this to AWS a few weeks ago, and supposedly a fix is being worked on.
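One way to avoid the long type is to cast those columns before the write, e.g. with resolveChoice. A minimal sketch (the column name and target type here are just examples, not from the original script):

```python
# Workaround sketch: cast columns typed as long before writing, so the
# schema sent to the Data Catalog never contains "long".
# "row_count" is an example column name; pick a target type that fits
# your data (int is used here purely for illustration).
casted = subset_df.resolveChoice(specs=[("row_count", "cast:int")])
sink.writeFrame(casted)
```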