awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

Packet for query is too large in glue 4.0 while writing #192

Closed andcau closed 1 year ago

andcau commented 1 year ago

Hi, I'm getting the following error in Glue 3.0 and 4.0 while writing to Redshift: "An error occurred while calling o441.count. Packet for query is too large (5,584,499 > 4,194,304). You can change this value on the server by setting the 'max_allowed_packet' variable." Everything works on Glue 2.0. The same happens on some S3 write jobs. I suspect the MySQL driver is the problem, even though I wouldn't expect it to be used in these cases. Is there a way to set that variable via PySpark to work around the problem?

andcau commented 1 year ago

The problem was actually in the read step, not the write. To fix it, the max_allowed_packet parameter was changed on the MySQL server.
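
For anyone hitting the same error, here is a minimal sketch of how one might work out a new server-side value from the message itself. It is not taken from this thread: the helper names and the 2x headroom factor are assumptions for illustration. The only MySQL facts it relies on are that max_allowed_packet is a server variable, that its value is a multiple of 1024, and that its upper bound is 1 GiB.

```python
import re

# Upper bound MySQL accepts for max_allowed_packet (1 GiB).
MAX_ALLOWED_PACKET_LIMIT = 1024 ** 3


def parse_packet_error(message):
    """Extract (packet_bytes, limit_bytes) from a 'Packet for query is too large' message."""
    match = re.search(
        r"Packet for query is too large \(([\d,]+) > ([\d,]+)\)", message
    )
    if match is None:
        raise ValueError("not a packet-size error message")
    # The driver prints the numbers with thousands separators; strip them.
    packet, limit = (int(group.replace(",", "")) for group in match.groups())
    return packet, limit


def suggested_max_allowed_packet(packet_bytes, headroom=2.0):
    """Hypothetical sizing rule: packet size times headroom, rounded up to a
    multiple of 1024 and capped at the 1 GiB upper bound."""
    target = int(packet_bytes * headroom)
    rounded = -(-target // 1024) * 1024  # ceiling to the next multiple of 1024
    return min(rounded, MAX_ALLOWED_PACKET_LIMIT)


def set_global_statement(value):
    """Build the SQL statement an administrator would run on the MySQL server."""
    return f"SET GLOBAL max_allowed_packet = {value};"


if __name__ == "__main__":
    error = (
        "Packet for query is too large (5,584,499 > 4,194,304). "
        "You can change this value on the server by setting the "
        "'max_allowed_packet' variable."
    )
    packet, limit = parse_packet_error(error)
    print(set_global_statement(suggested_max_allowed_packet(packet)))
```

The statement has to be run on the MySQL server by a user with sufficient privileges, and a value set with SET GLOBAL does not survive a server restart unless it is also put in the server configuration. On a managed database the variable is usually changed through the service's own parameter settings instead.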