Thank you for the very detailed bug description, @Romanmc72. Would you be willing to contribute and help us with this yourself? Check out our contributing guide if you're interested 🙂
Happy to help! I will take a look there and see where I can get started.
Let me know how the PR looks; happy to change if need be.
The PR I raised was closed a year ago because the fix was deemed unnecessary: best practice is to always specify the order of columns explicitly in any SQL one writes. That is a fair callout, and as such this is not really a major bug that requires a fix.
I do want to make some additional callouts on the above use of `Promise.all()` and `executeStatement()` to run queries in parallel for a given handler. It may not be an issue in this particular case given the best practice above, but it does cause havoc in another; let me briefly explain, relying on the reader to leverage the above examples to see what I mean.

The case where we still want to synchronize and/or batch some SQL as a transaction is the `UserPrivileges` construct. I have encountered issues when more than one user needs access to a particular table. If I grant those users access to the same database object during the same `cdk deploy`, I will almost always get a concurrent-transaction conflict. Why is this? Redshift (and the underlying Postgres) acquires a lock on the permissions of a table object when changing them: it grants the privilege by adding a user to that permission, then releases the lock and moves on. In the case of our handler, permissions are granted in parallel against the same object, so the first call acquires the lock and the subsequent calls fail with a concurrent-transaction conflict. This causes the update to fail and roll back, often ending in `UPDATE_ROLLBACK_FAILED`, because the rollback just does the inverse (if the update was a grant it tries a revoke, or vice versa) and hits the same issue. Moving those operations to a `batchExecuteStatement()` call would eliminate that error entirely.

As I had originally suggested moving everything to the batch statement, I now realize that would be a bad idea for the following reason: the batch execute runs in a transaction, but certain DDL statements cannot be run inside a transaction. As such, I think the CDK should support both the parallel and the batch approach for different scenarios, because they solve different problems. I would be happy to revise my PR or submit a new one with only these changes in mind for the user-privilege construct. A rough sketch of the two shapes follows below.
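To make the two shapes concrete, here is a rough sketch (not the handler's actual code; the cluster wiring values are placeholders) using the AWS SDK redshift-data client:

```ts
import {
  RedshiftDataClient,
  ExecuteStatementCommand,
  BatchExecuteStatementCommand,
} from '@aws-sdk/client-redshift-data';

const client = new RedshiftDataClient({});
const target = {
  ClusterIdentifier: 'my-cluster',          // placeholder
  Database: 'dev',                          // placeholder
  SecretArn: 'arn:aws:secretsmanager:...',  // placeholder
};
const grants = [
  'GRANT SELECT ON my_table TO user_a',
  'GRANT SELECT ON my_table TO user_b',
];

async function parallelGrants(): Promise<void> {
  // Current shape: both GRANTs race for the same table's permission lock;
  // the loser fails with a concurrent-transaction conflict.
  await Promise.all(
    grants.map((Sql) => client.send(new ExecuteStatementCommand({ ...target, Sql }))),
  );
}

async function batchedGrants(): Promise<void> {
  // Proposed shape for user privileges: one batch runs the statements in
  // order inside a single transaction, so the lock is taken only once.
  await client.send(new BatchExecuteStatementCommand({ ...target, Sqls: grants }));
}
```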
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
Describe the bug
When using the `Table` construct in the `@aws-cdk/aws-redshift` module, if you have a table that already exists and you then modify that table by adding more than one column, the ordering of those columns is not guaranteed. Creating tables works correctly because the entire operation runs in one `CREATE TABLE` SQL statement; however, adding more than one new column at a time will create the columns out of order in a potentially erratic and unpredictable manner. The reason is that new columns are added with parallel asynchronous calls of `ALTER TABLE <tableName> ADD COLUMN <columnName> <dataType>` SQL statements. This can be seen here, where the `ALTER TABLE` statements are added to an array, and then here, where those statements are executed asynchronously against the Redshift Data API. Whichever request arrives at the API first wins, instead of the statements being executed in the order the developer specified in their CDK code.

Expected Behavior
As a user of the TypeScript version of this library: arrays in TypeScript are ordered (as they are in most languages), so when I have an ordered array of column definitions, my expectation is that that order is preserved when applied to my table, and that the columns in my CDK code match the columns in my Redshift table in both data type and ordering.
Current Behavior
Currently, if I have a table and add more than one new column to it at a time, the order of those columns will, at random, not match the order in my `tableColumns` array, even between deployments to different isolated environments that use the same templates.

Reproduction Steps
If you only add 2 or 3 columns then there is a chance that they just happen to resolve in the correct order. I have added 10 here and am able to reproduce this bug every time I run it (ran 4 times). The code below worked for me as-is (a sketch of the stack follows these steps).

`npm run build && cdk deploy RedshiftStack`

4. Add 10 new columns to that original table.

`npm run build && cdk deploy RedshiftStack`
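For reference, a stack along these lines reproduces it (a hedged sketch, not my exact code; the VPC and cluster are minimal placeholders):

```ts
import { App, Stack } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
// On CDK v2 the module ships as '@aws-cdk/aws-redshift-alpha'.
import * as redshift from '@aws-cdk/aws-redshift-alpha';

class RedshiftStack extends Stack {
  constructor(scope: App, id: string) {
    super(scope, id);
    const vpc = new ec2.Vpc(this, 'Vpc');
    const cluster = new redshift.Cluster(this, 'Cluster', {
      masterUser: { masterUsername: 'admin' },
      vpc,
    });
    new redshift.Table(this, 'Table', {
      cluster,
      databaseName: 'default_db',
      tableColumns: [
        { name: 'col_0', dataType: 'varchar(8)' },
        // First deploy with just col_0, then add col_1 through col_10
        // here and deploy again; the new columns land in arbitrary order.
      ],
    });
  }
}

const app = new App();
new RedshiftStack(app, 'RedshiftStack');
```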
You would expect to query the Redshift database and see the columns in the numeric order matching what is laid out above in the `tableColumns` array; however, you will get the columns back in a different, scrambled order. Feel free to edit the stack, remove the new columns, then re-add them and get a whole new order! I did exactly that, and on the second attempt got an entirely different column ordering.
Possible Solution
There are two ways to solve this as I see it.

1. Await each `executeStatement()` call sequentially instead of the `Promise.all()` performed here. So instead of something shaped like this (an illustrative sketch; `alterationStatements`, `executeStatement`, and `props` stand in for the handler's array, helper, and props):
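```ts
// All ALTER TABLE statements are started at once; whichever request
// reaches the Redshift Data API first is applied first.
await Promise.all(
  alterationStatements.map((statement) => executeStatement(statement, props)),
);
```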
it'd be more like:
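```ts
// Awaiting each statement in turn preserves the array order, at the
// cost of one round trip per statement.
for (const statement of alterationStatements) {
  await executeStatement(statement, props);
}
```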
This will take a while to execute and require many round trips to the API, but it would "work".
2. Use the `BatchExecuteStatement` endpoint. Currently the `Table` construct only leverages the `ExecuteStatement` endpoint on the redshift-data API. The shape of the output and response from the batch endpoint matches that of the regular execute endpoint, and the subsequent call made to `waitForStatementComplete` could remain unchanged as well. The shape of the `DescribeStatement` response changes slightly when one describes a batch execute statement, but not in a way that would break the API response to this function. The only difference would be the existence of the `SubStatements` attribute on a batch statement. In effect, you could just replace the use of the `executeStatement` endpoint entirely with the `batchExecuteStatement` one, change one of the input properties from `Sql: string` to `Sqls: string[]`, and voilà, done. Of course you would need to update everywhere that calls `executeStatement` to pass in an array of strings instead of a single string, but that would be pretty trivial to fix. `BatchExecuteStatement` supports sending one or more SQL statements, so it would be compatible with the rest of the single-statement redshift-data API calls as well. A minimal sketch of the swap is below.
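A minimal sketch of the swap, assuming the AWS SDK redshift-data client (the cluster wiring values are placeholders):

```ts
import {
  RedshiftDataClient,
  BatchExecuteStatementCommand,
  DescribeStatementCommand,
} from '@aws-sdk/client-redshift-data';

const client = new RedshiftDataClient({});

async function runBatch(sqls: string[]): Promise<void> {
  // One ordered, transactional batch instead of N parallel calls; the
  // only input change is Sqls: string[] in place of Sql: string.
  const { Id } = await client.send(new BatchExecuteStatementCommand({
    ClusterIdentifier: 'my-cluster',          // placeholder
    Database: 'default_db',                   // placeholder
    SecretArn: 'arn:aws:secretsmanager:...',  // placeholder
    Sqls: sqls,
  }));

  // DescribeStatement is unchanged; a batch response additionally carries
  // a SubStatements array with per-statement status.
  const description = await client.send(new DescribeStatementCommand({ Id: Id! }));
  console.log(description.Status, description.SubStatements?.length);
}
```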
Additional Information/Context

Happy to raise the pull request here for this one myself. Would love to know your thoughts on switching from the `ExecuteStatement` command to the `BatchExecuteStatement` command, as that seems like the best way forward here in my opinion.
Here is my `package.json` if that helps. I did have `eslint` set up, as you can see above, but it is not required for reproducing this bug.

CDK CLI Version
2.1.0 (build f4f18b1)
Framework Version
No response
Node.js Version
v16.15.0
OS
macOS 12.3.1 (21E258)
Language
Typescript
Language Version
TypeScript (3.9.10)
Other information
The root cause boils down to the well-known ordering behavior of `await Promise.all(array.map())` (there are many articles about this out there if you do some light googling). Here is a small Node.js program you can run to see this mess in action for yourself:
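(A minimal stand-in for the snippet, using random delays to play the part of the Redshift Data API; it prints completions out of array order.)

```ts
// Ten fake ALTER TABLE statements, in the order a developer would expect.
const statements = Array.from(
  { length: 10 },
  (_, i) => `ALTER TABLE t ADD COLUMN col_${i} int`,
);

// Simulate a network call with a random latency per statement.
function fakeExecuteStatement(sql: string): Promise<void> {
  return new Promise((resolve) =>
    setTimeout(() => {
      console.log(`completed: ${sql}`); // logs in completion order, not array order
      resolve();
    }, Math.random() * 100),
  );
}

async function main(): Promise<void> {
  // Promise.all preserves the order of its *results*, but the side effects
  // (like an ALTER TABLE landing on the cluster) happen in whatever order
  // the calls finish.
  await Promise.all(statements.map(fakeExecuteStatement));
}

main().catch(console.error);
```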