Open anthonysena opened 3 years ago
I've done some testing with the v1.2 RedShift driver and that seems to be a work-around for now while we see what is happening with the v2 driver. We'll need to discuss putting this into v2.10.1 potentially.
@anthonysena did you just roll back to the last version that was used, or to a different one?
I ultimately went back to the last driver that we had used - the changes are here for now: https://github.com/OHDSI/WebAPI/tree/issue-1939-redshift-driver-downgrade
@anthonysena @ssuvorov-fls worked on a solution that fixes the problem by going back to driver v1.0 but with IAM support. The functionality is available in the "issue-1939-redshift-driver-downgrade-sdk" branch. Redshift support was moved into a separate profile, "webapi-redshift", so that profile has to be specified in the initialize and install steps.
We run the following steps in our environment:
mvn initialize -Dredshift.classpath=/var/local/drivers/redshift-v1.2 -Pwebapi-redshift
mvn install -e -Dmaven.test.skip=true -Dredshift.classpath=/var/local/drivers/redshift-v1.2 -Pwebapi-postgresql,webapi-redshift
For the JDBC driver, we used the latest one officially available from AWS: https://s3.amazonaws.com/redshift-downloads/drivers/jdbc/1.2.55.1083/RedshiftJDBC42-1.2.55.1083.zip
Could you please check this branch in your environment?
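For anyone trying the branch with a driver unpacked somewhere else, the two commands above can be parameterized. This is only a sketch: the function name and the REDSHIFT_DRIVER_DIR variable are made up for illustration, and the script prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical variable: directory holding the unpacked Redshift JDBC driver.
# Defaults to the path used in the commands above.
REDSHIFT_DRIVER_DIR="${REDSHIFT_DRIVER_DIR:-/var/local/drivers/redshift-v1.2}"

# Print the two Maven commands from this comment for a given driver directory.
# Nothing is executed here; copy the output or pipe it to sh once it looks right.
webapi_redshift_build_cmds() {
    printf 'mvn initialize -Dredshift.classpath=%s -Pwebapi-redshift\n' "$1"
    printf 'mvn install -e -Dmaven.test.skip=true -Dredshift.classpath=%s -Pwebapi-postgresql,webapi-redshift\n' "$1"
}

webapi_redshift_build_cmds "$REDSHIFT_DRIVER_DIR"
```

Setting REDSHIFT_DRIVER_DIR before running the script swaps in a different driver location without editing either command.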
@konstjar @ssuvorov-fls thank you both for investigating this problem and proposing the solution in the issue-1939-redshift-driver-downgrade-sdk branch. I had a few questions upon my review:
In the issue-1939-redshift-driver-downgrade-sdk branch, the com.amazon.redshift dependency was removed from the pom.xml in favor of using the maven command to install the profile instead. It appears that the webapi-redshift profile is doing 2 things: specifying the redshift.classpath property and also specifying the redshift dependencies that are required for IAM support. Could we instead restore the com.amazon.redshift dependency so that the driver is automatically downloaded and if someone wants to use IAM, they could follow the maven initialize/install steps above?
@anthonysena The driver can't be downloaded automatically because it is not in a Maven repository: Maven Central only has versions starting from 2.0.0.3, so it must be installed first.
If we restore the com.amazon.redshift dependency and call mvn compile without setting the required profile, we'll get an error.
@ssuvorov-fls could we use one of the versions listed on https://mvnrepository.com/artifact/com.amazon.redshift/redshift-jdbc4-no-awssdk? The most recent is v1.2.43.1067 - not sure if that poses any problems but it is > v1.2.8.
Noting that the latest driver is found here: https://mvnrepository.com/artifact/com.amazon.redshift/redshift-jdbc42-no-awssdk?repo=mulesoft-public
@ssuvorov-fls - thanks for the update here. I tested using the 1.2.55.1083 version of the driver and it also exhibited the same problem as originally reported. I'd suggest that we move back to 1.2.10.1009 since this is > 1.2.8 and should allow for the IAM support required.
@konstjar let me know if there are any objections and I can file the PR based on https://github.com/OHDSI/WebAPI/commit/b1cf1d26598f6d586a3b3728339ffaa74d3535e8
Thanks @ssuvorov-fls - can you file a PR? I'll review & approve. Thanks!
A quick update on this issue: @chrisknoll and I worked together and found that the RedShift JDBC drivers <= 1.2.45.1069 work when generating 2 cohorts in parallel on the same data source. When we moved to v1.2.47.1071 (and later versions), we observed the error mentioned earlier in this issue.
Here is a link to this driver on Maven Central: https://mvnrepository.com/artifact/com.amazon.redshift/redshift-jdbc42-no-awssdk
I think this was resolved and could be closed?
I'm not sure: as I recall, when I tried this with Sena we went through about a dozen different versions of the JDBC driver, and we continued to experience the error. @anthonysena, am I recalling that correctly?
That's correct @chrisknoll - looking at the current pom.xml, we're still using v1.2.10.1009 which does not exhibit this problem. We could note this as a "known issue" and leave it open since we've yet to address the root cause.
A long time has passed since the last discussion. A few instances of the same issue have appeared again on Redshift, this time in pure R code using DatabaseConnector, which uses the v2.x JDBC driver.
For visibility and general discussion, I would like to ask @schuemie whether anyone else has reported the same "serializable isolation violation" problem and whether you can propose a solution. From our research, it's possible to alter the database in Redshift to enable snapshot isolation. I just wonder if it's the right way to go.
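For reference, the change being asked about would look roughly like the statement below. This is a sketch, not a recommendation: the database name is a placeholder, the helper function is hypothetical, and the script only prints the statement. Redshift places preconditions on changing a database's isolation level, so check the ALTER DATABASE documentation before applying it.

```shell
#!/bin/sh
# Hypothetical database name; replace with the Redshift database in question.
DB_NAME="${DB_NAME:-ohdsi}"

# Build the statement that switches a database from serializable
# isolation (the setting behind the error above) to snapshot isolation.
snapshot_isolation_sql() {
    printf 'ALTER DATABASE %s ISOLATION LEVEL SNAPSHOT;\n' "$1"
}

# Print only; run the statement through whichever SQL client is
# already used to administer the cluster.
snapshot_isolation_sql "$DB_NAME"
```

Whether this is the right fix is still the open question here, since it changes concurrency behavior for everything running against that database, not only cohort generation.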
Expected behavior
When generating > 1 cohort on a single data source, the cohort generation processes finish without error on RedShift.
Actual behavior
Generating > 1 cohort on RedShift using WebAPI v2.10.0 generates the following exception:
Steps to reproduce behavior
Generate > 1 cohort on RedShift using WebAPI v2.10.0.
Additional Notes
As part of the v2.10.0 release, we updated the RedShift JDBC drivers: https://github.com/OHDSI/WebAPI/pull/1925/files. As a test, we can try to roll back to using a 1.2.x build to see if that allows us to work-around this problem while we do a deeper dive.