WASdev / standards.jsr352.jbatch

Home of 'jbatch', a compatible implementation of the Jakarta Batch specification (and the former Reference Implementation for the JSR 352, Batch Applications for the Java Platform specification).

Set SCHEMA is not valid SQL in MySQL #11

Closed scottkurz closed 7 years ago

smillidge commented 9 years ago

The code below from JDBCPersistenceManagerImpl throws the exception below on MySQL:

Caused by: com.ibm.jbatch.container.exception.PersistenceException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SCHEMA 'batch'' at line 1

at com.ibm.jbatch.container.services.impl.JDBCPersistenceManagerImpl.createJobInstance(JDBCPersistenceManagerImpl.java:1714)

private void setSchemaOnConnection(Connection connection) throws SQLException {
    logger.finest("Entering " + CLASSNAME + ".setSchemaOnConnection()");
    PreparedStatement ps = null;
    ps = connection.prepareStatement("SET SCHEMA ?");
    ps.setString(1, schema);
    ps.executeUpdate();
    ps.close();
    logger.finest("Exiting " + CLASSNAME + ".setSchemaOnConnection()");
}

scottkurz commented 9 years ago

Yes, we didn't test against MySQL. Would be happy to accept a fix, even if it opens the door to the need for a more general DB-specific approach to organizing the SQL.

tom-haines commented 9 years ago

There is a discussion of this issue in Bug 5315 - Broken PostgreSQL Support in SE mode: https://java.net/bugzilla/show_bug.cgi?id=5315

I think the option Scott mentioned previously to add SET_DEFAULT_SCHEMA=TRUE/FALSE as an option in batch-config.properties should solve the MySQL issue.

Am happy to put forward a fix for review if SET_DEFAULT_SCHEMA boolean option is considered acceptable for the SE RI.

(Not required for MySQL support): As a further addition, to address mbogner2's concerns, the following batch-config.properties option would allow for an application to set a custom schema where the jdbc driver does not allow it to be set via the connection url: DB_INIT_STATEMENT=
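A minimal sketch of how the proposed toggle might be read — the SET_DEFAULT_SCHEMA property name follows the suggestion above, while the helper class and the default value are illustrative, not jbatch code:

```java
import java.util.Properties;

// Hypothetical helper; only the SET_DEFAULT_SCHEMA property name comes from
// this thread, everything else is a sketch.
class BatchConfigSketch {
    // Defaults to "true" so the existing Derby-style SET SCHEMA behavior
    // would be unchanged unless a user opts out for MySQL/PostgreSQL.
    static boolean isSetDefaultSchema(Properties batchConfig) {
        return Boolean.parseBoolean(
            batchConfig.getProperty("SET_DEFAULT_SCHEMA", "true"));
    }
}
```

With this shape, the persistence manager would simply skip its SET SCHEMA statement when the property parses to false.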


smillidge commented 9 years ago

Rather than trying to catch specific database quirks with properties and if statements, it may be easier to make JDBCPersistenceManagerImpl subclassable by, say, a MySQLJDBCPersistenceManagerImpl; that way the setSchemaOnConnection method could be made a no-op.

Currently the method is private, but if it were protected it would become more extensible.
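The subclassing idea can be sketched as follows. The base class here is a simplified stand-in for JDBCPersistenceManagerImpl (the real class would first need setSchemaOnConnection widened from private to protected), and the subclass name follows the suggestion above:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Simplified stand-in for JDBCPersistenceManagerImpl, purely to illustrate
// the extension point; not the real jbatch class.
class BasePersistenceManager {
    protected String schema = "JBATCH";

    protected void setSchemaOnConnection(Connection connection) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement("SET SCHEMA ?")) {
            ps.setString(1, schema);
            ps.executeUpdate();
        }
    }
}

// Hypothetical MySQL-specific subclass per the suggestion above.
class MySQLJDBCPersistenceManagerImpl extends BasePersistenceManager {
    @Override
    protected void setSchemaOnConnection(Connection connection) {
        // No-op: MySQL has no SET SCHEMA statement; the database is selected
        // via the JDBC connection URL instead.
    }
}
```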

tom-haines commented 9 years ago

The SET_DEFAULT_SCHEMA option would solve the issues with MySQL and PostgreSQL (and any others where the connection URL or a different keyword is used), without requiring subclasses for each engine. I suppose it depends on the likelihood of other DB-engine specific quirks becoming relevant. Given that the table structure is simple and there are no triggers or procedures in the DDL, it seems likely that only SQL-92 compliance is needed, and therefore only the JDBC driver and connection properties should be required?

smillidge commented 9 years ago

There are a few other related SQL incompatibilities currently reported on GlassFish JIRA, e.g. for Oracle's SQL quirks, which I'm currently investigating fixing: https://java.net/jira/browse/GLASSFISH-20886 https://java.net/jira/browse/GLASSFISH-21022

Putting the infrastructure in place so we could override in the worst case would be a great enhancement and reduce the impact on the RI.

scottkurz commented 9 years ago

Sorry I thought I had a notification set up and missed this discussion. Will have to pick this up again next week.

scottkurz commented 9 years ago

Thanks for your interest on these issues.

Some thoughts skimming over the various issues:

  1. As far as the SET SCHEMA issue, I'd like to avoid a separate config and simply use the existing 'schema' config property across all DBs by changing all table references to qualified references (rather than relying on the SET SCHEMA having been done).

    This sounds like mbogner2's suggestion in a comment in the Bug 5315 mentioned by sans-gravitas.

  2. Removing the AS keyword GLASSFISH-21022 is certainly easy. Was planning on doing it the next time I hit that part anyway or someone could send me that.
  3. Not 100% sure I caught every issue in GLASSFISH-20886. But as far as the issue accessing the generated key.. I'm aware of this issue but not an expert.

    Say we were to change from this, which doesn't work w/ Oracle apparently:

    < statement = conn.prepareStatement("INSERT INTO jobinstancedata (name, apptag) VALUES(?, ?)", Statement.RETURN_GENERATED_KEYS);

    to this:

    > statement = conn.prepareStatement("INSERT INTO jobinstancedata (name, apptag) VALUES(?, ?)", new String[] {"jobinstanceid"});

    Might this be good enough? Does anyone know when this breaks down?

  4. We could experiment with strategies for adding the DB-specific elements. I just don't want anyone to spend too much effort on too good a solution yet. I'd merge something if it fixed an existing problem and kept things moving forward.

    But we still could end up going later in a completely different direction (like, say, using JPA as the Apache BatchEE fork of the RI has apparently done to some degree... haven't had the chance to evaluate that yet). (Note there actually are triggers in the Oracle DDLs at this point). Didn't want anyone to feel bad if we left any intermediate solution behind later.
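Filling out item 3. above, a self-contained sketch of the column-name form of prepareStatement: the table and key column names come from the thread, while the surrounding method is illustrative rather than the actual jbatch code:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative wrapper; INSERT_SQL and the column names are from the thread,
// the method itself is a sketch.
class JobInstanceInsertSketch {
    static final String INSERT_SQL =
        "INSERT INTO jobinstancedata (name, apptag) VALUES(?, ?)";

    // Passing the generated-key column names explicitly is the form Oracle's
    // driver supports for a later getGeneratedKeys() call, unlike
    // Statement.RETURN_GENERATED_KEYS.
    static long insert(Connection conn, String name, String appTag) throws SQLException {
        try (PreparedStatement statement =
                 conn.prepareStatement(INSERT_SQL, new String[] {"jobinstanceid"})) {
            statement.setString(1, name);
            statement.setString(2, appTag);
            statement.executeUpdate();
            try (ResultSet keys = statement.getGeneratedKeys()) {
                keys.next();
                return keys.getLong(1);
            }
        }
    }
}
```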

Finally, please also remember that IBM is requiring a CLA before reviewing pull requests.

scottkurz commented 9 years ago

Made an attempt at addressing 2. and 3. for Glassfish w/ Oracle from my last comment in commit 52a51e0

Didn't get to my suggestion for 1. but did a quick bypass of SET SCHEMA based on Oracle conn metadata. Not sure that's worth building on.

scottkurz commented 9 years ago

Hadn't meant to close the issue.

tom-haines commented 9 years ago

Adding MySQL as an additional exclusion to the bypass approach would solve the MySQLSyntaxErrorException.
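A sketch of what extending the bypass might look like. In the real code the product name would come from connection.getMetaData().getDatabaseProductName(); the helper class and its matching logic here are illustrative only:

```java
// Hypothetical helper extending the Oracle metadata bypass from commit
// 52a51e0 to cover MySQL as well; not actual jbatch code.
class SetSchemaBypassSketch {
    // MySQL, like Oracle, has no SET SCHEMA statement, so it joins the
    // exclusion list; Derby and others keep the existing behavior.
    static boolean shouldSkipSetSchema(String databaseProductName) {
        String name = databaseProductName == null ? "" : databaseProductName.toLowerCase();
        return name.contains("oracle") || name.contains("mysql");
    }
}
```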

smillidge commented 9 years ago

I agree MySQL fails on SET SCHEMA.

BTW @scottkurz I can get the Payara team to sign the Contributor Agreement so we can help do some of the work. Where do we send the scanned agreements?

scottkurz commented 9 years ago

You can simply email the CLA to me at skurz@us.ibm.com. Thanks

scottkurz commented 9 years ago

@smillidge, I know we have a more complete extension in the works in pull request #17. As @tom-haines suggested, though, we could just use the same approach as for Oracle DB in the shorter term. Are you interested in contributing that?

smillidge commented 9 years ago

At the moment we are going for the approach of implementing the Persistence service in Payara and using the property overrides to get it in there. We currently have basic MySQL, DB2, Derby, Oracle and Postgres implementations written, although they require some significant refactoring. I'd rather go for subclassing in the longer run as it is way cleaner. I think trying to work around db idiosyncrasies with if statements is going to produce a whole world of pain.

I agree the persistence service is not ready for prime time as a good generic interface though. I'd be happy to work with you on cleaning that up then contribute all the DB services we have.

scottkurz commented 9 years ago

I took another look at this just now and I have a good idea of the refactoring I'd like to make to expose a more minimal set of interfaces to extensions. Should have this out in the next couple of weeks.

smillidge commented 9 years ago

OK thanks Scott.

Like I say, we have a bunch of completed persistence services including MySQL, Postgres, DB2, and Oracle that we can contribute back once we refactor to match whatever persistence service interface you devise. Ideally I'd like an interface that is not tied to SQL, as file based, in-memory and JSR107 JCache based stores would be interesting to build among others.

Do you know when 1.0.1 will be pushed to maven central?

scottkurz commented 9 years ago

Steve,

I don't know when 1.0.1 will be released... I'm not too clear on the exact details of the integration testing, so for this round I'm still waiting to hear confirmation it's cleared.

As far as how the refactoring is going....

First, I'm taking it for granted that we remain backwards compatible with existing table entries. This interface is really all over the place, so we'd probably go further if this were not a concern.

So I'm continuing to use the existing objects serialized into the current tables: (CheckpointData, JobStatus, StepStatus) while working to extract out a different interface that doesn't depend on these particular implementations.

One piece I'd forgotten about upfront is the need to describe in Javadoc the "internal" ids used for the partition and split-flow cases; these ids carry with them relations to other entries which unfortunately are not captured as relations in the current table design.

Since you said you already have impls... I'm not going to hold this up just until I get around to writing the Javadoc. But I at least want to do a bit of thinking through as I do this exercise.

Finally, there's just some real simple consistency/cleanup to do: consistent param names, and just putting the methods in some kind of order!

Let me ask: Except for the java.sql.Timestamp, which I guess we should switch to java.util.Date, I don't see the existing interface as tied to SQL... was there anything else you noticed you were concerned about?

Thanks

smillidge commented 9 years ago

Hi Scott,

Keeping the existing table structure and data compatibility is a good idea for 1.0.1 as there will be existing data out there.

Not worried about anything else just the removal of dependencies on java.sql.* like the one you mention above.

Steve

scottkurz commented 9 years ago

Ok, I have something to share.. on my personal fork: git pull https://github.com/scottkurz/standards.jsr352.jbatch.git Issue11-prototype

This removes what I consider "private" classes from the persistence service interface. It adds one package, com.ibm.jbatch.container.services, to the bundle's Export-Package list.

This maybe isn't complete... not everything in com.ibm.jbatch.container.services right now is really pluggable. But all the new classes/interfaces are: CheckpointDataKey, CheckpointDataPair, IJobStatus, IStepStatus.

On the other hand, things like StepContextImpl, RuntimeJobExecution are gone from the persistence interface. Also gone from the interface (and not exported) are all classes that were directly serialized (and bytes written to the DB): eg CheckpointData, JobStatus, StepStatus.

I didn't want to have to support any other users of these classes. If the deserialization logic ever needs to change, that's going to be complicated enough to deal with. Unfortunately this might make you copy/paste very similar impls (of, say, JobStatus). I'm open to suggestions, but I just don't want to expose such a class directly.

All in all, this leaves us with a lot to be desired in terms of consistency of this interface... the basic CRUD methods don't follow a very consistent pattern from "table" to "table". But at least I think I've created a functioning interface that could be extended.

I also changed from java.sql.Timestamp to java.util.Date as you requested. It seemed there might be some logic to saying "keep the interface in terms of Timestamp, and even if we only pass around Date(s) now that would still leave the door open to passing Timestamp (of higher precision) later".

However, the spec APIs reference Date(s); we're not likely to do such work any time soon, and Steve's got the implementations ready and has requested Date types (rather than having to code up for a non-existent nanosecond precision). So I like going with Date.

Please let me know what you think.

I would like to do some more testing on back compatibility with existing table entries, some Javadoc, and some other cleanup as well.

smillidge commented 9 years ago

Hi Scott, thanks for that I'll take a look this weekend.

smillidge commented 9 years ago

Hi Scott,

It is certainly looking a lot cleaner as an interface. I haven't had a chance to build anything on the back of the interface but I can certainly start doing that over the next week or so to see how it feels. Removing your Data Objects is nice even if people will have to invent their own. They seemed a little tied to your tables.

I'll see if I can build a non-sql persistent store class that implements the interface over the next couple of weeks and let you know.

Steve

smillidge commented 9 years ago

I suggest;

public long createJobExecution(JobInstance jobInstance, Properties jobParameters, BatchStatus batchStatus, Date createTime);
public long createStepExecution(long jobExecId, StepContext stepContext)

Should return JobExecution and StepExecution respectively

scottkurz commented 9 years ago

What do you see as the advantage for returning JobExecution and StepExecution respectively?

I'd been thinking these methods were kind of odd in that, at face value, they leave open creation in any state at all, rather than just STARTING. I don't think the rest of jbatch exercises this flexibility, so I thought we could potentially close that down (assuming reviewing the code shows I'm correct in this understanding). Same with the metrics... why not just say they're initialized to '0'? (Well, that is, if we were saying anything... in fact the methods are still not actually documented.)

But any way..curious to hear why you suggested that.

smillidge commented 9 years ago

No reason other than the other create methods return objects rather than IDs. I personally prefer similar semantics for similar methods when designing apis. I haven't looked closely though.
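The suggested symmetry can be sketched as below. The stand-in interfaces substitute for the javax.batch.runtime types (and BatchStatus is replaced by String) purely to keep the sketch dependency-free; the method shape follows smillidge's suggestion:

```java
import java.util.Date;
import java.util.Properties;

// Minimal stand-ins for the javax.batch.runtime types; in jbatch these
// would be the real spec interfaces.
interface JobInstanceStandIn { long getInstanceId(); }
interface JobExecutionStandIn { long getExecutionId(); }

// The suggested shape: create methods hand back the created object, matching
// the other create methods, rather than a bare long id.
interface PersistenceServiceSketch {
    JobExecutionStandIn createJobExecution(JobInstanceStandIn jobInstance,
                                           Properties jobParameters,
                                           String batchStatus,
                                           Date createTime);
}
```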

scottkurz commented 9 years ago

I know I've been silent.. I was waiting to hear that the Glassfish integration test had completed (which would allow me to commit 1.0.1 and move on), but am still waiting for confirmation.

Not that that completely prevents us discussing this.. just pointing out that I've put this on hold waiting to be done with 1.0.1

Will post an update when I hear one.

smillidge commented 9 years ago

No problem. 1.0.1 is successfully integrated with Payara which is derived from GlassFish.

scottkurz commented 7 years ago

I realize it has been a long time since this issue was opened. Thank you for the original offers to fix this and for the discussion about enhancing the internal persistence interface (which I know we never completed). The narrow fix has now been made by pull request #52, so I'm going to close this. Admittedly, if we had known this was all we were going to do, we could have done it a while back, so apologies for that.