datanucleus / datanucleus-rdbms

DataNucleus support for persistence to RDBMS Datastores
29 stars 66 forks source link

[MYSQL] Column name mismatch when checking for primary key specifications #496

Closed Aggarwal-Raghav closed 2 weeks ago

Aggarwal-Raghav commented 3 weeks ago

During Apache Hive datanucleus version upgrade to 6.x (for java 17 compatible), I observed this issue. I have created a sample project for reproducing this issue with datanucleus 6.x and have attached the screenshots for debugging (possible problematic code) and stacktrace.

Here is the project URL: https://github.com/Aggarwal-Raghav/DatanucleusIssue496

The same is working for datanucleus 5.x verison which is currenly used by Apache Hive. Have tested in branch: workingCase

Commit causing behaviour change between datanucleus 5.x and 6.x: https://github.com/datanucleus/datanucleus-rdbms/commit/b99df3c803862f9e9feacc7891461bcb21160b38

For reference attaching the stacktrace:

NestedThrowablesStackTrace:
Table "`PERSON`" has been specified with a primary-key to include column "`PRODUCT_NAME`" but this column is not found in the table. Please check your <primary-key> column specification.
org.datanucleus.exceptions.NucleusUserException: Table "`PERSON`" has been specified with a primary-key to include column "`PRODUCT_NAME`" but this column is not found in the table. Please check your <primary-key> column specification.
    at org.datanucleus.store.rdbms.table.ElementContainerTable.applyUserPrimaryKeySpecification(ElementContainerTable.java:224)
    at org.datanucleus.store.rdbms.table.CollectionTable.initialize(CollectionTable.java:236)
    at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.initializeClassTables(RDBMSStoreManager.java:3449)
    at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:3072)
    at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:118)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1666)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:670)
    at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.getDatastoreClass(RDBMSPersistenceHandler.java:110)
    at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:196)
    at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:4614)
    at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:4591)
    at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2076)
    at org.datanucleus.ExecutionContext.persistObjectInternal(ExecutionContext.java:320)
    at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1924)
    at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1785)
    at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:697)
    at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:722)
    at com.github.raghav.Main.main(Main.java:24)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:569)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:290)
    at java.base/java.lang.Thread.run(Thread.java:840)
Aggarwal-Raghav commented 3 weeks ago

@andyjefferson, can you please provide your insights on this?

andyjefferson commented 3 weeks ago

Can't reproduce any problem. I put your "test" into the standard template format (as the docs request), and put it here https://github.com/andyjefferson/rdbms-496/tree/master

I also removed the makePersistent of the Person objects (which are embedded-only, so should not be persisted in their own right anyway). I saw no issue with any schema handling. Using Linux, and MariaDB. All is saved in UPPERCASE

CREATE TABLE PRODUCT ( PRODUCT_ID BIGINT NOT NULL AUTO_INCREMENT, PRODUCT_DESCRIPTION VARCHAR(767) BINARY NULL, PRODUCT_NAME VARCHAR(128) BINARY NULL, CONSTRAINT PRODUCT_PK PRIMARY KEY (PRODUCT_ID) ) ENGINE=INNODB

If using Windows, as the docs have always said, you may need to set "datanucleus.identifier.case" as "lowercase". Suggest that you revise what this test shows to match reasonable behaviour. If you want the Person objects persisted into a Person table then remove embedded-only.

Aggarwal-Raghav commented 3 weeks ago

@andyjefferson , Thanks for the quick reply and converting project to test template. The same project https://github.com/andyjefferson/rdbms-496/tree/master when I try with mysql (https://github.com/andyjefferson/rdbms-496/pull/1), it is throwing the below error

Screenshot 2024-08-20 at 9 47 21 PM
andyjefferson commented 3 weeks ago

If some column is "not found in the table" then define what IS found in the table. Only you can see your database and what is created (start with the SQL statement to create the table). Given that those col names have quotes then they need to be UPPER CASE. If you haven't defined them in UPPER CASE (or set that persistence property I mentioned) then change your settings.

Sadly I only have MariaDB, so cannot use MySQL.

This is the log of the schema generation I get with MariaDB

17:43:21,655 (main) DEBUG [DataNucleus.Datastore.Schema] - CREATE TABLE PRODUCT
(
    PRODUCT_ID BIGINT NOT NULL AUTO_INCREMENT,
    PRODUCT_DESCRIPTION VARCHAR(767) BINARY NULL,
    PRODUCT_NAME VARCHAR(128) BINARY NULL,
    CONSTRAINT PRODUCT_PK PRIMARY KEY (PRODUCT_ID)
) ENGINE=INNODB
17:43:21,658 (main) DEBUG [DataNucleus.Datastore.Schema] - SQL Execution Time = 3 ms
17:43:21,661 (main) WARN  [DataNucleus.Datastore] - SQL Warning : Name 'PRODUCT_PK' ignored for PRIMARY key.
17:43:21,662 (main) DEBUG [DataNucleus.Datastore.Schema] - Loading column info for table(s) "PRODUCT" in Catalog "", Schema ""
17:43:21,664 (main) DEBUG [DataNucleus.Datastore.Schema] - Column info loaded for Catalog "", Schema "", 1 tables, time = 1 ms
17:43:21,664 (main) DEBUG [DataNucleus.Datastore.Schema] - Column info retrieved for table "PRODUCT" : 3 columns found
17:43:21,665 (main) DEBUG [DataNucleus.Datastore.Schema] - Check of existence of PRODUCT_PERSON returned no table
17:43:21,665 (main) DEBUG [DataNucleus.Datastore.Schema] - CREATE TABLE PRODUCT_PERSON
(
    PRODUCT_NAME BIGINT NOT NULL,
    PERSON_AGE BIGINT NOT NULL,
    CITY VARCHAR(256) BINARY NULL,
    PERSON_NAME VARCHAR(128) BINARY NOT NULL,
    CONSTRAINT PRODUCT_SUPPLIER_PK PRIMARY KEY (PRODUCT_NAME,PERSON_NAME)
) ENGINE=INNODB
17:43:21,668 (main) DEBUG [DataNucleus.Datastore.Schema] - SQL Execution Time = 3 ms
17:43:21,668 (main) WARN  [DataNucleus.Datastore] - SQL Warning : Name 'PRODUCT_SUPPLIER_PK' ignored for PRIMARY key.
17:43:21,668 (main) DEBUG [DataNucleus.Datastore.Schema] - Loading column info for table(s) "PRODUCT_PERSON" in Catalog "", Schema ""
17:43:21,669 (main) DEBUG [DataNucleus.Datastore.Schema] - Column info loaded for Catalog "", Schema "", 1 tables, time = 1 ms
17:43:21,669 (main) DEBUG [DataNucleus.Datastore.Schema] - Column info retrieved for table "PRODUCT_PERSON" : 4 columns found
17:43:21,670 (main) DEBUG [DataNucleus.Datastore.Schema] - CREATE UNIQUE INDEX UNIQUETYPE ON PRODUCT (PRODUCT_NAME)
17:43:21,675 (main) DEBUG [DataNucleus.Datastore.Schema] - SQL Execution Time = 5 ms
17:43:21,677 (main) DEBUG [DataNucleus.Datastore.Schema] - CREATE INDEX PRODUCT_PERSON_N49 ON PRODUCT_PERSON (PRODUCT_NAME)
17:43:21,683 (main) DEBUG [DataNucleus.Datastore.Schema] - SQL Execution Time = 6 ms
17:43:21,685 (main) DEBUG [DataNucleus.Datastore.Schema] - ALTER TABLE PRODUCT_PERSON ADD CONSTRAINT PRODUCT_PERSON_FK1 FOREIGN KEY (PRODUCT_NAME) REFERENCES PRODUCT (PRODUCT_ID) 
17:43:21,693 (main) DEBUG [DataNucleus.Datastore.Schema] - SQL Execution Time = 8 ms
Aggarwal-Raghav commented 3 weeks ago

My bad, made a mistake in package.jdo file. Will correct it and revert back to you.

Aggarwal-Raghav commented 2 weeks ago

As per our discussion in the mail, adding the details here for future context:

I have written a new test case similar to hive package.jdo. I have tested this testcase with 5.x (Java8) and it is passing (Attaching the datanucleus logs and mysql tables screenshot) but the same testcase with datanucleus 6.x is failing. Attaching the failure logs for the same.

Project URL: https://github.com/Aggarwal-Raghav/rdbms-496 branch master: For java 17, datanucleus 6.x branch 5.x: For java8, datanucleus 5.x

When I debugged the failure case I found that the string comparison in applyUserPrimaryKeySpecification is causing this issue because of backticks present in column name. Attaching the debug screenshot also.

datnucleus_5.x_success.log datnucleus_6.x_failure.log hive_err.txt

Mysql_Datanucleus_5 x DEBUG
Aggarwal-Raghav commented 2 weeks ago

Thanks @andyjefferson , for the quick fix. For me the fix is working. The UT also passed and Apache Hive HMS is also starting. I will check with Hive community also to verify the fix.

Is there any plan to release 6.0.8 anytime soon?

andyjefferson commented 2 weeks ago

Marking as fixed. Aim to release 6.0.8 in the next few days

andyjefferson commented 2 weeks ago

FYI, 6.0.8 released

Aggarwal-Raghav commented 2 weeks ago

Thanks @andyjefferson