datanucleus / datanucleus-rdbms

DataNucleus support for persistence to RDBMS Datastores
30 stars 66 forks source link

Provide mechanism for 1-1 owned / N-1 FK relation to be marked to fetch the "PK" only and not instantiate (when not in FetchPlan) #370

Closed andyjefferson closed 2 years ago

andyjefferson commented 3 years ago

Some people who used the likes of Kodo want a query / getObjectById to pull in "PK" value(s) of related 1-1 (owned) or N-1 FK object since the "FK" is at the owners side of the relation. Importantly this is to NOT instantiate the related object, just save its "PK" value(s) for later in case it is ever needed.

ExecutionContext has a field stateManagerAssociatedValuesMap that can be used to store such things, along with associated methods. The first thing for a developer to decide is how to use the FetchGroup to define when a field should be not in the FetchGroup yet should have its value(s) fetched like this.

Low priority for general development since it is minority use-case and not been "needed" in all the time this project has existed. Clearly somebody could pick this task up and develop it.

andyjefferson commented 3 years ago

See https://github.com/datanucleus/datanucleus-core/issues/29 and https://issues.apache.org/jira/projects/JDO/issues/JDO-663?filter=allopenissues

andyjefferson commented 3 years ago

The "mechanism" could be that any such field has to be put in the FetchGroup with recursion-depth as 0 to mark the field as to be fetched but not instantiated. That FK would then need storing in the StateManager, and then on any access of that field it checks for the FK being stored and simply instantiates it.

andyjefferson commented 3 years ago

See initial TODOs at https://github.com/datanucleus/datanucleus-rdbms/blob/master/src/main/java/org/datanucleus/store/rdbms/request/FetchRequest.java#L468 https://github.com/datanucleus/datanucleus-rdbms/blob/master/src/main/java/org/datanucleus/store/rdbms/query/PersistentClassROF.java#L478 where we would need to detect which fields only need value loading but not instantiating and store in DNStateManager. Note that FetchRequest currently has no concept of fields to fetch as normal and fields to fetch as FK, so the input (from StateManager) would need updating back at that point to also pass in FetchPlanForClass which contains the info about which are fields just for their FK.

andyjefferson commented 2 years ago

The attached tests demonstrate usage. rdbms-370.zip

In short, define a FetchGroup with whichever member is to be "stored" as having "recursion-depth" set to 0. This will only select the FK column(s) and then use it if that field is ever needed to be loaded.

chrisco484 commented 2 years ago

Normally the "default" fetch strategy will not result in related objects being fetched i.e. an explicit FetchGroup needs to be set up to force a related object to be loaded when the primary object is loaded. In keeping with that philosophy is there a case for considering if the fetch behaviour associated with this new change should be the default behaviour when no FetchGroup / recursion depth has been set?

andyjefferson commented 2 years ago

No plan on changing the DFG definition. It's a new feature that people haven't tested ...

chrisco484 commented 2 years ago

Maybe a DB configuration option to turn it on universally but the option itself defaults to false?

andyjefferson commented 2 years ago

Oooh a volunteer ...

andyjefferson commented 2 years ago

Sample code

@PersistenceCapable(detachable="true")
@FetchGroup(name="fk-load", members={@Persistent(name="name"), @Persistent(name="details", recursionDepth=0)})
public class Person
{
    @PrimaryKey
    Long id;

    String name;
    int age;
    PersonalDetails details;
}
@PersistenceCapable(detachable="true")
public class PersonalDetails
{
    @PrimaryKey
    Long id;

    String nhsNumber;
}

when doing a FETCH

pm.getFetchPlan().addGroup("fk-load");
Person p1 = pm.getObjectById(Person.class, 1);

used to do

Object with id "mydomain.model.Person:1" not found in Level 1 cache
Object with id "mydomain.model.Person:1" not found in Level 2 cache
Object with id "mydomain.model.Person:1" added to Level 1 cache (loadedFlags="[NNYN]")

SELECT A0.AGE,B0.ID,B0.NHSNUMBER,A0."NAME" FROM PERSON A0 LEFT OUTER JOIN PERSONALDETAILS B0 ON A0.DETAILS_ID_OID = B0.ID WHERE A0.ID = ?

Object with id "mydomain.model.PersonalDetails:201" not found in Level 1 cache
Object with id "mydomain.model.PersonalDetails:201" not found in Level 2 cache
Object with id "mydomain.model.PersonalDetails:201" added to Level 1 cache (loadedFlags="[YN]")
Object with id "mydomain.model.PersonalDetails:201" added to Level 2 cache (fields="[0, 1]", version="")
Object with id "mydomain.model.Person:1" added to Level 2 cache (fields="[0, 1, 2, 3]", version="")

and now does

Object with id "mydomain.model.Person:1" not found in Level 1 cache
Object with id "mydomain.model.Person:1" not found in Level 2 cache
Object with id "mydomain.model.Person:1" added to Level 1 cache (loadedFlags="[NNYN]")

SELECT A0.AGE,A0.DETAILS_ID_OID,A0."NAME" FROM PERSON A0 WHERE A0.ID = ?

Object with id "mydomain.model.Person:1" added to Level 2 cache (fields="[0, 2, 3]", version="")

With a QUERY

Query q = pm.newQuery("SELECT FROM " + Person.class.getName());
q.getFetchPlan().addGroup("fk-load");
List<Person> people = q.executeList();

it used to do

SELECT 'mydomain.model.Person' AS DN_TYPE,A0.AGE,B0.ID,B0.NHSNUMBER,A0.ID,A0."NAME" FROM PERSON A0 LEFT OUTER JOIN PERSONALDETAILS B0 ON A0.DETAILS_ID_OID = B0.ID

Object with id "mydomain.model.Person:3" not found in Level 1 cache
Object with id "mydomain.model.Person:3" not found in Level 2 cache
Object with id "mydomain.model.Person:3" added to Level 1 cache (loadedFlags="[NNYN]")
Object with id "mydomain.model.PersonalDetails:202" not found in Level 1 cache
Object with id "mydomain.model.PersonalDetails:202" not found in Level 2 cache
Object with id "mydomain.model.PersonalDetails:202" added to Level 1 cache (loadedFlags="[YN]")
Object with id "mydomain.model.PersonalDetails:202" added to Level 2 cache (fields="[0, 1]", version="")
Object with id "mydomain.model.Person:3" added to Level 2 cache (fields="[0, 1, 2, 3]", version="")

and now does

SELECT 'mydomain.model.Person' AS DN_TYPE,A0.AGE,A0.DETAILS_ID_OID,A0.ID,A0."NAME" FROM PERSON A0

Object with id "mydomain.model.Person:3" not found in Level 1 cache
Object with id "mydomain.model.Person:3" not found in Level 2 cache
Object with id "mydomain.model.Person:3" added to Level 1 cache (loadedFlags="[NNYN]")
Object with id "mydomain.model.Person:3" added to Level 2 cache (fields="[0, 2, 3]", version="")