A bunch'a questions - Githubissues

Reaching out to the rockstars of Neo4j, for your wisdom, and expertise!

Is it possible to pass a transaction from one function to another, or one class to another?
Is it possible to commit part of a transaction, do some more stuff, and then commit the rest?
Is it possible to eager load members rather than lazy load them?
Is it possible to patch update an attribute of a node rather than do a full update of the node?
What kinds of event handlers are available?
Maybe using event handlers, is there a good pattern for building an in-memory self-updating copy of the database (or parts of it), so I don't have to hit the database on every read?

Is it possible to pass a transaction from one function to another, or one class to another? Transactions are passed down the call tree. This means that functions you call will use the callers its transaction automatically. If you want to have a reference to the transaction you can use "Transaction.Current". But still you cannot use the transaction outside of it's scope. The scope being inside the using block and any calls made to sub-routines inside the using block.
Is it possible to commit part of a transaction, do some more stuff, and then commit the rest? Yes and no, Neo4j does not have support for nested transactions the same way how for example SQL Server has them. But you can do something to simulate it.
- If you want the already committed part to retain even if the full transaction gets rolled back, you can simply make two transactions in a row. Commit the first transaction then create a new transaction for the rest of the changes.
- If you want the already committed part to rollback if the full transaction gets rolled back, you can simply call "Transaction.Flush" to let Blueprint41 flush the in memory captured changes to the transaction. Then later you can commit the transaction when the rest of the changes are done. The "flushed" changes will be rolled back in case the whole transaction is rolled back.
Is it possible to eager load members rather than lazy load them? Yes, you can set a transaction to eager load when accessing data in a loop/foreach. You can select the mode when starting the transaction with:
```
using (Transaction.Begin(OptimizeFor.PartialSubGraphAccess)) // This is the default mode if OptimizeFor is omitted
{
    foreach (Department d in Company.All)
        Console.WriteLine($"Company {d.Company.Name}, Department {d.Name}");
}
```
This will execute 1 query to load all companies and 1 query per company to load the departments of that company. Total queries in this scenario is 1 + company count. If you want to load all companies, but only the departments of a small amount of companies, this OptimizeFor mode works best.
```
using (Transaction.Begin(OptimizeFor.RecursiveSubGraphAccess))
{
    foreach (Department d in Company.All)
        Console.WriteLine($"Company {d.Company.Name}, Department {d.Name}");
}
```
This will execute 1 query to load all companies and 1 query to load the departments of all companies. Total queries in this scenario is 2. If you want to load all companies, and the departments of most companies, this OptimizeFor mode works best.

If you want to mix and match the mode within the same transaction, you can set it using:
```
Transaction.Current.Mode = OptimizeFor.Xxx;
```
If you need to load data very optimized for a piece of code, you can always use type-safe queries to load it exactly the way you want.
Is it possible to patch update an attribute of a node rather than do a full update of the node? Only if you use type-safe queries with the Update clause. Be aware that changes might not be visible using the "manipulation-api" till after you flush the transaction.
What kinds of event handlers are available?
- Entity.Events.OnNew -> when the C# object is created.
- Property.Events.OnChange -> when a value is assigned to a property.
- Entity.Events.OnSave -> when an entity is about to be saved.
- Entity.Events.OnAfterSave -> after an entity was saved and the key value (Guid/FunctionalId) was retrieved from the database.
- Entity.Events.OnDelete -> when an entity is about to be deleted.
- Entity.Events.OnNodeLoading -> before the cypher to load a node will be send to the driver.
- Entity.Events.OnNodeLoaded -> when the result of the load cypher was received from the driver.
- Entity.Events.OnBatchFinished -> when the result of loading multiple nodes or deleting multiple nodes was received from the driver. (when loading or deleting a collection)
- Entity.Events.OnNodeCreate -> before the cypher to create a node will be send to the driver.
- Entity.Events.OnNodeCreated -> when the result of the created cypher was received from the driver.
- Entity.Events.OnNodeUpdate -> before the cypher to update a node will be send to the driver.
- Entity.Events.OnNodeUpdated -> when the result of the update cypher was received from the driver.
- Entity.Events.OnNodeDelete -> before the cypher to delete a node will be send to the driver.
- Entity.Events.OnNodeDeleted -> when the result of the delete cypher was received from the driver.
- Relationship.Events.OnRelationCreate -> before the cypher to create a relationship will be send to the driver.
- Relationship.Events.OnRelationCreated -> when the result of the created cypher was received from the driver.
- Relationship.Events.OnRelationDelete -> before the cypher to delete a relationship will be send to the driver.
- Relationship.Events.OnRelationDeleted -> when the result of the delete cypher was received from the driver.
- Transaction.OnBegin -> when a transaction is started.
- Transaction.OnCommit -> when a transaction is committed.
Maybe using event handlers, is there a good pattern for building an in-memory self-updating copy of the database (or parts of it), so I don't have to hit the database on every read? If you follow a few rules you could do that. You need to be aware that type-safe updates are not tracked with events, so if you use the they will not trigger any events other than the Transaction.OnCommit event. Also deleting a collection in one go will trigger the OnBatchFinished event handler that will not give enough detail about the individual nodes that have been deleted. If you put in the effort to work around these limitations, it's very possible. If you have ideas of how we can improve the event handlers to simplify what you are planning to do, let us know! One tip I can give is that you can track changes to the properties by hooking up the Property.Events.OnChange and track each property with its own handler, but if the transaction is rolled-back you might have an inconsistent state. Instead you can also hook the Entity.Events.OnSaved and check the changes that were actually committed there by comparing the sender against the sender.OriginalVersion

Beautiful response! I am saving this and adding it to documentation for my developers. It may be helpful to others if this reply could be stickied somewhere, so that it doesn't fall out of view once this issue is closed.

One of the things I am trying to solve for is DTO conversion, so I can have a well optimized API<->DataService layer in my app.

Let's say I have a list of movies, and related actors and directors.

The movie DTO might look something like:

public class MovieDTO{
    public string Uid {get; set;}
    public string Name {get; set;}
    public string GenreUid {get; set;}
    public string DirectorUid {get; set;} 
}

It seemed to me that the ideal place to do the DTO conversion would be in the constructor, like so:

public class MovieDTO{
    public string Uid {get; set;}
    public string Name {get; set;}
    public string GenreUid {get; set;}
    public string DirectorUid {get; set;} 

    public MovieDTO(Movie movie){
        this.Uid=movie.Uid;
        this.Name=movie.Name;
        this.GenreUid=movie.Genre.Uid;
        this.DirectorUid=movie.DirectorUid;
    }
}

There are several problems with this

The first problem is if movie members are not loaded, then a transaction is required for retrieving Genre and Director. In testing, it appears that if the Transaction is in a parent class, it doesn't get passed to the class constructor, meaning that the constructor errors.

using(Blueprint41.Transaction.Begin(){
    var movieObj=Movie.LoadByUid(uid);
    var moveiDto = new MovieDTO(movie);
}

Even if that were to work though, due to lazy loading, retrieving a list of movies would result in 3 database calls for each movie retrieved.

Doing this efficiently for a list of movies would require passing in the movie entity with members fully loaded. Maybe there is a way to do this with OptmizeFor.RecursiveSubGraphAccess. I definitely plan to test this. My hunch that it won't work, because the class constructor is outside of the transaction scope, so the transaction doesn't know which members are required, or how deep to go in the recursion to retrieve members.

I am puzzling that out. But fundamentally, I am not even sure if my approach to representing relationships in the DTO is aligned with good practice. What are the best practices for representing relationships in the DTO?

If you want the already committed part to retain even if the full transaction gets rolled back, you can simply make two transactions in a row. Commit the first transaction then create a new transaction for the rest of the changes.

I tried this, and I think I encountered a problem with using entities retrieved in the first transaction in the second transaction.

My memory on this is hazy... if I run into it again, I'll post a more specific use case to explore.

For your 1st comment, taking your example:

using(Blueprint41.Transaction.Begin(){
    var movieObj=Movie.LoadByUid(uid);
    var movieDto = new MovieDTO(movie);
}

OptmizeFor.RecursiveSubGraphAccess would work if you would do the following:

using(Blueprint41.Transaction.Begin(){
    var movieObj1=Movie.LoadByUid(uid1);
    var movieObj2=Movie.LoadByUid(uid2);
    var movieDto1 = new MovieDTO(movie);
    var movieDto2 = new MovieDTO(movie);
}

OptmizeFor.RecursiveSubGraphAccess would NOT work if you would do the following:

using(Blueprint41.Transaction.Begin(){
    var movieObj1=Movie.LoadByUid(uid1);
    var movieDto1 = new MovieDTO(movie);
    var movieObj2=Movie.LoadByUid(uid2);
    var movieDto2 = new MovieDTO(movie);
}

Reason being is that when inside the constructor of MovieDTO the referenced nodes are loaded, the transaction is checked to see if other movies were loaded in memory and the property being accessed is loaded for all movies that are in memory (and recorded in the transaction). If the 2nd movie is not yet in memory when accessing the related property, the load will only happen for 1 movie when inside the 1st constructor call. Inside the 2nd constructor call the load will happen again, but now for the 2nd movie, which was in memory by that time.

Also, the constructor is inside & down the call stack, and therefore will have access to the transaction. The transaction is within scope inside the constructor.

Your 2nd comment about starting a 2nd transaction, even though the content loaded during the 1st transaction is available inside the 2nd transaction, you cannot alter the object loaded in the first transaction after it has been committed or rolled back. So, lazy-loading additional data or setting properties will cause the error you mention to happen. You should reload the object (by its Uid for example) while inside the 2nd transaction, if you want to change it there.

I am not even sure if my approach to representing relationships in the DTO is aligned with good practice. What are the best practices for representing relationships in the DTO?

I think your idea of "passing the OGM object to the constructor of the DTO method" works well and is definitely a good solution in my mind.

How we did it to have an REST API with locking and versioning (backward compatibility) your could do it as follows:

We added a DateTime RowVersion field to all the entities, with a script like this:

Entities["xxx"].SetRowVersionField("VersionStamp");

Optimistic locking will then be supported in the REST API using the following flow:

GET item from REST API which includes the lock (=current RowVersion field value)
Edit item on the caller side, without changing the lock value
Send item, including the lock, back to the REST API using PUT
Api serializes the DTO, including the lock to its OGM object and commits the transaction

If the lock doesn't match the lock in the database, it means someone else changed the node while it was being edited but before it was saved in the graph. This means the optimistic lock apparently failed and the API should not be permitted to overwrite the edited node. This checking of the RowVersion lock is already included in Blueprint41 and comes for free if you add a RowVersion field to your entity. If the lock failed, an Exception will be throw inside the transaction commit and the transaction is automatically rolled back when it goes out of scope. The API then sends the client/caller a "failed" status with the reason "optimistic lock failed". The client can now restart at step 1 and get the item with its updated values, including the updated lock value.

Such an API needs conversion of both OGM to DTO and DTO to OGM. Of course adding a constructor to the OGM object is not how it's supposed to be done, so what we did is to add an explicit cast to the DTO to cast from OGM object to DTO object. We also also add an implicit cast from DTO to OGM.

The explicit cast is useful to support multiple versions of the REST API and the DTO's it uses. This way we were able to have a single database model with backward compatibility via a versioned REST API.

If you have more questions or simply want to brainstorm a bit about how this translates to your code base, feel free to contact me on Discord

circles-arrows / blueprint41

A bunch'a questions #60