rcxwhiz / django-hierarchical-models

MIT License
1 stars 0 forks source link

Go back and add optimized model methods #25

Closed rcxwhiz closed 4 months ago

rcxwhiz commented 4 months ago

There are some models I added which could have further optimized methods. I am not planning on going back and adding those for 1.0. I would like to go ahead and get that out, and then start going back and fixing this stuff.

rcxwhiz commented 4 months ago

After doing some initial benchmarks, there are some performance issues that need to be addressed.

First is the NestedSetModel performance is much, much worse than expected across the board. At a minimum there needs to be some indexing or something going on. The querying is relatively worse than the editing, which is unexpected. There are even things like the test that just adds a bunch of nodes, NSM gets hung up on that and it must be from searching every single instance to find the highest _right value over and over again.

The children query on the PathEnumerationModel looks a tad slow. I think there might be something to be done there where you can get all the children, then order them by depth. Delete parent looks like it's going really well.

The only thing with AdjacencyListModel is the set parent looks a little long. I expected it to be a little long, but this is a little longer than that. I don't know what the issue would be there off the top of my head.

rcxwhiz commented 4 months ago

The methods that still need improvement: PathEnumerationModel

NestedSetModel

My prediction is that in the end only three models will make sense for various reasons. AdjacnencyListModel for when you need fast edit performance (with unchecked functions), PathEnumerationModel when you need better performance at higher numbers of instances (especially once I get that improved children() method going, and some sort of string based path enumeration model to use as a fallback with oracle and sqlite databases. In the end if there isn't a significant difference between them, it could all come down to one model, but that is yet to be seen.

rcxwhiz commented 4 months ago

After doing more benchmarking and more testing, it would appear that I have been pretty wrong about what my bottlenecks are. It turns out that calls to refresh_from_db are very costly. By cutting them out I was able to shorten most ALM calls by 97%. I can do this with ALM because that stuff is automatically handled by the database, so I don't have to worry about synchronizing stuff between instances. It is for this reason that I don't think the other models are ever going to be as fast as this one. In theory they are just as good, if not better, but the amount of database queries you are introducing with the constant refresh_from_db calls cannot be overcome.

I think what this is going to call for is a 2.0 release that no longer has multiple implementations and is no longer db dependent in any way. Just one really good ALM implementation. As far as doing this at the ORM level goes, you can't beat the fact that ALM only needs the built in functionality.

rcxwhiz commented 4 months ago

Time to merge this as soon as tests are passing, release 1.2.0, then move on to 2.0.