stefankroes / ancestry

Organise ActiveRecord model into a tree structure
MIT License
3.74k stars 462 forks source link

Adding a "/" at the end can reduce the amount of "OR" queries #563

Closed KaviiSuri closed 2 years ago

KaviiSuri commented 2 years ago

It is pretty common in my experience to get all descendants of a record in the tree. Currently, a descendant of a record can have "ancestry" column of the following forms:

The "OR" in the query seems unnecessary and can be easily removed if we just add "/" at the end of ancestry always, effectively changing the forms to

Is there a specific reason ancestry doesn't do this and has decided to not have a "/" at the end of single ancestors?

kbrock commented 2 years ago

Outstanding resources

The research materials/presentations for tree structures in a number of programming languages all had the same implementation. And that implementation without the trailing slash looks obviously wrong.

Implementation

I have been pulling the materialized path implementation into a separate class so we could support multiple tree implementations: traditional materialized path, the trailing slash, postgres arrays, ltree, path with the id in it.

Check out: https://github.com/kbrock/ancestry/tree/materialized_path2 I just rebased so it should be in a usable state.

Results

When I ran the numbers for trailing slash it was not any faster.

Issues I see:

Future steps

While I totally agree that the current implementation that everyone uses is probably wrong, I would like to actually know which actions/queries use indexes and which table scan.

So I keep getting derailed by wanting a test suite to actually test this stuff out.

dimvic commented 2 years ago

@kbrock Were the results using the pg array approach better? In fact, I was looking into this not because of speed, but because having a trailing and maybe a leading slash would make implementing custom scopes a much cleaner and easier task.

kbrock commented 2 years ago

@dimvic Yes. I too want a way to determine the root_id, parent_id, ancestor_ids, and child_ancestry values in pure sql without resorting to using a CASE. The current implementation sure leans towards wanting to use ruby to parse.

I want to use a serializer to parse/create the ancestry value. If we do that, then the value of ancestry would be an array - basically the same as ancestor_ids. It would remove a lot of complexity. But I don't know how much this breaks the code that uses ancestry. I know in our code, there are some assumptions that ancestry is a string with a / delimiter.

Very tempted to breaking this functionality in the next major version of ancestry. But I'm hesitant too

kbrock commented 2 years ago

Let me know if you end up trying out the strategy of materialized_path2 and if it works alright for you.