stefankroes / ancestry

Organise ActiveRecord model into a tree structure
MIT License
3.73k stars 461 forks source link

anyone could give clarity to the readme? #593

Closed laptopmutia closed 1 year ago

laptopmutia commented 1 year ago

add-ancestry-column-to-your-table

Depending upon your comfort with databases, you may want to create the column with C or POSIX encoding. This is a more primitive encoding and just compares bytes. Since this column will just contains numbers and slashes, it works much better. It also works better for the uuid case as well.

Alternatively, if you create a text_pattern_ops index for your postgresql column, subtree selection will use an efficient index for you regardless of whether you created the column with POSIX encoding.

If you opt out of this, and are trying to run tests on postgres, you may need to set the environment variable COLLATE_SYMBOLS=false. Sorry to say that a discussion on this topic is out of scope. The important take away is postgres sort order is not consistent across operating systems but other databases do not have this same issue.

I got confused by this

I'm using postgresql on LINUX should I change my database table into using C or POSIX instead en_US.UTF-8 ?

laptopmutia commented 1 year ago

I found this PR though https://github.com/stefankroes/ancestry/pull/543/files

I think I could just ignore the C and POSIX but adding COLLATE_SYMBOLS=false when I testing is that true?

kbrock commented 1 year ago

hello @laptopmutia

The basic premise of materialized path is to look for models via "table"."ancestry" LIKE "/1/2/%". The index starts on the left and goes to the right, so it is able to lookup "all children" very easily since they are consecutive in the index.

This wonderful multiple byte character world has thrown a wrench into postgres's plans. Some collations (read: sort order) ignore symbols. so /1/2/3/4 is treated like 1234. and /12/34 is treated the same. Uppercase and lowercase can be sorted the same. This is a very different world from simply looking at the bytes and saying character number 65 is after character number 64.

The suggestion is to tell postgres to tread this column like a simple primitive byte string.

The COLLATE_SYMBOLS=false statement is for testing only. It tells the testing component that the collation is ignoring symbols and it will find that /12/32 will be returned from sql before /1/2/3/4.

kbrock commented 1 year ago

I am currently testing with :encoding => 'LATIN1'. It seems to run fine in postgres and mysql so we can see where it goes from there

kbrock commented 1 year ago

please ping me if you have any more questions

kbrock commented 1 year ago

@laptopmutia I have revisited this and tweaked the readme in #601

laptopmutia commented 1 year ago

@laptopmutia I have revisited this and tweaked the readme in #601

that is really make it clear for me thank you