pilagod / gorm-cursor-paginator

A paginator doing cursor-based pagination based on GORM
https://github.com/pilagod/gorm-cursor-paginator
MIT License
192 stars 44 forks source link

Using data inside JSON columns as cursors #42

Closed nikicc closed 2 years ago

nikicc commented 2 years ago

Question / issue description

AFAIK the pagination does not currently work over data fields inside JSON columns, correct?

Details

For example, let's say I have the following struct:

type Item struct {
    ID     string       `gorm:"primaryKey,column:id" json:"id"`
    Value  string       `gorm:"column:value" json:"value"`
    Number float64      `gorm:"column:number" json:"number"`
    Data   pgtype.JSONB `gorm:"column:data" json:"data"`
}

and let's say that Data column contains something like this:

{
    "key1": "val1",
    "key2": "val2",
}

Would it be possible to make the pagination sort the items based on the key1 descending? Notice that key1 is only a field inside the JSONB column (data), and not a separate column by itself.

My attempts

My first though was to use the SQLRepr attribute on the paginator.Rule struct. For example something like this:

paginator.Rule{
    SQLRepr:         "data #>> '{key1}'",
    Order:           "desc",
    NULLReplacement: "",
}

With this, I was able to sort the items on the first page, but I think that the cursor is not constructed in way to be able to support parsing values from JSONs and consequently when I jump to the second page, no items are returned.

I might have some time to work on this in the future, but wanted to reach our first to gather some ideas how we would make this work?

pilagod commented 2 years ago

Hi @nikicc, thanks for your comprehensive description and example. This is really a cool feature if we can support.

In fact, I hadn't ever used json field before, so it is not considered into this library design at first. So nice to have you to make this library able to evolve in a fantastic way 😎

I need some time to get familiar with this case to be able to give out feedbacks of any implementation thoughts.

Let's keep syncing here when we have any findings to push this feature forward 🧐

nikicc commented 2 years ago

Agreed, thanks for considering this!

pilagod commented 2 years ago

After some research, I found this problem is more complicate than I think. It is more about custom data types described in the gorm doc, and JSON type is a case of this.

I will share my thoughts of (1) current workaround and (2) support custom types

  1. Current workaround

    Since cursor encoder & decoder cannot identify types other than primitives, in order to paginate by nested columns, one way is to populate that column into top model.

    Using your case as example, in order to paginate by key1 under data field, you could define a read model for paginator like below to populate key1 to top model:

    type ItemForPaginating struct {
        Item
        Key1 string
    }
    
    func (ItemForPaginating) TableName() string {
        return "items"
    }

    Then you could configure paginator rule according to ItemForPaginating:

    paginator.Rule{
        Key:             "Key1"
        SQLRepr:         "data->>'key1'",
        Order:           "desc",
        NULLReplacement: "",
    }

    One caveat for this workaround is that you have to manually select key1 out in the select query to let GORM map its value to ItemForPaginating.Key1 for you:

    var items []ItemForPaginating 
    
    query := db
        .Table("items")
        .Select("id, value, number, data, data->>'key1' as key1")
    
    p.Paginate(query, &items)
  2. Support custom types

    It's a bit complicate to support custom types since arbitrary data format or structure would store underlying them, in the JSON example is the bytes[] data. How could cursor encoder & decoder automatically derive the right paging value from the model struct is a challenge.

    I only come up a rough idea that we might need to open new interface for developer to feedback right field value in the custom type to cursor encoder & decoder, which may look like:

    paginator.Rule{
        Key:             "Data"
        SQLRepr:         "data->>'key1'",    
        CustomTypeMeta:  ".key1"
    }

    And for the custom type, maybe we need an interface to convert the custom meta into meaningful value:

    type JSON pgtype.JSONB
    
    // GetValue a new interface to enable custom type for paginator
    func (j JSON) GetValue(meta: string) interface{} {
        // In this example `meta` would be ".key1"
        // The decoding & mapping logic should be implemented by custom type provider
    }

    Then at paginator side, when we detect field is custom type, we can use GetValue to let custom type provide us a right value to use in paginating:

    // cursor encoder line 34
    f := rv.FieldByName(key)
    // ...
    if e.isNilable(f) && f.IsZero() {
        fields[i] = nil
    } else if isCustomType(f) {
        // Just for presenting thought, not real code to work
        v := f.Interface().GetValue(e.getRule(key).CustomTypeMeta)
        fields[i] = util.ReflectValue(v).Interface()
    } else {
        fields[i] = util.ReflectValue(f).Interface()
    }

    This seems a not easy work to implement, and it will also make paginator interface more complicate to the client. I personally would prefer not to complicate the interface too much and keep it as simple as possible at current stage. 😵‍💫

Please feel free to let me know if you have any feedback or idea coming from the above cases 💪

P.S. I just mis-clicked the "close with comment" button, please ignore the log 🤦‍♂️🥲

nikicc commented 2 years ago

@pilagod thanks for all the research! This is super useful 😎

Since cursor encoder & decoder cannot identify types other than primitives, in order to paginate by nested columns, one way is to populate that column into top model.

Yes, this seems like a viable way. However, in this scenario you need to be aware of the structure in the JSON (i.e. you need to know beforehand the keys that are inside the JSON, key1 in this example) as this needs to be handled in the code in order to make it happen. Also, I'm afraid this does not scale well if you want to support pagination on multiple fields inside the JSON — as all of those would need to be defined on the top struct, which somehow defeats the purpose of using JSON in the first place (assuming the main reason for JSON, instead of separate columns, was the ability to support the structure of the data changing through time).

This seems a not easy work to implement, and it will also make paginator interface more complicate to the client. I personally would prefer not to complicate the interface too much and keep it as simple as possible at current stage. 😵‍💫

Complicating the client is a valid concern. If we would document this clearly to explain that CustomTypeMeta should only be used when paginating on custom types (e.g. JSON, arrays, etc), would you think it would still be problematic?

I think that the support for custom types could be a nice addition as that would allow the pagination across JSON columns more easily and "less hacky" than the current workaround. It also has the potential to be used for other types than JSON, one thing that comes to mind are arrays.

If I would take a stab at implementing something similar to the idea you described in Support custom types and open a PR, would you be open to considering & merging that?

pilagod commented 2 years ago

@nikicc Sorry for the late reply

You done so quickly!! That's impressive!! I agree with you that a clear documentation would be a useful tool to help user onboarding.

PR is always welcome and this is the charm of open source project 😎 I will come to understand the PR and give feedback to you as soon as possible.

pilagod commented 2 years ago

@nikicc I just added a new tag v2.3.0, you can try it now 🎉 🙌

nikicc commented 2 years ago

@pilagod amazing, thanks! Going to try immediately 🎉