supabase / postgrest-js

Isomorphic JavaScript client for PostgREST.
https://supabase.com
MIT License
1.06k stars 140 forks source link

feat: Support COUNT of relevant rows #94

Closed soedirgo closed 3 years ago

soedirgo commented 4 years ago

Feature request

Is your feature request related to a problem? Please describe.

I'd like to perform COUNT(*) on relevant rows.

Describe the solution you'd like

A PostgrestTransformBuilder method. PostgREST supports this.

stupidawesome commented 4 years ago

Postgrest reports counts in the Content-Range header. This has to be requested by setting Prefer: count={strategy} where {strategy} is one of exact, planned or estimated.

range() could be modified to accept a count option as this is useful for displaying search results with a paginator.

range(
        from: number,
        to: number,
        {
            count,
            foreignTable,
        }: {
            foreignTable?: string
            count?: "exact" | "estimated" | "planned"
        } = {},
    ): PostgrestTransformBuilder<T> {
        const keyOffset =
            typeof foreignTable === "undefined"
                ? "offset"
                : `"${foreignTable}".offset`
        const keyLimit =
            typeof foreignTable === "undefined"
                ? "limit"
                : `"${foreignTable}".limit`
        this.url.searchParams.set(keyOffset, `${from}`)
        // Range is inclusive, so add 1
        this.url.searchParams.set(keyLimit, `${to - from + 1}`)
        if (count) {
            this.headers["Prefer"] = `count=${count}`
        }
        return this
    }

Then in the fetch promise could return this as part of the response:

const contentRange = res.headers.get("Content-Range")
if (contentRange) {
    const parts = contentRange.split(/[-\/]/)
    // Note: this doesn't not handle responses like `0/*`
    range = {
        from: parseInt(parts[0], 10) + 1, // zero-based index
        to: parseInt(parts[1], 10) + 1, // zero-based index
        total: parseInt(parts[2], 10),
    }
}

return {
    range,
    data,
    status,
    ...etc
}

This would involve modifying the response format

soedirgo commented 4 years ago

Hey @stupidawesome! This issue is for getting just the COUNT using a HEAD request, as laid out here.

I'd like to keep the library simple, so we don't plan to expose the HTTP response headers.

dshukertjr commented 3 years ago

Hi! First of all, thank you for creating such an amazing product. Supabase is what exactly I was looking for.

Is this issue moving forward? I would like to have a count feature in Supabase. If it's pending right now, I was thinking I could tackle this issue.

soedirgo commented 3 years ago

Thanks for hopping in @dshukertjr, this feature is long overdue...

I'm starting to think that it might be worth modifying the response format, adding a count field. I can't see any way of supporting this otherwise.

There are two problems here:

  1. getting a COUNT(*) with a HEAD request (no data), and
  2. getting a count of the selected/modified set of rows.

The solution I'm thinking of is similar to how we now handle select: (1) is done in PostgrestQueryBuilder (i.e. like select(), insert(), ...), while (2) is done in PostgrestTransformBuilder (i.e. like order(), range(), ...). So to do PostgREST's HEAD request we do:

const { count } = await postgrest
    .from('table')
    .count('exact') // or 'planned', or 'estimated'. What should be the default here?
    .gt('id', 10)

And to get a count of the selected/modified rows (this is useful for our table editor pagination) we do:

const { error, data, count } = await postgrest
    .from('table')
    .select()
    .gt('id', 10)
    .count('exact')

I'm not sure if we need from and to, I can't think of a case when these can't be easily computed with count and from (that the user specifies in range()).

Would love to hear your thoughts @kiwicopple @steve-chavez @thorwebdev

kiwicopple commented 3 years ago

Just checking - the count() function is simply going to pluck the value out of the header right?

soedirgo commented 3 years ago

Yup, it just uses Content-Range, no additional query done.

kiwicopple commented 3 years ago

OK cool. IMO it seems redundant to add a whole function when it's still fetching the results from the database just for the headers. This one seems a natural extension of select()

const { data, error, range, count } = supabase
  .from('table')
  .select('*', { count: 'exact' })

console.log(range) // [0, 24] (?)
console.log(count) // 1000

Does that work?

// or 'planned', or 'estimated'. What should be the default here?

It's a good question. We're seeing a lot of people confused by our "row counts" in the dashboard becuase they are estimates on the tuples. Even though it's slower, I think the principle of least surprise applies here - we should use exact as a default.

Once again this is breaking the PostgREST defaults, but we are seeing that people are coming in with small databases, then growing. In the time that it takes for them to grow their database, they can learn about performance optimizations like using planned. We could also add it as a param in the constructor which can be overridden.

wdyt?

steve-chavez commented 3 years ago

const { error, data, count } = await postgrest .from('table') .select() .gt('id', 10) .count('exact')

I like the extra count field in the response since it would work for every operation.

Exact count is supported for insert/delete/update/rpc. And planned plus estimated might come to RPC. So the interface seems fit.

const { count } = await postgrest .from('table') .count('exact') // or 'planned', or 'estimated'. What should be the default here? .gt('id', 10)

The problem with that interface is that it doesn't specify the operation. It assumes count only works for GET.

For HEAD, how about having an extra modifier that nulls the data and error fields?

const { error, data, count } = await postgrest
    .from('table')
    .select()
    .gt('id', 10)
    .count('exact')
    .head()

console.log(error, data, count) // null null 100
steve-chavez commented 3 years ago

For HEAD, how about having an extra modifier that nulls the data and error fields?

Actually scratch that, HEAD only makes sense for select and rpc.

How about having head as parameters of both:

// for select
const { data, error, count } = postgrest
  .from('table')
  .select('*', { head: true })

// for rpc
const {data, error, count} = postgrest
  .rpc('get_status', { name_param: 'supabot' }, {head: true})

Both would null the data and error fields.

we should use exact as a default.

Agree with exact as default.

soedirgo commented 3 years ago

This one seems a natural extension of select()

Makes sense, since we already have a bunch of Prefer-related stuff in each respective CRUD method. Though with that approach, if we want exact to be the default, maybe we should also allow null for if we don't need the count? If the count is not very expensive this might be OK.

HEAD only makes sense for select and rpc.

Didn't know you can do HEAD on rpc. Sounds good to me.

kiwicopple commented 3 years ago

yeah head is cool! Agree with this approach 👍

dshukertjr commented 3 years ago

Correct me if I am talking gibberish, but with this method, would you be able to query count value of joined table?

For example, let's say you are developing a twitter clone, and you want to retrieve a users profile as well as the tweet count of the user to display them on a profile page. Something along the line of

const { data, error } = await postgrest
    .from('users')
    .select(`
        name,
        tweets (
          count(*) as count
        )
  `)

I guess I would have to send two request to achieve this huh? One to get the user profile, and one to get the tweet count.

dshukertjr commented 3 years ago

Ignoring what I am saying about counts from foreign table, to summarize the discussion in this thread, it will look like this.

Getting count along with table data:

const { data, error, count } = await postgrest
    .from('table')
    .select('*')  
    .gt('id', 10)
    .count() // 'exact' by default. 'estimate', 'planned' and null are also accepted

Only getting the count:

const { data, error, count } = await postgrest
    .from('table')
    .select('*', {head: true})  
    .gt('id', 10)
    .count() // 'exact' by default. 'estimate', 'planned' and null are also accepted

console.log(data, error, count) // data and error are null. count contains row count. 

Do I understand it correctly?

steve-chavez commented 3 years ago

Correct me if I am talking gibberish, but with this method, would you be able to query count value of joined table?

@dshukertjr No, this interface is just for counting the root table.

I guess I would have to send two request to achieve this huh?

Not necessarily, you can always create a VIEW or FUNCTION that has the aggregate to do it in one request.

There's also another option, that is currently undocumented in postgrest since it's not complete. Basically:

const { data, error } = await postgrest
    .from('users')
    .select(`
        name,
        tweets (
          count
        )
  `)

That will get the tweets count as you'd expect. However if you add more attributes(like tweets(content,date,count)) it will fail.

Calling other aggregate functions with group by is a future enhancement.

dshukertjr commented 3 years ago

Thanks @steve-chavez for pointing me to the right direction.

Regarding this post, does this look like what everyone had in mind? I could try to write a pull request for this.

soedirgo commented 3 years ago

That'd be awesome @dshukertjr! 👍

The only change is to make count an option on select(), insert(), update(), delete(), and rpc() instead of a function, mostly because other similar stuff are also done as options (returning, onConflict).

steve-chavez commented 3 years ago

I could try to write a pull request for this.

@dshukertjr Great! Go ahead!

In case it helps, here's an example of joining both the head and count ideas:

const { data, error, count } = postgrest
  .from('table')
  .select('*', { head: true, count: 'exact' })

Also agree with soedirgo's idea about count being null by default. There's a perf loss(16% loss in throughput with this setup) when adding count= exact. It would be wasteful to have it by default.

dshukertjr commented 3 years ago

Thanks @soedirgo and @steve-chavez. I will add this feature and create a pull request.

dshukertjr commented 3 years ago

@steve-chavez I guess I am a bit confused with what the head option does.

I do understand that it will turn data and error null, but how would it look like behind the scenes? Would postgrest-js still fetch the data and just not return the response body as data, or is there a flag or something to not get the response body to begin with?

steve-chavez commented 3 years ago

@dshukertjr No, it would execute an HTTP HEAD request.

See how a GET method is specified here. You'd have to change that to HEAD conditionally.

dshukertjr commented 3 years ago

@steve-chavez Got it! Thank you!

dshukertjr commented 3 years ago

I may have a question regarding this.

Having head and count as an option for select makes sense to me, but having those options for all the other methods listed do not make sense to me. Could you provide a simple example situation of when you would want to use head or count options for method other than select()?

steve-chavez commented 3 years ago

@dshukertjr For RPC, there are set returning functions(return multiple rows) that can be called. A simple one:

create function getallprojects() returns setof projects as $$
select * from test.projects;
$$ language sql;

Using count there makes the same sense as using it for a select. Also head serves the same need(only interested in the count and not the body).

For POST(insert), you'd be interested in the count when bulk inserting many rows.

For UPDATE/DELETE, the count is useful for knowing how many rows you updated/deleted in one request.

dshukertjr commented 3 years ago

@steve-chavez Thanks for the clarification! I will start implementing count and head to other methods once I implement the tests for select() in the pull request!

soedirgo commented 3 years ago

@steve-chavez head is only for select() and rpc() though right? For the others you'd use returning: minimal instead.

steve-chavez commented 3 years ago

@soedirgo Yes. Now that you mention it, returning:minimal is like HEAD for insert/update/delete.

I will start implementing count and head to other methods

@dshukertjr head only for select and rpc :-)

dshukertjr commented 3 years ago

Okay, to summarize, it sounds like head for select and rpc, count for everything. I will get to it!

soedirgo commented 3 years ago

Closed by #147 🚀

kiwicopple commented 3 years ago

Amazing job @dshukertjr 🚀

dshukertjr commented 3 years ago

@kiwicopple Thank you so much! I'm glad that I was able to contribute!

johndpope commented 2 years ago

https://supabase.com/docs/reference/javascript/select

const { data, error, count } = await supabase
  .from('cities')
  .select('name', { count: 'exact' }) 
lauridskern commented 1 year ago

is there a way to use this with filtering? eg. only counting rows that have a specific user id

johndpope commented 1 year ago

should be fine. just tack on .eq("user_id",1) at the end