Closed dhalperi closed 9 years ago
Yeah, this was a pain.
Can't we add a row number?
SELECT
( SELECT COUNT(*) + 1
FROM "queries"
WHERE id < t.id AND [condition]
) as rowIndex,
query, [columns]
FROM "queries"
WHERE [condition]
ORDER BY id;
whew, I think that nested subquery has bad complexity... effectively O(n^2)?
Daniel Halperin Director of Research for Scalable Data Analytics eScience Institute University of Washington
On Tue, Dec 9, 2014 at 11:02 AM, Dominik Moritz notifications@github.com wrote:
Yeah, this was a pain https://github.com/uwescience/myria-web/blob/master/appengine/myria_web_main.py#L326.
Can't we add a row number?
SELECT ( SELECT COUNT(*) + 1 FROM "queries" WHERE id < t.id AND [condition] ) as rowIndex, query, [columns]FROM "queries"WHERE [condition]ORDER BY id;
— Reply to this email directly or view it on GitHub https://github.com/uwescience/myria-web/issues/232#issuecomment-66337460 .
Or add pagination information to the json response... Something like
{
"max": 42,
"limit": 10,
"numQueries": 154,
"queries": [
...
]
}
@domoritz the key point is that given that information I still can't jump to an arbitrary page. From those numbers, I'm on at least the 11th page (there are 16 pages total with 154 results and 10/page), and there are at most 42 queries "after" the current max. How do I know what query ID starts page 10? I know it's some number >= 52, but I can't tell any more than that. FWICT.
Max in this case is the result, not the query id. Which is not great because the page may change if new queries come in.
Btw, for the previous suggestion, see github API:
> GET /search/code?q=addClass+user:mozilla HTTP/1.1
> User-Agent: curl/7.37.1
> Host: api.github.com
> Accept: */*
>
< HTTP/1.1 200 OK
* Server GitHub.com is not blacklisted
< Server: GitHub.com
< Date: Tue, 09 Dec 2014 19:10:48 GMT
< Content-Type: application/json; charset=utf-8
< Status: 200 OK
< X-RateLimit-Limit: 10
< X-RateLimit-Remaining: 8
< X-RateLimit-Reset: 1418152288
< Cache-Control: no-cache
< X-GitHub-Media-Type: github.v3
< Link: <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=2>; rel="next", <https://api.github.com/search/code?q=addClass+user%3Amozilla&page=34>; rel="last"
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: deny
< Content-Security-Policy: default-src 'none'
< Content-Length: 157484
< Access-Control-Allow-Credentials: true
< Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
< Access-Control-Allow-Origin: *
< X-GitHub-Request-Id: 80D00769:6714:1402960:54874937
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Content-Type-Options: nosniff
< Vary: Accept-Encoding
< X-Served-By: 474556b853193c38f1b14328ce2d1b7d
<
{
"total_count": 1970,
"incomplete_results": false,
"items": [
{}
]
}
Okay, I'm fine with no jumping for search results. Just forward and backwards.
What about backwards pagination? This way the results for a page stay the same. However, it's impossible to jump to a page with a specific query and see what happened before and after it.
I'm still for pagination without jumping for search results.
Looking at GitHub, it looks like they use page=P
to indicate which page of the results you get, which would have the results-changing issue. Not a huge deal, but I get it.
Backwards pagination: harder to compute, but doable! I'll think about it.
I only know how to do forwards (earlier queries) pagination with search.
Consider a hypothetical result set (just the query ids): for this query: ?q=twitter&limit=10&max=1000
[997, 944, 10, 9, 8, 7, 6, 5, 4, 3]
The forwards pagination is: ?q=twitter&limit=10&max=2
because 2 is the smallest query id in the result set (3
) minus 1.
The backwards pagination is not computable here. So we could have a next
arrow, but the back button would be a browser-level back?
See #237 for the resolution.
FYI: @domoritz @stechu
Right now we paginate the queries using
N
andP
--N
is the ID of the last query, andP
is the number of results per page. We can figure out the query ID range of pagei
by[N-P*(i-1), N-P*i)
.However, this will not work for filtered results, e.g., "only queries with
twitter
in the program". What should we do instead?P
results with no customizationnext
-- where we look for the nextP
queries starting atmax = min(current result set query id)-1
but no back or jumping.use SQL to actually give us the
[i*P, (i-1)*P)
results. E.g., something likeOpinions?