Closed: erlingur closed this issue 5 months ago.
Thanks for your report.
I was trying to reproduce it with your chosen database and with the command specified, but got this error:
`Database 'pagy_repro' does not exist. Make sure that the name is entered correctly. (TinyTds::Error)`
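As an aside, a minimal sketch for creating the missing database up front with a bare TinyTds client (the sa credentials and port are taken from the Docker command quoted at the end of this thread; the updated `rails.ru` mentioned below reportedly handles this automatically):

```ruby
# Sketch only: create the database the repro script expects before running it.
require "tiny_tds"

client = TinyTds::Client.new(
  username: "sa",
  password: "Password.1",   # MSSQL_SA_PASSWORD from the docker run command
  host:     "localhost",
  port:     1433
)
client.execute("CREATE DATABASE pagy_repro").do
```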
If you have the code handy, adding it to the lines above would help enormously.
I've tried my best to make this as easily reproducible as possible, but because of the nature of the issue this is the best I could come up with.
You certainly did, @erlingur! Thank you for your report.
That commit should have improved the performance of a probably quite rare case, and it was added thinking that it wouldn't hurt anything else.
Unluckily that is almost never the case 🤷.
At worst we will switch back to passing the items and add a troubleshooting entry explaining how to improve the performance by overriding, only when really needed.
BTW, I didn't try to run your command, so you may still want to help @benkoshy to reproduce it.
Thank you
@jules-w2
Hey @ddnexus
I'm not sure it's a "rare" case. The case occurs on the last page, when there are millions of entries in the database and the "limit" is greater than the number of results that remain to be displayed.
I think it would be easy to solve all cases with:
`collection.offset(pagy.offset).limit(pagy.in > 0 ? pagy.in : pagy.items)`
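A minimal sketch of where that guard could live, assuming the same `pagy_get_items` override hook shown later in this thread, and assuming `pagy.in` is the number of records on the current page (0 on an empty page):

```ruby
# Sketch only: fall back to pagy.items whenever pagy.in would produce LIMIT 0,
# which MS SQL Server rejects, while keeping the smaller LIMIT on the last page.
def pagy_get_items(collection, pagy)
  limit = pagy.in > 0 ? pagy.in : pagy.items
  collection.offset(pagy.offset).limit(limit)
end
```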
Best,
@jules-w2 about the rarity of the case: in practice we have no idea one way or another.
But my educated guess would suggest that it could be your own setup, app, DB type, settings or query, in combination with who knows what... even a bug in the particular DB version/driver/ORM. Because one thing is certain: there have been no other complaints EVER in 7 years so far.
And when there are so many possible combinations, no other complaints ever, and no objective, reproducible culprit that we can pin down, I am comfortable considering that it IS a rare case until proven common.
So adding a condition to every request, of every app, forever, without any objective, measurable reason or proof, and considering that the benefit is only to improve the performance of the very last page in DBs with zillions of records and the right conditions and star alignments... it's quite against my religion 🤷
Of course, if you can prove that the change is worth it, I will be happy to change my mind.
@benkoshy @ddnexus Thank you for taking a look! I've updated the `rails.ru` file so it should now create the database for you if it doesn't exist. It works on my machine so hopefully it should work on yours :)
@ddnexus
As soon as I find free time, I will look in depth at the origin of the slowness to understand exactly what is happening.
Excellent: I can confirm reproduction of the issue.
DEBUG -- : (1.5ms) IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
INFO -- : Started GET "/" for 127.0.0.1 at 2024-05-24 06:06:23 +1000
DEBUG -- : Comment Count (1.4ms) SELECT COUNT(*) FROM [comments]
DEBUG -- : Comment Load (2.5ms) EXEC sp_executesql N'SELECT [comments].* FROM [comments] ORDER BY [comments].[id] ASC OFFSET @0 ROWS FETCH NEXT @1 ROWS ONLY', N'@0 int, @1 int', @0 = 0, @1 = 0 [["OFFSET", nil], ["LIMIT", nil]]
F, [2024-05-24T06:06:23.977388 #405730] FATAL -- :
ActionView::Template::Error (TinyTds::Error: The number of rows provided for a FETCH clause must be greater then zero.):
54:
55: <h3>Collection</h3>
56: <div id="records" class="collection">
57: <% @comments.each do |comment| %>
58: <p style="margin: 0;"><%= comment.body %></p>
59: <% end %>
60: </div>
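For what it's worth, the same failure can presumably be triggered without Pagy at all, straight from the adapter, since the log above shows the query being built with `FETCH NEXT 0 ROWS`. A hedged sketch using the `Comment` model from the repro script:

```ruby
# Sketch only: on the SQL Server adapter, limit(0) compiles to
# "OFFSET 0 ROWS FETCH NEXT 0 ROWS ONLY", which TinyTds rejects.
Comment.order(:id).offset(0).limit(0).to_a
# raises the TinyTds::Error above (wrapped by ActiveRecord)

Comment.order(:id).offset(0).limit(10).to_a # works, returns up to 10 rows
```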
Thank you @benkoshy
In order to avoid any surprise, please could you also check the other way around in the same environment, i.e. changing `pagy.in` back to `pagy.items` and trying the exact same query?
As correctly predicted: using `pagy.items` simply works, even though there are no items in the collection. (I tested via overriding in the script rather than referencing a fork of the gem that uses `pagy.items`.)
```ruby
# Controllers
class CommentsController < ActionController::Base # :nodoc:
  include Rails.application.routes.url_helpers
  include Pagy::Backend

  def index
    @pagy, @comments = pagy(Comment.all)
    render inline: TEMPLATE
  end

  # Override the backend hook so the LIMIT uses pagy.items instead of pagy.in
  def pagy_get_items(collection, pagy)
    collection.offset(pagy.offset).limit(pagy.items)
  end
end
```
DEBUG -- : Comment Count (2.2ms) SELECT COUNT(*) FROM [comments]
DEBUG -- : Comment Load (2.8ms) EXEC sp_executesql N'SELECT [comments].* FROM [comments] ORDER BY [comments].[id] ASC OFFSET @0 ROWS FETCH NEXT @1 ROWS ONLY', N'@0 int, @1 int', @0 = 0, @1 = 10 [["OFFSET", nil], ["LIMIT", nil]]
127.0.0.1 - - [24/May/2024:08:51:54 +1000] "GET / HTTP/1.1" 200 - 6.4252
I, [2024-05-24T08:51:54.192835 #440730] INFO -- : Started GET "/javascript" for 127.0.0.1 at 2024-05-24 08:51:54 +1000
127.0.0.1 - - [24/May/2024:08:51:54 +1000] "GET /javascript HTTP/1.1" 200 - 0.0550
Thank you @benkoshy.
At this point, while waiting to know more about the actual cause of the last-page delay in @jules-w2's case, we should go back to `pagy.items`.
We should also create a new entry in the troubleshooting page, referencing the case and the overriding.
Hey @ddnexus
After further investigation, I found that this issue does not originate from Rails, Ransack, Pagy, etc., but directly from MySQL. You should be able to reproduce it on your side easily if you have a table with a large amount of data.
I am working with MariaDB 11.2.3:
SELECT VERSION();
+----------------+
| VERSION()      |
+----------------+
| 11.2.3-MariaDB |
+----------------+
1 row in set (0.000 sec)
I have 14,553 results for my query:
SELECT COUNT(*) FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00');
+----------+
| COUNT(*) |
+----------+
|    14553 |
+----------+
1 row in set (0.380 sec)
With limit = `pagy.in` (in this case 3):
SELECT DISTINCT * FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00') ORDER BY `aws_emails`.`id` DESC LIMIT 3 OFFSET 14550;
3 rows in set (0.377 sec)
The same query with LIMIT 4:
SELECT DISTINCT * FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00') ORDER BY `aws_emails`.`id` DESC LIMIT 4 OFFSET 14550;
3 rows in set (51.727 sec)
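For reference, the boundary arithmetic of the two queries above: with 14,553 matching rows, OFFSET 14550 + LIMIT 3 = 14553 lands exactly on the row count, while OFFSET 14550 + LIMIT 4 = 14554 asks for one row past the end. That single-row overshoot is the only difference between the fast and the slow query.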
@ddnexus are you able to reproduce this case?
I found the reason:
"MariaDB has to find all [OFFSET+LIMIT] rows, step over the first [OFFSET], then deliver the [LIMIT] for that distant page."
https://emmer.dev/blog/the-dangers-of-offset-with-mysql
https://mariadb.com/kb/en/pagination-optimization/#the-problem
https://planetscale.com/blog/mysql-pagination
https://github.com/planetscale/fast_page?tab=readme-ov-file#pagy
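For reference, the pattern those articles recommend is a deferred lookup: page over the ids first, then fetch the full rows. A minimal ActiveRecord sketch of a pluck-then-fetch variant, assuming a hypothetical `AwsEmail` model backed by the `aws_emails` table above:

```ruby
# Sketch only (hypothetical AwsEmail model): deferred lookup in the spirit of
# the linked articles. The first query pages over the narrow id index; the
# second fetches the full rows only for the ids on the requested page.
page_ids = AwsEmail
           .where(event_type: "Delivery")
           .where("created_at >= ?", "2024-05-01 04:00:00")
           .order(id: :desc)
           .offset(14_550)
           .limit(3)
           .pluck(:id)

AwsEmail.where(id: page_ids).order(id: :desc)
```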
@jules-w2 great references!
I only read them quickly, skipping around, so I may have missed something, but your specific case seems to be different from (and more extreme than) the usual, known progressive performance hit on large tables.
You have a huge difference only on the last page, and it doesn't seem to be related to the OFFSET as much as to the LIMIT.
The charts in the first link you posted show very much a constant increase, related to the need to skip over a huge number of rows (OFFSET), not a huge spike on the very last page, which in your case can be totally removed with the right LIMIT.
Did I miss something?
BTW, could you check the performance by following https://ddnexus.github.io/pagy/docs/how-to/#consider-the-arel-extra-and-or-the-fast-page-gem ?
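If it helps, the fast_page part of that how-to would presumably reduce to an override along these lines (hedged: `.fast_page` is the relation method the planetscale/fast_page README advertises; not verified against this setup):

```ruby
# Sketch only: the usual items-based override, with fast_page applying its
# deferred-join optimization on top (assumes the fast_page gem is installed).
def pagy_get_items(collection, pagy)
  collection.offset(pagy.offset).limit(pagy.items).fast_page
end
```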
@jules-w2 please could you try the query on the last page with the usual OFFSET but without the LIMIT?
And just a confirmation: are you running the query in the MySQL console?
@ddnexus
The problem when looking for [OFFSET+LIMIT] rows is that if [OFFSET + LIMIT] is greater than the total number of rows, MySQL will look for rows that do not exist (and we presumably run into MySQL's internal timeouts).
This is the typical case that occurs on the last page.
I ran the query in the MySQL console just to understand the problem. In practice I use ActiveRecord + Ransack + Pagy.
I will test the 2 recommendations and I will come back to you with the results
> MySQL console just to understand the problem

That is perfect, because it excludes any possible problem with the rest of the environment.

> I will test the 2 recommendations and I will come back to you with the results

They are documented together, but I think fast_page is probably the one that could make some difference; I am just guessing though. It would be wise to try one, the other, and both together (if possible).

> MySQL will look for rows that do not exist
A couple of things to unpack here:
I didn't find any description of an `(OFFSET+LIMIT) > COUNT` problem reaching "rows that do not exist" in any of your links. Did I miss something? They all describe just the usual performance hit of skipping over many rows, so I guess that explaining your case with "looking for rows that don't exist" is your own deduction (which seems reasonable, but is as yet unconfirmed).
It would be very "out of character" for a DB to be so dumb and so unaware of the count of a table that it gets stuck on the last few rows for 1 minute, searching for rows that don't exist. Usually a DB should be pretty resilient to whatever a query may ask for, especially about data that doesn't exist. I am starting to believe that it might be a DB bug.
Whatever the cause of the problem, at this point it seems quite established that it's external to ruby and pagy, so documenting any useful workaround should be more than enough for the pagy scope.
Description
Hi there, I noticed an issue that a very recent commit just introduced when using Pagy with MS SQL Server. Passing a `0` into `.limit()` does not work with MS SQL Server. It will give the following error: `The number of rows provided for a FETCH clause must be greater then zero`. This is because MS SQL Server does not support `FETCH NEXT 0 ROWS ONLY`.
Docs: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-order-by-clause-transact-sql?view=sql-server-ver16
Quote:
Here is a Rackup file, as requested, that demonstrates the issue. I can't spin up a SQL Server for you in this file, sorry, but you can launch one with this Docker command:
sudo docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Password.1" -p 1433:1433 --name sql1 --hostname sql1 -d mcr.microsoft.com/mssql/server:2022-latest
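For completeness, a hedged sketch of the connection settings such a rackup script would presumably use against that container (adapter name from the activerecord-sqlserver-adapter gem; host, port and password from the Docker command above; the `pagy_repro` database name appears earlier in the thread):

```ruby
# Sketch only: ActiveRecord connection matching the container started above.
require "active_record"

ActiveRecord::Base.establish_connection(
  adapter:  "sqlserver",    # activerecord-sqlserver-adapter
  host:     "localhost",
  port:     1433,
  username: "sa",
  password: "Password.1",   # MSSQL_SA_PASSWORD from the docker run command
  database: "pagy_repro"
)
```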
rails.ru:
I've tried my best to make this as easily reproducible as possible, but because of the nature of the issue this is the best I could come up with. Hope it works.