morlandi / django-ajax-datatable

A Django app which provides the integration of a Django project with the jQuery Javascript library DataTables.net
MIT License
204 stars 64 forks source link

Is it correct for get_initial_queryset getting invoked more than once? #124

Closed lmaloney closed 5 months ago

lmaloney commented 5 months ago

I have a table, with multiple filter options using the standard built in filter bar, and i'm trying to track down a performance issue.

I have overwritten get_initial_queryset, and during debug, I noticed that get_initial_queryset is getting called multiple times for the same table when i load or refresh the page.

get_initial_queryset is getting invoved ( not by. me) once for every single filter option/column i have in the table.

Is this expected behavior, or am i doing something wrong?

morlandi commented 5 months ago

I think this is to be expected.

We are delegating the data extraction entirely to the server, which must execute a query each time based on the value of the parameters received from the request. This is what the database does well.

Since pagination is used, with correctly set indexes this should never be a problem. It seems more like a database problem to me

lmaloney commented 5 months ago

Yes, I see.

Well, I assumed doing a single query for the data set would be enough, and I also assumed you would be doing a limit offset... but anyway.

PS: thank you for the response! :)

I am able to evade my performance issue now.

The query i was running, was getting the entire data set, of like 200K records. realistically, i don't need that much. So, i changed the app, to have scheme where the user gets assigned a set of records, so in this case, they get 1,000 records. If they need more, they can get more, but the point is, instead of the entire data set, they get a subset, they can realizistically utilize.

it's good enough for now, BUT.. I would encourage you to consider your pagination, or whatever... to make a single query for the dataset.... if you have control over that. The server indeed, can/should be breaking up the results in a paging schema, that matches the output of the dbase cursor. However, the CLIENT should only need to initiate ( get initial queryset) once

morlandi commented 5 months ago

Sorry Larry, I would not agree.

Why should I remember an entire table when the client is only asking for one hundred records (at a time)? Where should I store the initial "big" queryset waiting for further requests (if any)? and for how long?

You don't need all this. HTTP is a stateless protocol: the server does not maintain information on the requests made by the client nor on the context. It is the client that will then provide in subsequent requests all the information useful for identifying the data to be collected: the new page number, filter value, etc...

With an appropriate database design, response times will depend almost exclusively on the size of the resulting page, and very little on the size of the original source table.

I suspect that you are applying on order by on a column without indexes and that this causes a sequential scan over the entire table before producing the result.

If you are using Postgresql, you can verify the query's production plan using the EXPLAIN statement:

https://www.postgresql.org/docs/current/sql-explain.html