Performance issues with large databases

christoff-linde commented 7 months ago

Description

I am back :) After #28, I can now successfully connect to my DB. However, I have noticed some performance degradation when the target DB has a lot of data.

For context, I have a TimescaleDB database for storing IOT sensor data. Currently, the table has around 25 000 entries, and is only growing by the hour.

I saw in the tauri code, that the data is being retrieved by doing a SELECT *, which does make sense for smaller amounts of data. Now I do realise that this might not be the exact "target market" / use case for TableX (to view data in DBs with large amounts of data), but it would still be cool to use it.

Now I must confess, I'm not super well versed in Rust, so at this stage I don't have a proposed solution, apart from ideas:

Perhaps the get_rows data retrieval function can somehow be paginated? I am not sure if SQLx even provides a way to do this. Even so, I assume it would require a bit of a rethink on the data retrieval queries/functions.
This could be applied to the frontend as well, as the frontend seems to struggle with such large amounts of data. Perhaps some form of virtualisation in the rendering of the table rows could be a possible solution? Or at the very least, having paginated data being retrieved from Tauri could also help with initial performance concerns.

I am definitely keen to helping come up with solutions/doing research on possible solutions, so let me know :) Or, if this is out of scope of your vision for TableX, that is also (obviously) perfectly fine :)

For debugging purposes, here is my docker-compose.yml setup:

version: "3"
  db:
    container_name: pih-rs-db
    image: timescale/timescaledb:latest-pg16
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "5439:5432"
    restart: always
    volumes:
      - db:/var/lib/postgresql/data

volumes:
  db:
    driver: local

I have a (still very, very incomplete) Axum project that uses the docker setup mentioned above. If you want/need more information on the DB schemas & structure of the tables to help with creating/setting up a similar DB setup, you can check the initial migration file

kareemmahlees commented 7 months ago

Welcome back! 🥳

Thank you again for your descriptive and informative issues ⭐

I anticipated this issue because of ( as you have figured out ) the naive way of handling getting rows, so I was planning to handle this because I think of TableX handling all kinds of DB load ( i.e. big or small data ).

And you had the same thoughts and solutions that came up to me while I was thinking about it!

Let me break down how I think of it:

Frontend
- As you have mentioned, virtualizing the table view will greatly help with the performance, so I was planning to use Tanstack Virtual for this. ( because we already use the Query & Table so why not :) )
- I am also thinking of implementing Infinite Scrolling instead of pagination for now, we already use Tanstack Query which I think has some docs about Infinite Scrolling. This step though will require some changes in the backend, listed below.
Backend
- the get_rows function should be get_rows_paginated taking page as an input in order to work with Infinite Scrolling. This change will help us both in the short-term and long-term because in the future I am planning, hopefully, that we implement settings where the user can specify whether he wants pagination+ specifying the limit number of the page or Infinite Scrolling

This change will be quite some work so I might split it into multiple PRs or just smash it into a single one idk yet.

I will keep the issue updated, you are obviously more than welcome if you would like to help in any part or if you would like to continue playing around with TableX and reporting the bugs you encounter. 😄

christoff-linde commented 7 months ago

Hmmm, that all sounds like a solid plan.

I would be keen to try and help out with code if you decide to split up the work into different tasks/PRs. So I will keep an eye on this issue and the repo in general and then will let you know if I'm planning to pick anything up :)

Otherwise I'm happy to just help test stuff :)

christoff-linde commented 7 months ago

Something that could be cool could be to have similar behaviour to JetBrains DataGrip, where the results are paginated by default, and you can navigate through the different pages

kareemmahlees commented 7 months ago

Yoo! Sorry for the delay, been a little busy.

I implemented Pagination and Virtualization 🎉. Now it should feel smoother, I have pushed a minor release 🚀

christoff-linde commented 7 months ago

Yoo! Sorry for the delay, been a little busy.

I implemented Pagination and Virtualization 🎉. Now it should feel smoother, I have pushed a minor release 🚀

Insanely cool. It works flawlessly! Thanks for the awesome updates

kareemmahlees / tablex

Performance issues with large databases #33

Description