electric-sql / electric

Sync little subsets of your Postgres data into local apps and services.
https://electric-sql.com
Apache License 2.0

example: 100mb of analytics data in pglite #1534

Open KyleAMathews opened 3 months ago

KyleAMathews commented 3 months ago

narrative:

msfstef commented 2 months ago

Would this be like a table/graph frontend? How would the story go other than "useShape to sync data in, the rest is all SQL/Postgres in the client"?

KyleAMathews commented 2 months ago

Yeah, some sort of table/graph frontend where the user can change some options, which triggers a new client-side query/render. The goal is to show a) 100 MB is actually a reasonable amount of data to load and b) client-side interactivity is still really fast.

But the UI itself is whatever strikes your fancy.

msfstef commented 2 months ago

@KyleAMathews as previously discussed, should we morph this into more of a "live streaming HackerNews data using Electric and PGlite" example, rather than a plain one-time analytics workload?

KyleAMathews commented 2 months ago

I like the idea of something live streaming but the main point here is to show that the sync engine & pglite can easily load large amounts of data. Live streaming is nice but not the point.

An HN clone doesn't necessarily show that, but it could, e.g. if you include some charts about posts per month over the past 10 years, which would show off an aggregation query. So yeah, an aggregation query + full-text search could be fun (though perhaps creating the full-text index would be excessively slow?).

What do you think?
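
For illustration only, the "posts per month" aggregation mentioned above could look roughly like the sketch below. It is written as a plain TypeScript function so it runs on its own; in an actual PGlite-backed example the same grouping would more likely be a SQL query (for instance with Postgres's `date_trunc('month', ...)` and `GROUP BY`). The `Post` shape and the `postsPerMonth` name are hypothetical and not taken from the Electric codebase or any real HN dataset.

```typescript
// Hypothetical row shape; a real dataset would carry more columns.
interface Post {
  id: number;
  createdAt: string; // ISO 8601 timestamp, e.g. "2020-01-15T12:00:00Z"
}

// Count posts per calendar month (UTC), keyed by "YYYY-MM".
function postsPerMonth(posts: Post[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const post of posts) {
    // toISOString() normalizes to UTC; the first 7 characters are "YYYY-MM".
    const month = new Date(post.createdAt).toISOString().slice(0, 7);
    counts[month] = (counts[month] || 0) + 1;
  }
  return counts;
}

// Usage: feed the result to whatever chart component the example ends up using.
const sample: Post[] = [
  { id: 1, createdAt: "2020-01-15T12:00:00Z" },
  { id: 2, createdAt: "2020-01-20T08:30:00Z" },
  { id: 3, createdAt: "2020-02-01T00:00:00Z" },
];
console.log(postsPerMonth(sample));
```

Re-running a grouping like this whenever the user changes an option is the kind of client-side interactivity the example is meant to demonstrate.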

msfstef commented 2 months ago

Hmm if the main point is still the loading of large amounts of data then I think it's better to stick with the thing I've already been building (large fixed dataset) and do something with that - probably no point in reworking it for a different data source. We can think of different examples to highlight the live streaming analytics use case.

KyleAMathews commented 2 months ago

👍 yeah, good to keep examples focused

balegas commented 2 months ago

It's easy to find large datasets from HN and GH, which would allow us to run some nice queries on loads of data that interests the community. The problem with these datasets is that they are all a bit old, so there is a gap between the collected data and the data that we would be receiving live.

If we're looking into syncing large amounts of data, I'd still look into one very large dataset, on the order of GB (or the max the browser allows).

As discussed, what's special about analytics with Electric is the ability to continue getting live data for your analytics dashboard. I'll create a separate issue for that.

balegas commented 2 months ago

https://github.com/electric-sql/electric/issues/1745