nelsonic commented 1 year ago

We’ve gone long enough without creating lists. In order to unlock the next level of usefulness, we are going to need lists. This was briefly outlined in https://github.com/dwyl/app/issues/271 but I'm opening this issue for implementation in the MVP. i.e. this is the basic first implementation. 👌

list requirements:

[x] When a person first authenticates with the App, automatically create their list called “All”
- [x] ALL items are added to this list by default. ~~Should the setting to automatically add ALL items to the “Everything List” should be configurable?~~ ~~person can create a new list (e.g. “Backlog”) and use that the place where all new items are stored.~~
  
  Note: list creation interface is still TBD. #helpwanted
[x] lists should have the following fields:
- [x] id (Int auto increment PK) of the list used for lookups and routing but not displayed to the person
- [x] text (String) - the name of the list that will be displayed in the App
- [x] person_id (Int) of the person that created the list
- [x] status (Int) a numeric status applied to the list, see: https://github.com/dwyl/statuses
[x] A list can have an arbitrary number of items (zero or more) via the list_items lookup table.
[x] items can be removed from a list completely. Rather than adding an extra (status) field to the list_items table that will be null 99% of the time. We simply use the position field and a special value: 999999.999 AKA “nine nines” (a nod to the 99.9999999% High Availability But in our case we want number to be ridiculously high, 999999.999 rounds to a Million. We cannot use 99.9999999 because we expect many (most?) “Everything” lists to exceed 100.
[x] So to “remove” an item from a list, simply insert a new row into list_items with position=999999.999

Note: we do not expect any real lists to have a Million items and if they do we can revisit this. A list with a Million items would be totally unmanageable by a human on a screen … If we get to the point where lists are being used by machines, we will revisit this.

Ordering

[x] The order of the list_items is represented by the position field, as per: https://github.com/dwyl/mvp/issues/145#issuecomment-1492132575
[x] When the person performs a re-order operation this inserts a new row into the list_items table thus preserving the full history of the list, This will ensure that people can replay changes made to a list e.g. if they are working in a remote team and want to see what changes were made by a team member.

nelsonic commented 1 year ago

MVP ERD before lists:

Without the migrations and versions noise:

With lists and list_items

nelsonic commented 1 year ago

Need a way of defining the current active list ... 📝

nelsonic commented 1 year ago

Default lists:

really wish pgweb had navigation; the http://localhost:8081/# URL never changes no matter what you're viewing. 🙄 Feature Request Reported: https://github.com/sosedoff/pgweb/issues/625#issuecomment-1667752348 🤞

@ndrean hopefully this answers your question https://github.com/dwyl/learn-postgresql/issues/94#issuecomment-1665989044 😉 But if you've found another Open Source + Fast startup time + Easy to extend (i.e. not React.js) let us know! 🙏

ndrean commented 1 year ago

@nelsonic You can run livebook server ? and run Postgrex queries ? (if Postgres server is running and open on 5432)

Mix.install([
  {:kino_db, "~> 0.2.1"},
  {:postgrex, "~> 0.16.3"}
])

opts = [
  hostname: "localhost",
  port: 5432,
  username: "postgres",
  password: System.fetch_env!("LB_PASSWORD"),
  database: "my_db_dev"
]

{:ok, conn} = Kino.start_child({Postgrex, options})
Postgrex.query!(conn, "select * from users;", [])

nelsonic commented 1 year ago

Sadly/frustratingly, I've been wrestling with Postgres for the past day trying to write a fairly basic query:

SELECT id
FROM list_items
WHERE person_id=2
GROUP BY list_id, item_id

`ERROR: pq: column "list_items.id" must appear in the GROUP BY clause or be used in an aggregate function`

This is the simplest GROUP BY query I could think of:

SELECT id, list_id, item_id
FROM list_items
GROUP BY (list_id, item_id)

Still the same error. 😢

So I read: https://learnsql.com/blog/must-appear-in-group-by-clause/

And tried:

SELECT list_id, item_id, COUNT(*)
FROM list_items
GROUP BY list_id, item_id
ORDER BY count DESC

Which works:

But the moment I try to do anything more advanced e.g:

SELECT list_id, item_id, COUNT(*), position
FROM list_items
GROUP BY list_id, item_id
ORDER BY count DESC

So that I can view the position of in the list I'm back to the same error:

I feel like I'm fighting against a fundamental flaw in Postgres that doesn't allow me to write the query I want. 😢

nelsonic commented 1 year ago

After reading: https://stackoverflow.com/questions/19601948/must-appear-in-the-group-by-clause-or-be-used-in-an-aggregate-function

Which gives the example:

SELECT DISTINCT ON (cname) 
    cname, wmname, avg
FROM 
    makerar 
ORDER BY 
    cname, avg DESC ;

I tried:

SELECT DISTINCT ON (list_id, item_id)
  list_id, item_id, position
FROM list_items

Which works:

But the moment I try to ORDER BY a column e.g: position I'm back to a similar error:

`ERROR: pq: SELECT DISTINCT ON expressions must match initial ORDER BY expressions`

This is a pretty fundamental limitation and doesn't make any sense to me. 😢

I can write a Map |> Reduce in Elixir to achieve the result I want but I really wanted to do it in Postgres!

Googled for: https://www.google.com/search?q=postgres+group+by+two+columns+sort+by+a+third Read: https://stackoverflow.com/questions/70670433/postgresql-group-by-multiple-columns-and-sort-within-each-group 👀 And learned about the LAST_VALUE() Window Function https://www.postgresqltutorial.com/postgresql-window-function/postgresql-last_value-function/

This almost does what I want:

SELECT DISTINCT item_id, list_id, person_id,
  LAST_VALUE(position) OVER(
    PARTITION BY position
    ORDER BY position ASC
  ) pos,
  position
FROM list_items
WHERE person_id=2
ORDER BY pos ASC

dwyl-mvp-duplicate_list_items

There are still duplicates in the results even though I asked for SELECT DISTINCT item_id, list_id, person_id, ... It's the same result if we surround the DISTINCT line with brackets: DISTINCT (item_id, list_id, person_id):

SELECT DISTINCT (item_id, list_id, person_id),
  LAST_VALUE(position) OVER(
    PARTITION BY position
    ORDER BY position ASC
  ) pos,
  position,
  id
FROM list_items
WHERE person_id=2
ORDER BY id DESC

dwyl-mvp-duplicate_list_items-ORDER_BY-id-DESC

It's clear that the DISTINCT query is not doing anything here. So I'm going to remove it.

Along the way I used https://www.eversql.com/sql-syntax-check-validator/ to validate the query I was writing:

nelsonic commented 1 year ago

Ok. after going down an annoying rabbit hole in Postgres I've realised that I can't perform the full aggregation query in SQL because I still have to accumulate all the timers ... 🙄

https://github.com/dwyl/mvp/blob/4a04f8671edf329fd24cc87458a5ab86a110f1e0/lib/app/item.ex#L333

So I'm just going to write the sorting algo in Elixir and add it to the pipeline. 🧑‍💻 💭

nelsonic commented 1 year ago

Going to pick this up tomorrow morning when I have a fresh head. Way too many distractions & noise around me right now.

panoramix360 commented 1 year ago

hey @nelsonic, I saw the new DB update you are modeling.

Can you share what you want the Query to return? I can try to help, I have some experience with PostgreSQL.

Reading your messages I didn't get the full requirement for the query.

nelsonic commented 1 year ago

For context, the items_with_timers/1 function and related SQL query currently looks like this:

https://github.com/dwyl/mvp/blob/4a04f8671edf329fd24cc87458a5ab86a110f1e0/lib/app/item.ex#L205-L211

Running this query on a working version of the MVP on localhost that has a few timers on the items Results in the following table:

i.e. there are rows that look like duplicates but are in fact representing the multiple distinct timers. Then the accumulate_item_timers/1 function will consolidate the rows and sum the timers: https://github.com/dwyl/mvp/blob/4a04f8671edf329fd24cc87458a5ab86a110f1e0/lib/app/item.ex#L333C7-L376

What we want, in order to start using lists, is to modify the items_with_timers/1 SQL query to include a JOIN through list_items. But since we don't yet have the Interface Definition (AKA "UI/UX") for how to manage multiple lists, We're just fetching "All items" and displaying them in a single view.

I've got this figured out. I just don't get large blocks of time get my work done each day. ⏳ 😞

Getting into this now. 🧑‍💻

nelsonic commented 1 year ago

This is the SQL query I'm testing with:

 SELECT i.id, i.text, i.status, i.person_id, 
   t.start, t.stop, t.id as timer_id, 
   li.position as position, li.list_id
 FROM items i 
 FULL JOIN timers AS t ON t.item_id = i.id 
 JOIN list_items AS li ON li.item_id = i.id 
 WHERE i.person_id = 2 
 AND i.status IS NOT NULL 
 AND li.position != 999999.999
 ORDER BY li.position ASC;

Busy writing tests, docs and code to use this in lists and reordering #145

nelsonic commented 1 year ago

After the changes I've made I have 6 failing tests. 💔 Busy fixing them now to ensure that my update to the items_with_timers/1 query has not "broken" any existing functionality. 🧑‍💻 🤞

nelsonic commented 1 year ago

For complete clarity: the tests that are failing relate to Drag-and-Drop events added in #345 🐉 Not any of the core functionality. 👌 Don't worry. 😉

nelsonic commented 1 year ago

Minor update: list.text to list.name which IMO is clearer:

nelsonic commented 1 year ago

The list_items schema/table now has just fields:

list_id - the id of the list
person_id - the id of the person who updated the list_items record
seq - the sequence of item_ids e.g: "1, 2, 3, 42, 31, 26, 59" etc.

We use this seq will be used to find the items using a WHERE i.id IN ("1, 2, 3, 42, 31, 26, 59") query. I'm banking on the fact that this query will maintain the order. But if it doesn't we can easily just return the list.seq and perform the sorting on the client. 👌

Ref: https://www.reddit.com/r/PostgreSQL/comments/mlen8d/can_i_have_array_of_ids_in_where_statement/

nelsonic commented 1 year ago

ERD with item.cid before lists: