dyedgreen / deno-sqlite

Deno SQLite module
https://deno.land/x/sqlite
MIT License
409 stars 36 forks source link

Provide example for getting array of objects from SELECT #71

Closed halvardssm closed 4 years ago

halvardssm commented 4 years ago

Hi! I am trying to use this library for a side project, and I am having some problems using a select query and map column names to the rows so that the result is an array of objects. Could you provide an example for this? 😊

dyedgreen commented 4 years ago

Hey, thanks for opening an issue! I had a quick go at it and this is what I came up with:

import { DB } from "https://deno.land/x/sqlite/mod.ts";

const db = new DB();

db.query("CREATE TABLE people (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, surname TEXT, email TEXT)");
db.query("INSERT INTO people (name, surname, email) VALUES (?, ?, ?)", ["Peter", "Parker", "peter.parker@deno.land"]);
db.query("INSERT INTO people (name, surname, email) VALUES (?, ?, ?)", ["Clark", "Kent", "clark.kent@deno.land"]);

const q = db.query("SELECT * from people");
const cols = q.columns();
const rows = [...q].map(row => {
  const res: any = {};
  for (let i = 0; i < row.length; i ++)
    res[cols[i].name] = row[i];
  return res;
});

console.log(rows);

I would like to add something like this to the examples in the docs, but I feel like there should be a slightly more elegant way to do this, before putting this into the notes.

halvardssm commented 4 years ago

@dyedgreen Thanks for getting back to me! I realized what was the issue. When the database is empty, the q.columns() throws SqliteError: Unable to retrieve column names as transaction is finalized., and the same if const rows = [...q] is done before const cols = q.columns(); (which makes sense, but is necessary to check if there are any results to avoid the throw). Could we add a default to this? Something like columns() always works even if the database is empty or there are no results?

dyedgreen commented 4 years ago

I see. I'm not sure what would be a good way to proceed here, what would you expect .colums() to return? If you want to safely do this operation, you could modify the code above like this:

import { DB, Empty } from "./mod.ts";

const db = new DB();

db.query("CREATE TABLE people (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, surname TEXT, email TEXT)");
db.query("INSERT INTO people (name, surname, email) VALUES (?, ?, ?)", ["Peter", "Parker", "peter.parker@deno.land"]);
db.query("INSERT INTO people (name, surname, email) VALUES (?, ?, ?)", ["Clark", "Kent", "clark.kent@deno.land"]);

db.query("CREATE TABLE empty (id INTEGER PRIMARY KEY AUTOINCREMENT)");

interface Named {
  name: string;
}

interface Query extends Iterator<any[]> {
  columns(): Named[];
}

class RowIter {
  private query: Iterator<any[]>;
  private cols: Named[];

  constructor(query: Query) {
    this.query = query;
    this.cols = [];
    if (query !== Empty)
      this.cols = query.columns();
  }

  next(): IteratorResult<any> {
    const { done, value } = this.query.next();
    if (done)
      return { done: true, value: undefined };
    const result: any = {};
    for (let i = 0; i < value.length; i ++) {
      result[this.cols[i].name] = value[i];
    }
    return { done: false, value: result };
  }

  [Symbol.iterator]() {
    return this;
  }
}

console.log([...new RowIter(db.query("SELECT * from people"))]);
console.log([...new RowIter(db.query("SELECT * from empty"))]);

I feel like this sort of thing may make sense as part of a small utility module (the above is a bit of a pain to write yourself)? Maybe we could add a utils.ts package, which provides a few nice tiny wrappers like this. (perhaps initially with weaker stability guarantees)

halvardssm commented 4 years ago

I would expect columns() to return the requested columns even if the result is empty. I was just wondering, is there a reason for why we have Empty?

A utils.ts would certainly be helpful, though I would argue that to get the result as an array of objects should be integrated into Rows like they do in deno-postgres:

  ...
  const result = await client.query("SELECT * FROM people;");
  console.log(result.rowsOfObjects()); // [{...},...]
  ...

What would you prefer, have a separate utils file, or integrate it into Rows as a method?

dyedgreen commented 4 years ago

I was just wondering, is there a reason for why we have Empty?

That's a good question; basically, the main problem is that we can't hook directly into the JS garbage collection (i.e. there is no way to observe a garbage collection on an Rows object). But at the same time, I want to support large tables, which means we return iterators over rows (so that there is no need to keep all rows in memory simultaneously, and the cost of running a large select is not payed up-front, but incrementally, with the option to cancel using rows.done()).

This means that we need to give the user a small about of memory management responsibility: They need to call .done() on any query which is not fully used e.g. if I don't fully iterate a row, I need to call done so do something like:

const rows = db.query("SELECT id, name FROM people");
for (const [id, name] of rows) {
  if (name === someComputedValue(id)) {
    doSomething(id, name);
    rows.done(); // NOT break
  }
}

Most of the time, the user does not care/ need to do this though. In those cases they should not need to worry about this detail. To make this possible, when you run a query, the statements reference you used is automatically finalized for you. But this also means that we can no longer run e.g. the columns() introspection.

Currently, calling columns() performs this introspection upon request, to achieve the behavior you want, we'd need to perform this introspection for every query that is run, which is potentially expensive.

Generally there are two cases where this is interesting:

  1. You want to transform the returned rows in some way: in this case you need to include the empty check
  2. You want to perform introspection on the database: in this case you should use PRAGMA table_info("people")

A compromise would be to return an empty array when the statement is finalized, but I feel like this may lead to confusion when someone tries to use this facility to introspect an empty select statement. What would you think of this option?

dyedgreen commented 4 years ago

Regarding utils.ts vs having a .asObjects() or similar call, what would be your use-case? I do agree, that it's nicer to include the api as a first-party method call on the rows object, but doing the introspection and constructing objects is quite expensive.

That doesn't mean it shouldn't exist at all though, but I would be curios what your use-case is, and if there are any reasons you are not happy with using something like

for (const [name, email] of db.query("SELECT name, email FROM people")) {
  doSomething(name, email);
  // or ...
  doSomethingElse({name, email});
}
halvardssm commented 4 years ago

I see, thank you for clearing this up for me, and after hearing the reason I agree with the way the query works and how you implemented it. The only thing I wonder about, is this introspection really that expensive? I can't imagine that the user will notice this introspection unless they are sending a large number of queries in a very short amount of time. For the general (though I might be biased), I don't think this will have any impact, but please correct me if I am wrong (I am not criticizing, I simply want to help with this module).

I think how you have done the iterator is perfectly fine as it returns exactly what it is supposed to, and I agree that to return an empty array might lead to confusion, so I would just keep it as is. My opinion is that we should do the introspection on each request, and provide an option that could be added as a third or integrated with the second argument, which allows the user to disable the introspection (or doing the reverse and have an option to enable introspection).

The use-case would be an API where the database is queried on each call, e.g. a GET request to list all books. In this case I would want to return the information without having to transform the result and could simply do response.body = [...db.query("SELECT * FROM books").toObjectArray()] which simply places the result in the body and it would result in something like [{title: "book1", ...},...]. This way it will be up to me as the developer to decide if I want the expensive transformation, and if I want it to be speed optimized, I can omit .toObjectArray() which will not transform the result and I can use the example you showed me.

For more complicated API's where the user can request a variable amount of columns of a larger table and include filtering etc, it is not scaleable to do something like for (const [name, email] of db.query(...)) as you can not determine upfront what will be requested.

So to sum it up, I think we should include a method call in Rows to transform the request to an array of objects, and then let the user decide whether or not to use it. An option object should also be introduced in the query method and should let the user disable (or enable) the introspection for each call. I think by introducing these two features, the problem will be solved, and we don't have to make any compromises.

What do you think?

dyedgreen commented 4 years ago

Yeah, I think that makes sense. Adding an asObjects or similarly named function makes the most sense to me I think.

I don't want to run the columns introspection on every call by default, since it requires copying strings back and forth between Js/ C lands, which is an expensive operation, especially for someone who e.g. just wanted to write a few numbers.

I'll look into building an api like this when I have time this week (most likely towards the weekend), feel free to also submit a PR if you feel like building it yourself 😄

halvardssm commented 4 years ago

I'll see if I can submit a PR with the asObject method, should have some time at some point this week 😊