surrealdb / surrealdb.rs

SurrealDB driver for Rust
https://surrealdb.com
Apache License 2.0
64 stars 7 forks source link

Add `QueryResponse::get_key` function #15

Closed Aelto closed 1 year ago

Aelto commented 1 year ago

What is the motivation?

The current implementation of the QueryResponse type lacks the functions to retrieve a specific key from the selected rows without de-serializing the whole objects (which may represent a serious performance cost) or accessing the inner values manually (which isn't convenient)

Example scenario 1. We would like to get some user IDs:

let response = client.query("select id from User").await?;
let ids: Vec<String> = response.get(0, ..)?;

but this will result in an error: value: Error { kind: Deserialization, message: "invalid type: map, expected a string" }

Example scenario 2. We would like to get nodes from an edge:

let response = client.query("select ->write->Article as written_articles from User").await?;
let articles: Vec<Article> = response.get(0, 0)?;

but this will result in an error similar to the scenario 1.

What does this change do?

This PR adds the response.get_key(0, "id", ..)? method so one can retrieve one specific field from the selected rows.

let response = client.query("select ->write->Article as written_articles from User").await?;
let articles: Vec<Article> = response.get_key(0, "written_articles", 0)?;

I have decided to put the key parameter in second before the range in get_key(query, key, range) to represent the "get me the first query and its field key from the given range" formula like "select key from range". This can be changed obviously if you don't like it!

What is your testing strategy?

Unit tests were added to confirm the behaviour of the get_key function. This was done by first testing the original get function, and then adding two more tests for the get_key function itself.

_Note: I did not know where to place the tests so at the moment only the native-ws part has them. I also needed specific data for the tests but i feared breaking previous tests by inserting them, i then added a sandboxed_client function that returns a client with a unique ns/db pair so i'm sure that data is only accessible by these tests._

Is this related to any issues?

No.

rushmorem commented 1 year ago

@Aelto You may close this PR now. I have incorporated your idea into https://github.com/surrealdb/surrealdb/pull/1514. I hope you don't mind, but I took this opportunity to revamp, not only this part of the API but error handling as well. These were things already on my TODO list.

Now QueryResponse only has take, check and len.

let article: Option<Article> = response.get(0, 0)?;

becomes

let article: Option<Article> = response.take(0)?;

let articles: Vec<Article> = response.get(0, ..)?;

becomes

let articles: Vec<Article> = response.take(0)?;

let article: Option<Article> = response.get_key(0, "written_article", 0)?;

becomes

let article: Option<Article> = response.take("written_article")?;

let articles: Vec<Article> = response.get_key(0, "written_articles", ..)?;

becomes

let articles: Vec<Article> = response.take("written_articles")?;

etc.

response.take("written_articles")

is syntax sugar for

response.take((0, "written_articles"))
rushmorem commented 1 year ago

As always, thank you for the PR.

Aelto commented 1 year ago

@rushmorem sure i'll close it! Though it is interesting it became a take kind of function that mutates the inner vec rather that working over references. It means we can't retrieve a specific index from a response twice (for example the first and last element of a query) without deserializing the whole vec as calling twice: response.take(0) yields two different results now. Looking at it the owning of values is there just for the error handling in the from_value function too, one could argue cloning for errors is okay if it ensures working over references and not mutating things.

rushmorem commented 1 year ago

Hi @Aelto,

It means we can't retrieve a specific index from a response twice

Yes you can't retrieve the same index twice. Say you want the record(s) on index 0, calling response.take(0) the first time will return the record(s) of that query, if any. Calling it the second time will return None or an empty Vec depending on whether you are deserializing into an Option or a Vec. But judging by your next statement, I don't think this is what you meant.

for example the first and last element of a query

This is totally possible. The indices are stable. Taking one index doesn't affect the numbering of other indexes. If you have 4 queries, for example, you can take(0) and then take(3) just fine in any order you want. Stable indices were one of my design requirements.

without deserializing the whole vec as calling twice

This is not the case. Also, QueryResponse doesn't store a vec anymore. It now stores a hash map instead. When I first implemented this, I left it as a vec and then made it so taking the same index for the second time would return an error as the value at that index would have already been taken. I changed it to a hash map so I could remove from the map without affecting the indexes. I did this so we wouldn't pay a price in memory usage by having 2 representations of the same object in memory at the same time.

As you may have guessed by now, reducing memory usage is one of the reasons I switched from get to take. The other reason is that once Serializer and Deserializer are implemented, from_value will likely need an owned Value like its serde_json counterpart. If/when that happens, I would like to be able to get an owned object without calling clone on the reference. A mutable reference allows us to do that.