surrealdb / surrealdb.go

SurrealDB SDK for Golang
https://surrealdb.com
Apache License 2.0
231 stars 62 forks source link

Add SmartUnmarshalAll #79

Closed rytswd closed 12 months ago

rytswd commented 1 year ago

Fixes #78

Just to clarify -- I didn't follow the existing code of SmartUnmarshal on purpose, as it had so much if nesting and I couldn't really make sense of it... perhaps just because I'm not too versed with the actual data model from SurrealDB response. In this PR, I have tried to stick to idiomatic Go as much as I can. I'm open to updating the code for better clarity / performance πŸ‘

rytswd commented 1 year ago

I wrote the logic before realising that I could do SmartUnmarshal[[]User] -- perhaps that may be all I need without this explicit handling 🫠

EDIT:

Actually, I had some logic change in my code. The following code produces different output between surrealdb.SmartUnmarshal[[]User] and surrealdb.SmartUnmarshalAll[User]

    data, err = db.Query(`
        CREATE user:j SET name = "J";
        CREATE user:j SET name = "J"; // conflict
        CREATE user SET name = "John";
        `, nil)
    if err != nil {
        t.Fatal(err) // Not hit
    }
    y, err := surrealdb.SmartUnmarshal[[]User](data, nil)
    t.Logf("old data: %+v", y)   // old data: [{Username: Password: ID:user:j Name:J}]
    t.Logf("old err:  %+v", err) // old err:  <nil>

    x, err := surrealdb.SmartUnmarshalAll[User](data)
    t.Logf("new data: %+v", x)   // new data: [{Username: Password: ID:user:j Name:J} {Username: Password: ID:user:3owq0tv8kn0aiz1rc6od Name:John}]
    t.Logf("new err:  %+v", err) // new err: ERR: Database record `user:j` already exists
ElecTwix commented 1 year ago

Actually, I had some logic change in my code. The following code produces different output between surrealdb.SmartUnmarshal[[]User] and surrealdb.SmartUnmarshalAll[User]

edit: I reproduced the issue yeah this is a bug we expected to return one RawQuery but it returns as an array when bulk inserts I think there is also another PR needed to fix this.

    data, err = db.Query(`
        CREATE user:j SET name = "J";
        CREATE user:j SET name = "J"; // conflict
        CREATE user SET name = "John";
        `, nil)
y, err := surrealdb.SmartUnmarshal[[]User](data, nil)

SmartUnmarshal has been designed to use like tuples you can use like this too but you need to do this:

    data, err = db.Query(`
        CREATE user:j SET name = "J";
        CREATE user:j SET name = "J"; // conflict
        CREATE user SET name = "John";
        `, nil)
users, err := surrealdb.SmartUnmarshal[[]User](data, err)

I wrote the logic before realizing that I could do SmartUnmarshal[[]User] -- perhaps that may be all I need without this explicit handling 🫠

Thanks for your contribution. I had worked on SmartUnmarshal/SmartMarshal before Far as I see it has a similar way to solve the problem but I think a smaller func will be better still we can talk about this I'm just giving my opinion about it.

also if I skip or overlooked something please let me know thanks :)

rytswd commented 1 year ago

@ElecTwix Sorry for getting back to this late! There are several points I tackled in my code, and it's certainly worth clarifying all the points πŸ‘

Getting data from database

You seemed to have created a separate PR for defaulting to use array return like my implementation, and I personally think this makes more sense. It is far more likely to query bulk of data rather than one, and even if you specifically need a single data, it's simple to get the first item from the slice. On the other hand, if the return value is a single T, it may not be obvious to use []T as the type parameter.

Partial dataset is still useful

Because you can do multiple actions in a single Query, it would be beneficial to see what actually took place in the database. For example:

data, err := s.db.Query(`
        CREATE user_dummy:xxxx SET Username = "x", Password = "xxxx";
        CREATE user_dummy:y    SET Username = "y", Password = "y";
        CREATE user_dummy:xxxx SET Username = "x", Password = "INVALID"; // NOTE: Conflicting ID.
        `, nil)

(This is taken from the actual test code I added in this PR.)

The above is a perfectly valid way to interact with SurrealDB. When an error occurred, because this is not a transactional write, some data may be written, and some may have failed. I want to be able to see both data and err. The s.db.Query actually seems to only return err when some connection error happens, and in that case, I would want to retry if it's a transient issue for instance. If the error is about the SurrealQL, it's very possible that some data gets written correctly, and some not. I would want to unravel all the details with SmartUnmarshal[T], even if that means I may also get some partial dataset.

Also, there could be some cases where the return data is not conformant to the data type. For instance, when the schema assertion for some specific field is added after a data is created, such as:

CREATE user:x SET age = "one";
DEFINE FIELD age ON TABLE user TYPE number;
CREATE user:y SET age = 1;

Both entries user:x and user:y are valid in the database, but one is following the schema, and the other could be created before the field definition is defined. There are many scenarios like this to succeed receiving only a part of the data, and I'd like a clear and concise way to interact with SurrealDB, while giving me all the information I can gather, even partial dataset.

Other minor (but important) points

Taking err as a part of function looks odd

You mentioned about the tuple like argument -- I'm sorry but I found it extremely confusing to see the form of the following

result, err := surrealdb.SmartUnmarshal[T](s.db.Query("SELECT * FROM user"))

This may be a nice way of wrapping any error, but I want to see the error as soon as it occurred. When the function returns an error, I want to do if err != nil check, so that I can take some remediation actions against it. With the existing syntax, you could do the err check, but then you would be running a function like this

data, err := s.db.Query("SELECT * FROM user")
if err != nil {
    // some err handling
}
result, err := surrealdb.SmartUnmarshal[T](data, nil)

The second parameter nil seemingly comes from nowhere, and I'd have to actually go into the SDK implementation for what it does. It just doesn't look to be idiomatic Go code to me... πŸ€” While it is arguable what the idiomatic Go code should look like with the rise of generics, it was certainly baffling for me.

Function broken up to smaller chunks for easier review

Because of generics and complex data structure returned from SurrealDB, the code to make the SmartUnmarshal was quite difficult to wrap my head around. It may be just me, but I made smaller functions to make it easier to debug, and explicit about how data could have nested structure.

While understanding SurrealDB concepts and the actual data returned from the server is important, I would find it's beneficial to have a clear code which is easy to digest, which works as a live documentation.


Because what SmartUnmarshal does already and what it aims to do is quite involved, there are several points to be covered in this discussion. I'd be more than happy to discuss this more over a video call of some sort if that is easier ☺️

ElecTwix commented 1 year ago

Thanks for your detailed info about the issues you face and what the PR meant to achieve. I agree with most of the points you make the tuples are meant to make one-line code that can return the type that wanted but your point also makes sense for readability and

I believe both PRs need to stay in the code base for freedom of choice to use also, I issue you mention in #78 some of the issues you face with #90. still, I believe this PR needs the merged and eventually become one func for use

but also I am so concerned about performance outputs like you are using elements one by one it can be beneficial for readability but if are too big it can bottleneck.

rytswd commented 1 year ago

Thanks for the review, @ElecTwix!

I believe both PRs need to stay in the code base for freedom of choice to use

Thanks, I appreciate your understanding! πŸ₯° However, I may need to counter-argue for having multiple functions for doing the same thing. As SurrealDB itself is very much focused on DX, it would make sense to bring the same focus to SDKs as well. While the freedom of choice should be provided by ensuring code clarity and easy extensibility, it also makes sense to show what's considered "best practice" from the SurrealDB team. When there are several ways to do one thing, it can lead to unnecessary confusion.

I think we are both tackling the same problem in the existing code, and taking a slightly different approach right now. While I can understand both functions to live together for a short while, before the release is cut, it would be good to iron out and have a least amount of public facing functions. That may mean we could have two or more implementations in place, but the fewer the better. I can absolutely understand if this PR is to be dropped because of that, for the better DX of the future community πŸ‘

but also I am so concerned about performance outputs like you are using elements one by one it can be beneficial for readability but if are too big it can bottleneck.

This is surely a valid concern, and I haven't done too much investigation into this. We should certainly take more benchmark testing and profiling into account. My main focus was to provide an easy interface first, and iterate over to get the right performance. I'm completely open to suggestions, though ☺️

rytswd commented 1 year ago

@ElecTwix

I see your refactor change has got merged, and it closed this PR rather than associated issue of #78. Was this intentional? The use of err as a part of the argument is still in the new code, and it would be great if you could clarify what the way forward should look like πŸ™

ElecTwix commented 1 year ago

Hey @rytswd, I made a mistake by indicating "fix #79" instead of "fix #78". I will contact @timpratim to reopen the PR. I apologize for any confusion caused.

phughk commented 1 year ago

Fixed now @rytswd @ElecTwix thanks!

timpratim commented 12 months ago

Hi @rytswd can you please resolve the merge conflicts?

rytswd commented 12 months ago

Certainly, I'll do that in a bit!

rytswd commented 12 months ago

@timpratim @ElecTwix

Sorry, rebasing this is actually not that straightforward. There seems to be a massive design change following #90 - it seems the marshal logic is now in marshal.go, in a separate marshal package?

That seems really strange as a user point of view, where I'd have to import "marshal" package and write marshal.SmartUnmarshal[testUser](s.db.Select("users:notexists")). When I write in Go, my mental processing will go as follows:

There may be some rare occasions where you'd only interact with database without any marshalling / unmarshalling, but in most interactions would require either of the two, or even both. It is quite unintuitive IMHO how I'd have to pull in a dedicated "marshal" package for any database interaction.

I can take down this PR and create a fresh PR to suggest the update. I think the discussion point is drifting towards more of what SDK should expose to users rather than smart unmarshal behaviour.

ElecTwix commented 12 months ago

@rytswd

Sorry, rebasing this is actually not that straightforward. There seems to be a massive design change following #90 - it seems the marshal logic is now in marshal.go, in a separate marshal package?

Yeah, that's on me, for project structure it made more sense but I broke your PR in the process I'm sorry if you want I also could help to migrate your fresh PR.

That seems really strange as a user point of view, where I'd have to import "marshal" package and write marshal.SmartUnmarshal[testUser](s.db.Select("users:notexists")). When I write in Go, my mental processing will go as follows:

I understand your point but moved because of the project structure also if people want to marshal themselves self they also could and I don't think we should force it.

There may be some rare occasions where you'd only interact with database without any marshalling / unmarshalling, but in most interactions would require either of the two, or even both. It is quite unintuitive IMHO how I'd have to pull in a dedicated "marshal" package for any database interaction.

I agree that they are needed most but still modules will give code more clarity when the project grows.

I can take down this PR and create a fresh PR to suggest the update. I think the discussion point is drifting towards more of what SDK should expose to users rather than smart unmarshal behaviour.

I agree about this especially about the RPC module if we could expose it can be useful for users for their own marshaling or live queries

rytswd commented 12 months ago

@ElecTwix

Thank you for the input, but I'm not sure if I understand your intention clearly enough πŸ€”

I understand your point but moved because of the project structure also if people want to marshal themselves self they also could and I don't think we should force it.

I agree that they are needed most but still modules will give code more clarity when the project grows.

You mention these points, and the module / package structure need to be designed to accommodate growing requirements and ensure we would have limited breaking changes. But I may have to argue that I'm seeing the opposite with the recent changes. The new "marshal" package introduces breaking changes to the existing APIs, and I would personally assume the basic database interaction such as marshalling / unmarshalling would be provided by default -- asking users to write their own is quite strange to me.

I think I'm not seeing some rationale behind the breaking package changes, and somewhat unconventional Go coding style assumed for users. It may well be my ignorance and limited exposure with Go generics, but I'm quite puzzled anyhow.

Let me close this PR, and also write a separate issue when I have got the time. I haven't actually tested the new SmartUnmarshal logic introduced with #90 either, so I'll double check and update #78 accordingly. (Probably not this week but sometime in the near future.)