pocketbase / pocketbase

Open Source realtime backend in 1 file
https://pocketbase.io
MIT License
39.22k stars 1.81k forks source link

Nullable and non-nullable filelds #122

Closed ganigeorgiev closed 2 years ago

ganigeorgiev commented 2 years ago

Currently the null handling behavior is not very clear - with the exception of bool and multiple value fields, all other fields support storing and returning null, if the field is not marked as "Required".

We could improve and normalize this behavior by introducing a new "Nullable" boolean field option, allowing developers to specify whether they want null or zero-values in their api responses. In general, we want to support the following field return values:

Field Nullable response values Non-nullable response values
Text null, "", "test" "", "test"
Number null, 0, -1, 1,1.5 0, -1, 1, 1.5
Bool null, false, true false, true
Email null, "", "test@example.com" "", "test@example.com"
Url null, "", "https://example.com" "", "https://example.com"
Date null, "", "2022-01-01 00:00:00.000" "", "2022-01-01 00:00:00.000"
Select (single) null, "", "optionA" "", "optionA"
Select (multiple) null, ["optionA", "optionB"] [], ["optionA", "optionB"]
File (single) null, "12i...ZjL.png" "", "12i...ZjL.png"
File (multiple) null, ["12i...ZjL.png", "34i...ZjL.txt"] [], ["12i...ZjL.png", "34i...ZjL.txt"]
Relation (single) null, "74kmig0HEyR7uuD" "", "74kmig0HEyR7uuD"
Relation (multiple) null, ["74kmig0HEyR7uuD", "FtHAW9feB5rze7D"] [], ["74kmig0HEyR7uuD", "FtHAW9feB5rze7D"]
User (single) null, "74kmig0HEyR7uuD" "", "74kmig0HEyR7uuD"
User (multiple) null, ["74kmig0HEyR7uuD", "FtHAW9feB5rze7D"] [], ["74kmig0HEyR7uuD", "FtHAW9feB5rze7D"]
* JSON any json value any json value != null

Since PocketBase supports both application/json and multipart/form-data content-type requests, user will be allowed to send both null or the respective zero value and if the field is marked as "Non-Nullable", then we'll just cast the value to ensure that always the zero-value is saved and returned. For example, if we have a "Non-Nullable Number" field and the user sent null in the request, we'll cast it to 0. _Exception to this rule will be only the JSON field, because the JSON field doesn't have a clear non-null zero-value, and in this case the non-nullable field option will act as a validator and will prevent user to submit null as value._

To avoid introducing breaking changes in existing data and applications, we'll have to add a DB migration that will set "Nullable" totrue/false to each collection schema field in order to preserve the current behavior.

For newly added fields, the Nullable field option will be false by default.

We also need to update the Admin UI to support setting null as a value when creating a new record from the dashboard.

This feature is with high priority and it is planned to be implemented as part of v0.3.0 release.

Feedback and suggestions for different implemention/rules are welcomed.

Lex-2008 commented 2 years ago

I don't have a strong opinion in this case, but I agree with the discussions thread https://github.com/pocketbase/pocketbase/discussions/74 which mentioned that we, users, appreciate that pocketbase manages to remain both simple and powerful. I'm not sure that adding "nullable" field option in addition to the existing "required" option is the right step in this direction.

Regarding the table above - I might be mixing "nullable" with "required", but for cases of "Email", "URL", "Date", and single-option "Select", I don't see a good reason to differentiate between null and "". For example, if we consider a field of URL type, it can be either a valid URL (string starting with http), or (if the field not required) not set. That makes sense, but having a distinction between "URL not set" (null) and "URL is set but empty" ("")? I don't see a valid usecase for that.

But I might be a bit biased here, since I just got hit by an issue where a JSON-typed field (user-defined list of tags) can be either null or [] - both of them show as "N/A" in admin dashboard, but in Javascript only one can be detected with simple if(record.field) check (empty array considered "truthy" here, null is "falsy"), and another one - with simple if(record.field.length) check (null.length throws an exception here), so I have to write if(record.field && record.field.length) to check if a record actually has tags. But sorry, that was my morning rant :-)

ganigeorgiev commented 2 years ago

@Lex-2008 Thanks for the feedback.

Yesterday I started working on a poc to see how it will look and indeed this leads to some complications (both in the source and in the UI) that may not be worth.

The reason why I suggested adding Nullable option in the first place is because currently the behavior is mixed and changing it may break user land code if someone rely on the null value (currently all fields can be set to null with the exception of bool and array values).

But more that I think on it, maybe it will be better to just normalize the caster and introduce the small breaking change while we can and always return non-null value (with the exception of the JSON field). This would also simplify the conditional checks pointed by @Lex-2008 and in general it will be easier to integrate with languages that has sound null safety (eg. Dart).

I'll leave the issue open for a couple more days, to give chance other users to share their feedback, but for now I agree with @Lex-2008 that we may not need the null value and if no one disagree I'll proceed with normalizing the caster and always return the non-null value. The supported values table will look like this:

Field Supported value
Text "", "test"
Number 0, -1, 1, 1.5
Bool false, true
Email "", "test@example.com"
Url "", "https://example.com"
Date "", "2022-01-01 00:00:00.000"
Select (single) "", "optionA"
Select (multiple) [], ["optionA", "optionB"]
File (single) "", "12i...ZjL.png"
File (multiple) [], ["12i...ZjL.png", "34i...ZjL.txt"]
Relation (single) "", "74kmig0HEyR7uuD"
Relation (multiple) [], ["74kmig0HEyR7uuD", "FtHAW9feB5rze7D"]
User (single) "", "74kmig0HEyR7uuD"
User (multiple) [], ["74kmig0HEyR7uuD", "FtHAW9feB5rze7D"]
* JSON any json value
ganigeorgiev commented 2 years ago

Initial implementation was released in v0.3.0.

I think using only a non-nullable value will improve and simplify the overall dev experience. At the end we can always go back to have a Nullable option if we find that the current behavior is lacking.

For now, I'll close the issue, but I'm open for discussion so feel free to share your feedback.

yawaramin commented 1 year ago

I understand this has been incorporated for a while now, but I'd like to register a vote against the decision to use Go's concept of 'zero values' for SQL data. In SQL the null value is super important (despite what database purists may feel), e.g. we could have a table of metrics where some metrics are unknown and some are actually 0. If they're all set to 0, this will lead to quite difficult data reliability issues.

Ultimately as much as it makes life more difficult in some ways, in SQL we do need the concept 'unknown value'.

ganigeorgiev commented 1 year ago

@yawaramin I agree that null could have a meaning depending on the context, but it complicates a lot of other things:

It is just a lot easier to work with the fields if you know that they only have 1 type representation.

As mentioned in the above comments, depending on the demand we can eventually introduce a "Nullable" field option, but there are too many other tasks with higher priority and I'm already behind the schedule for my other open source project (Presentator #183), so for now this remains out of the scope.

The simplest workarounds to represent unknown value are:

yawaramin commented 1 year ago

Thanks for explaining. Let me try to address a couple of points here but first let me emphasize that I am just trying to address the technical points, not trying to get you to prioritize this issue. If it happens in the future, great. Meanwhile as you mentioned there are workarounds or maybe someone else will take it up as a challenge.

Anyway:

your api rules now need to check null in addition to the zero-value...makes it harder to integrate with client-side languages that are non-nullable by default

These kinds of languages typically have fantastic support for checking and converting null into type-safe values, e.g. 'Option' types which are checked by the compiler.

since everything is a string when sending multipart/form-data requests

I am guessing you mean application/x-www-form-urlencoded which is what HTML forms will typically submit? multipart/form-data is used when uploading a file, right?

we now have to handle "null" as special keyword

Alternatively clients could simply omit sending the value if it is null? E.g. in an HTML form you can do <input name="tempCelsius" disabled> where you could set the disabled attribute during the form submit event.

(similar normalizations are done for the json and it is not pretty)

JSON is defined as supporting null: https://www.rfc-editor.org/rfc/rfc8259.txt ('JSON can represent four primitive types (strings, numbers, booleans, and null) and two structured types (objects and arrays).')

But we could also use Go's encoding/json option omitempty to simply omit encoding nil values when encoding the model to JSON: https://pkg.go.dev/encoding/json#Marshal

That way on the client side the validation would check whether the field exists or not in the JSON object. This is typically super easy.

The issue with 'Nonempty' is that it's a Go implementation detail. The concept of 'nonempty' has very Go-specific semantics and people will repeatedly be surprised by its behaviour. In contrast, 'null' semantics are much more widely understood across languages and everybody more or less agrees on them.

ganigeorgiev commented 1 year ago

These kinds of languages typically have fantastic support for checking and converting null into type-safe values, e.g. 'Option' types which are checked by the compiler.

I'm not sure that I understand what do you mean. The collection API rules has nothing to do with the client-side languages. What I meant was that in the Admin UI, when a field is nullable you'll have to check for both null and the zero-value in order to handle all cases, eg:

@request.data.someField != null && @request.data.someField != ""

I am guessing you mean application/x-www-form-urlencoded which is what HTML forms will typically submit? multipart/form-data is used when uploading a file, right?

No. You can submit record data via application/json and multipart/form-data requests. File upload is supported only via the latter, but once again that doesn't mean that you can't submit regular fields alongside the uploaded file as part of the FormData object.

Alternatively clients could simply omit sending the value if it is null?

This will not work because you need a way to clear/unset the previously submitted value. Also keep in mind that the update operation is a PATCH, aka. only the submitted fields are updated. We can't use an empty field value because as mentioned when working with multipart/form-data everything is a string and you need a way to differentiate empty string from null.

Go's encoding/json option omitempty

I don't understand what go encoding/json and omitempty has to do with the null field values. By "json" earlier I meant the json field type where we apply the following normalizations if the value is a string - https://github.com/pocketbase/pocketbase/issues/1703#issuecomment-1407628966.

The concept of 'nonempty' has very Go-specific semantics and people will repeatedly be surprised by its behaviour.

This is not only Go-specific. As mentioned a lot of other client-side languages has non-nullable behavior by default (eg. Dart, Kotlin). Yes, you can also have null-able values, but that's the same as in go when specifying the type as pointer. The main issue is that you no longer can access or operate directly with the value because now you have to check for both null and the actual type (aka. you can't just do someField + 123).


Once again, we can always introduce "Nullable" as fields option, but for now there are other priorities and it is out of the scope of v1 meaning that even if you decide to work on it, it may not be merged (please refer to https://github.com/pocketbase/pocketbase#contributing).

yawaramin commented 1 year ago

The collection API rules has nothing to do with the client-side languages....What I meant was that in the Admin UI, when a field is nullable you'll have to check for both null and the zero-value in order to handle all cases

Ah, I had misunderstood, you are right that nullables would require more checks (in the cases where checks were being done). It might be possible to add a postfix operator ? that would short-circuit to false if the value was null (like latest EcmaScript). E.g., instead of your example, we might be able to do: @request.data.someField? != "".

you need a way to clear/unset the previously submitted value....Also keep in mind that the update operation is a PATCH, aka. only the submitted fields are updated.

So I looked into the semantics of PUT and PATCH and in both cases it would be possible to clear/unset existing field values.

With PUT it's simpler, the object which is sent in the request body is the exact object to be persisted. Any fields not sent in the body are nullified.

With PATCH it's a bit more complex. The way it's supposed to work is, the request body is supposed to be a set of instructions on how to update the existing object. So if we expressed it as JSON it would look something like:

{
  "patches": [
    {"field": "name", "action": "set", "value": "some new value"},
    {"field": "someField", "action": "nullify"}
  ]
}

By "json" earlier I meant the json field type

Ah OK, I thought you were talking about the HTTP API's JSON response. I misunderstood.

This is not only Go-specific. As mentioned a lot of other client-side languages has non-nullable behavior by default (eg. Dart, Kotlin)

Sorry, I think I did not express this clearly enough. I meant that the concept of 'nonempty' as presented in the PocketBase UI, has a specific semantics which is dictated by the Go language, i.e. 'nonempty' here means specifically 'cannot be the zero value of this type as defined by Go'. This is the behaviour that I think will surprise people. Yes, Dart, Kotlin, and other languages have non-nullable types; but they don't have the 'zero value' concept i.e. the in-band value which in Go means 'this is either unknown or known to be this value'.

there are other priorities and it is out of the scope of v1 meaning that even if you decide to work on it, it may not be merged

Understood, I don't want to force the issue, I just want to capture my thoughts on it, in case anyone comes back to this later on.

miguelcobain commented 8 months ago

I would just like to add some thoughts to this issue.

Purely from a database theory perspective, to represent absence of information, you do need nullable fields. If you don't have that option, then the only other option is an additional boolean field that represents the presence/absence of another field. If column A is true, then read column B. If false, then ignore column B (the "B is null" case).

IMO, while not having nullable fields might make the case where the absence/presence of information isn't important a lot easier, it does make the case where that is important significantly more difficult. In many cases (in my experience, at least more than 50% of them?), devs have to deal with optional fields. Furthermore, it seems to me that this pushes users into a potential pitfall in which they are encouraged to treat "", 0 or -1 as "no information". "", 0 and -1 are very different things from no information at all.

Also, if nullable fields are optional, it seems to me that we would be getting the best of both worlds and use nullable fields where it actually makes sense and not use them where it doesn't.

I consider representing the absence of information a critical feature of a database. I would love if PocketBase had optional nullable columns/fields.

scpedicini commented 4 months ago

@miguelcobain Strongly agreed. Just want to also chime in that "" effectively being equivalent to null tripped me up several times while designing an app with Pocketbase particularly since it emphasizes that the backing DB is SQLite which supports null.

Having to arbitrarily assign a MAGIC CONSTANT value to represent a "lack of information" in a column is kludgy at best, and impossible at worst - e.g. if all floating point values have meaning in a column, you're out of luck. The only workaround is to create an unnecessary additional boolean column isNumFilled to indicate if the data in the numerical column is valid.

step135 commented 3 months ago

@scpedicini This issue is closed and nobody from the team wants to implement NULL. It was their decision the prohibit it.

So the solution is simple. Don't use the Pocketbase anymore and just choose any other database. You will be lucky as all of them have never prohibited using NULL. And never ever come back to the Pocketbase.

ganigeorgiev commented 3 months ago

@step135 While I appreciate constructive criticism, this is not one of it.

There is no team behind PocketBase, it is just me, and I work on the project on volunteer basis.

I understand that having null as option to represent absence of data could simplify some use cases but it also will introduce a whole other set of problems (it could be a major security footgun with the API rules).

For now I'm not going to work on adding null option for the reasons explained in my earlier comments, but basic support for externally defined fields will be added with the refactoring which should allow users to "plug" their own implementation if they really need it.

With that said, I'm not trying to convince you to use PocketBase (other than bug reports and feedback I have no benefit from users using it) and I'm not planning growing it beyond what I can maintain on my own.

Abstractions like PocketBase will always come with some limitations/"lock-in" and users need to carefully evaluate the pros and cons of integrating it into their stack. If the lack of null or some other SQLite feature is such a deal breaker for your use case, then Yes I agree you might be better with using SQLite directly or maybe try some of the other more mature OSS solutions like Supabase, Nhost, Appwrite, SurrealDB, EdgeDB, Payload, Directus, Strapi, etc.

yawaramin commented 3 months ago

I think as long as this is documented somewhere prominently so it doesn't take anyone by surprise (eg PocketBase does not have the concept of null, all values are non-null defaulting to the 'empty' value of the type), it's perfectly reasonable to let users choose for themselves.