Open elizarov opened 2 years ago
Can't all objects be data objects by default? It would be nice to have that string representation even for normal objects.
Using the data keyword for object declarations seems inconsistent and misleading because the behavior differs from our mental model of data classes.
Inconsistencies:
1. The copy method is not generated, since it doesn't make sense for singletons.
2. The componentN methods are not generated.
3. The equals method is not generated.
4. The toString method doesn't provide the same type of utility that we get from data classes (see explanation below).
5. The hashCode method is not generated.
Inconsistencies 2, 4, & 5 seem obvious at first, but note that 99% of all objects have properties. Adding the data keyword would therefore make most developers at the very least consider whether the keyword uses these properties when the object is added to a hashing collection (e.g. if the object extends some common base class), in the generated toString, or when attempting to destructure. Otherwise, what's the point of adding a new keyword to objects if it lacks most of the behavior that developers associate with data classes?
I understand that data classes only reference constructor properties, but object declarations can never have this distinction, as all of their properties must be defined inside the class. So the data keyword hints at bringing the capabilities that data classes have, and this creates confusion and invalid expectations, since the data keyword here does not provide the vast majority of the behavior that we have come to expect from data classes.
3 & 5 are not really relevant as the generated version would be functionally equivalent to the default implementation. So their omission is actually a code size optimization with no behavioral difference. The functions are defined on all types so the presence of overrides is an implementation detail that no one needs to care about.
You could also argue that 2 does occur. A data object generates the correct number of componentN() functions for the number of primary properties it wraps: zero.
That same argument applies to 4. The toString() implementation behaves the same way as a data class: it shows the type name and the values for all primary properties (whose count is zero).
So the only real difference is 1 and its omission is noted in the KEEP text.
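The toString() point above can be seen in a minimal sketch. The names PlainMarker and DataMarker are hypothetical, and the snippet assumes a Kotlin version where data object is available (1.9 or later) running on the JVM:

```kotlin
// Hypothetical marker types, used only to compare the two declarations.
object PlainMarker
data object DataMarker

fun main() {
    // A data object prints its simple type name, with zero properties listed.
    println(DataMarker)  // prints "DataMarker"

    // A plain object falls back to the default toString(); on the JVM this is
    // the class name followed by '@' and an identity hash that varies per run.
    println(PlainMarker)
}
```

No copy or componentN functions exist on DataMarker, which matches the "zero primary properties" reading given above.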
> note that 99% of all objects have properties
[citation needed]
I use objects to implement interfaces with stateless behavior and for marker types in sealed hierarchies. I would say 99% of my own objects never have properties.
I am sorry, data object is very confusing to me. I had a very hard time wrapping my head around the idea that an object is a static singleton (which is itself contrary to other languages, where objects are referred to as dynamic). And now, data object is even more confusing.
Instead, how about this approach: we automatically determine whether it's a data object or not, based on the return type?
data Message(String to, String from, String message)
object message = Message("abc@example.com", "bca@example.com", "Hello") // This now acts as "data object" as the return type is a data class.
User(String username, String password)
object user = User("KoltinUser", "S3CR3T") // Now this acts as a regular object.
Disclaimer: the proposed solution is based on my understanding of objects. If I got it completely wrong, or if it does not address other scenarios, please ignore it.
> 3 & 5 are not really relevant as the generated version would be functionally equivalent to the default implementation...
> You could also argue that 2 does occur. A data object generates the correct amount of componentN() functions...
> That same argument applies to 4. The toString() implementation behaves the same way as a data class...
You are correct from an implementation perspective if you base your decision on the underlying implementation details of data classes. However, this perspective doesn't seem aligned with the higher-level concept of data classes.
Instead of evaluating this solution of data objects by implementation details of data classes, I want to take a step back and evaluate the higher-level concept to ensure consistency and then use that to guide implementation details rather than the reverse approach. I also want to evaluate the solution based on core engineering principles of abstraction so that names aren't misleading but instead they are consistent and provide an accurate description of what they're supposed to represent.
From a concept perspective, a data class represents a container of data. The implementation details of data classes serve to meet the purpose of data classes rather than to define their purpose. For example, excluding non-constructor properties from the auto-generated methods lets us distinguish which fields are part of the core data of the wrapper, so that only those are used when checking for equality and so on. This shouldn't be used as part of the definition of what it means to be a data class, since it is just a utility that makes data classes even more useful.
As a test for consistency, let's take an outside perspective from a hypothetical experienced developer that isn't familiar with Kotlin. Suppose that we explain the concept of data classes and that the object declaration creates a singleton. If we then ask the engineer what they would expect a data object to be then the most natural assumption for them would be to think that a data object is a singleton wrapper of data (eg. perhaps a singleton with related constants etc.). Now continuing with this expected interpretation and given that you can never declare properties in the object constructor, this engineer would expect the class properties to be the "data" of this data object. The concept of having the data keyword generate a fairly trivial single line of code seems to have too little value so this would further make engineers wonder if something more is taking place by having a data object.
This previous paragraph shows how points 2, 3, 4, and 5 are relevant based on the expected interpretation of someone that is not intimately aware of the underlying Kotlin implementation details and instead thinks in terms of the concepts that Kotlin provides.
For reference, the single line of code that I was referring to on the JVM is this:
override fun toString() = this::class.java.simpleName
Regarding exactly what percentage of object declarations have class properties, citations are probably non-existent and depend more on programming style. However, I gotta admit that my previous estimation was too hasty and high so instead of trying to put a number to it, here are some common use cases where I have properties in object declarations:
I'm sure I could add more scenarios but I think you get the idea that having val properties in an object declaration is not a rare occurrence by any means.
Lastly, the amount of value added compared to the increased complexity and confusion that it introduces seems to move Kotlin in the direction of Scala especially since it breaks the clean meaning of the data keyword.
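For readers weighing the "single line of code" argument against the generated equals and hashCode mentioned elsewhere in this thread, here is a rough hand-written approximation of a data object. The names GeneratedNotFound and ManualNotFound are hypothetical, and the hashCode value is an assumption: the thread says a hashCode is generated, not which constant it returns.

```kotlin
// Hypothetical names; `data object` requires Kotlin 1.9 or later.
data object GeneratedNotFound

// A hand-written approximation of the same behavior on a plain object.
object ManualNotFound {
    override fun toString(): String = "ManualNotFound"

    // Every instance of this type is considered equal, even one created
    // outside the normal singleton path (e.g. by deserialization).
    override fun equals(other: Any?): Boolean = other is ManualNotFound

    // Assumption: any constant is valid here; the real generated value
    // is an implementation detail.
    override fun hashCode(): Int = "ManualNotFound".hashCode()
}

fun main() {
    println(GeneratedNotFound) // prints "GeneratedNotFound"
    println(ManualNotFound)    // prints "ManualNotFound"
    println(ManualNotFound.hashCode() == "ManualNotFound".hashCode()) // prints "true"
}
```

On this reading the modifier replaces three overrides rather than one, although whether that justifies a keyword is exactly what is being debated here.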
> Can't all objects be data objects by default? It would be nice to have that string representation even for normal objects
Agree 😂
It is rare that one would want to see the default implementation of toString on objects. So personally, I think making this the default is better than adding more complexity to the language itself.
I think someone mentioned this is done so that it is backward compatible if someone was relying on the existing string representation for whatever reason
If you are relying on a string representation like that - and if your app breaks on you - that is on you to fix. No need to keep luggage like this - this is not Javascript 😄
The reasons for introducing a separate data object (as opposed to changing the way a regular object behaves) go beyond backwards compatibility. First of all, we want to have a consistent way of declaring sealed class hierarchies, where the subclasses and sub-objects are consistently marked with the data modifier. For example:
sealed class UserResult {
    data class Found(val user: User) : UserResult()
    data object NotFound : UserResult()
}
Also, this is just a stepping stone in a progression of planned future features. We do plan to introduce a more compact syntax for sealed class hierarchies some day (KT-47868, "Concise syntax for defining sealed class inheritors"), akin to enum class syntax, that eschews most of the boilerplate code, so that the above declaration would be simplified to something like this:
sealed class UserResult { Found(val user: User), NotFound }
Here, the applicability of the data modifier to both objects and classes will make it easier to explain the desugaring of this code into the more verbose version above.
Moreover, we do plan to work on a better approach to objects that are used only for the purpose of namespacing several declarations together (like kotlin.Delegates objects). In the future, we plan to turn them into some kind of "static objects", so turning all such plain objects into "data objects" for a completely different reason seems like a wrong move.
I'm eagerly waiting for this feature to become stable. Points 3 and 5 are applicable for my use case - KT-40218.
There has been a massive improvement in the text of the KEEP, which now includes detailed information on the design. The decisions around serialization have also been finalized (TL;DR: no special support for Java serialization, but it'll work fine with any kind of serialization thanks to the generated equals and hashCode). Please see the updated text here: https://github.com/Kotlin/KEEP/blob/data-objects/proposals/data-objects.md
@elizarov Thanks for the update! So it won't fix KT-40218?
Specifically the following case:
sealed class Option<in T> : Serializable
class Some<T>(val value: T) : Option<T>()
object None : Option<Any>()
fun handleOption(option: Option<String>) = when (option) {
is Some -> "some ${option.value}"
None -> "none"
}
The code above crashes after deserialization, because None has another instance.
> @elizarov Thanks for the update! So it won't fix KT-40218?
It will be fixed if you use data object None, because of the auto-generated equals, which is then used by the when expression.
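A minimal JVM sketch of that answer, reusing the Option example from the question with None changed to a data object. The roundTrip helper is a hypothetical name introduced here for illustration; it serializes a value to a byte array and reads it back with standard Java serialization.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.ObjectInputStream
import java.io.ObjectOutputStream
import java.io.Serializable

sealed class Option<in T> : Serializable
class Some<T>(val value: T) : Option<T>()
data object None : Option<Any>() // the only change from the question

fun handleOption(option: Option<String>): String = when (option) {
    is Some -> "some ${option.value}"
    None -> "none" // compiled as an equals() check, not a reference check
}

// Hypothetical helper: Java-serialize a value and deserialize it again.
@Suppress("UNCHECKED_CAST")
fun <T : Serializable> roundTrip(value: T): T {
    val buffer = ByteArrayOutputStream()
    ObjectOutputStream(buffer).use { it.writeObject(value) }
    return ObjectInputStream(ByteArrayInputStream(buffer.toByteArray()))
        .use { it.readObject() as T }
}

fun main() {
    val copy: Option<String> = roundTrip(None)
    println(copy == None)       // prints "true"
    println(handleOption(copy)) // prints "none"
}
```

The sketch relies only on the generated equals; it does not assume any readResolve support, in line with the "no special support for Java serialization" note above.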
Thanks, I thought when uses reference equality for objects.
Can I ask for clarification about the difference in equals (==) between data and non-data objects?
I am focusing on the JVM for this question, but I'm curious about other platforms as well.
Is the expected behavior the following?
- Non-data objects: equals may return false for two references to the same object type, because under the hood somehow two different instances were created, or even two different types?
- Data objects: equals returns true, even in these atypical scenarios.
That is my interpretation of the KEEP, but I'd like to have a more precise understanding of the exact scenarios when I should expect equals to return true for data and non-data objects.
> but I'd like to have a more precise understanding of the exact scenarios when I should expect equals to return true for data and non-data objects
One of the use cases is Java serialization. After serialization you will have another instance of an object, which is not equal to the original one (equals returns false).
Another use case I noticed is that hashCode of an object returns a different value on JavaScript after each page refresh, whereas for a data object it's always the same.
Hello!
I like the concept of data objects, but I am bothered by two questions: "What if I need to introduce a property in the future?" and "How do I ensure backward compatibility for my library users' code?" Currently, I add an invoke function that returns the data object, and suggest calling it when working with the data object.
Example:
sealed class Expr {
    data object Cur : Expr() {
        operator fun invoke() = this
    }
    data class Value(
        val value: Number,
    ) : Expr()
    data class Sum(
        val expr1: Expr,
        val expr2: Expr,
    ) : Expr()
}
fun test() {
    println(Expr.Sum(Expr.Cur(), Expr.Value(1)))
}
If a new property is later introduced in Expr.Cur, I don't need to change the test function:
sealed class Expr {
    data class Cur(
        val context: Any? = null,
    ) : Expr()
    data class Value(
        val value: Number,
    ) : Expr()
    data class Sum(
        val expr1: Expr,
        val expr2: Expr,
    ) : Expr()
}
fun test() {
    println(Expr.Sum(Expr.Cur(), Expr.Value(1)))
}
But I don't like this solution because, firstly, you can forget to call the invoke function and, secondly, it is not suitable for Java users.
Maybe you could help me come up with a more suitable solution.
Thank you.
This issue is for discussion of the proposal to add data object. The full text of the proposal is here.