Handle null values when deserializing series and structs for command expansion

cartermak commented 9 months ago

Tickets addressed: Fixes #1346
Review: By commit
Merge strategy: squash and merge

Description

Pass null values for series and structs to TS simulated activity instances for command expansion when deserializing simulation datasets.

Verification

Tested locally on one test case, but not tested extensively.

Documentation

None

Future work

Add the expected/allowed behavior of null values to the "Value Schema" documentation.

mattdailis commented 9 months ago

@cartermak thanks for looking into this!

I'm curious about implications for generated code - let's say that an Activity takes an Optional<Integer> parameter, which I imagine turns into a { "present": boolean, "value": number } type in the generated TS code

Does this mean that the expansion logic needs to handle the case where value is null? Should it also handle the case where present is null?

I think I'm a bit concerned about the implications of "any value can be null", since to me that means "downstream tools must handle any value being null" 🤔 This leads to a question of "is null a value, or does it represent the lack of a value?"

All that said, I'm in favor of pursuing your suggested fix in the short term, since I think the broader question of "what's the best way to represent Optionals in ValueSchemas" can be answered independently of making the sequencing server more robust

mattdailis commented 9 months ago

Ah, I realize that your specific examples have to do with struct and series - would we run into the same issue with an Optional<String>? I suspect we may want to handle null at the top, before splitting into the cases

cartermak commented 9 months ago

@mattdailis Great questions that I've pondered a bit already.

My current philosophical stance is that any field in a serialized value ought to be nullable (at any "level" of the value schema, if that makes sense, including "parent" structs/series). A few justifications I have for that position:

That's generally true in the way the corresponding structures are implemented in the model code (e.g., a complex nested record type in the mission model that maps to a nested set of struct in the value schema can generally be null).
If a given value is null, then that is either an expected possibility, in which case any logic which interacts with that value ought to handle the null, or it is unexpected and it's OK (even preferable) for anything that interacts with that value to break.
The UI can generally handle this already. The Clipper activities which cause the original errors in command expansion are displayed fine in the UI (the UI seems to just propagate blank values to each "child" in the value schema).

With that said, some specific answers based on that philosophy:

I'm curious about implications for generated code - let's say that an Activity takes an Optional parameter, which I imagine turns into a { "present": boolean, "value": number } type in the generated TS code

Does this mean that the expansion logic needs to handle the case where value is null? Should it also handle the case where present is null?

If it's meaningful that the value is null, then the expansion logic should handle that case. If it's not expected, then I think it's OK for the expansion logic to assume the value is not null and let any errors that may precipitate occur. I also don't think the onus is generally on the expansion logic to perform extensive error checking: the activity type/expansion rule interface ought to be a relatively "safe" one, and it already provides native mechanisms to check parameter values.

I think I'm a bit concerned about the implications of "any value can be null", since to me that means "downstream tools must handle any value being null" 🤔 This leads to a question of "is null a value, or does it represent the lack of a value?"

I haven't developed this thought as much, but I think there ought to be a distinction between a null value, which might occur for any number of programming-related reasons, versus some notion of an optional/"null represents the lack of a value," which should be more explicitly captured. For example, null can be a useful technical solution when parameters are coupled, as in the case of the present and value field's of Clipper's optionals. Alex and I tossed around the idea that the "right" answer might be for optional to get its own container type in the value schema definition alongside struct and series (with corresponding UI behavior, of course).

Ah, I realize that your specific examples have to do with struct and series - would we run into the same issue with an Optional? I suspect we may want to handle null at the top, before splitting into the cases

I don't think we'd run into the same issue with Optional<String> if only the value is null, but I haven't tested it. I believe the sequencing activity deserialization will already propagate a null value for any of the "primitive" value schema data types (i.e., not struct or series). If the whole optional struct were null, I think we would run into this issue.

mattdailis commented 9 months ago

I've opened https://github.com/NASA-AMMOS/aerie/discussions/1348 for philosophical discussion - I don't want it to get in the way of this PR

I don't think we'd run into the same issue with Optional if only the value is null, but I haven't tested it. I believe the sequencing activity deserialization will already propagate a null value for any of the "primitive" value schema data types (i.e., not struct or series). If the whole optional struct were null, I think we would run into this issue.

That's a fair point - I guess all the other cases are pass-through, yes? So they will simply return null if the value is null.

cartermak commented 9 months ago

I guess all the other cases are pass-through, yes? So they will simply return null if the value is null.

It seems so, with the exception of Duration -- I'm not sure how that library will behave if we try to parse null. These are probably good unit test cases!

cartermak commented 9 months ago

Oh, and it's maybe noteworthy that the variant type seems to already return null under certain conditions, although I'm not sure I understand the meaning of "VOID" or what information is being encoded by that keyword.

mattdailis commented 9 months ago

Oh, and it's maybe noteworthy that the variant type seems to already return null under certain conditions, although I'm not sure I understand the meaning of "VOID" or what information is being encoded by that keyword.

That's a vestige of Aerie's previous encoding of the "empty" type. This is the type used when an activity type does not return anything from its effect model - it still produces a "computed attributes" value, but that value contains no information.

"A type that contains no information" is called void in some languages and unit in others - the difference mostly comes down to functional composition (which is why you'll see "unit" show up in functional languages) Here's a nice-looking write-up from a Kotlin perspective

There are a couple different ways to encode a type that has only one possible value. One of them is to define an enum with only one variant - this is how Aerie first implemented the unitary type. After some confusion around this special enum with VOID as its only variant, Aerie switched over to using the empty struct to represent a unitary type (if you declare an empty struct value schema, there is exactly one value that satisfies that schema*). This was the PR that made the switch: https://github.com/NASA-AMMOS/aerie/pull/311

I think that makes the code you're reading obsolete... there is of course the chance that some existing Aerie deployment still has vestigial VOID values, but I think it's a pretty low likelihood.

if you declare an empty struct value schema, there is exactly one value that satisfies that schema*

*Side note - if every value can be null, then that means there are exactly two values that satisfy this schema, making it no longer particularly "unitary"

cartermak commented 9 months ago

@goetzrrGit I haven't tested so I'm not sure how it would behave, but I expect this would be equivalent for our use case (we don't actually expand any of the activities with this issue right now; Aerie just deserializes all activities regardless of whether there's an expansion rule). That said, is there a reason to change the value from null to an empty array/object? I'd posit that the null value could have meaning, and in the case of the empty object, the structure is still not preserved.

goetzrrGit commented 9 months ago

@cartermak Early in my career, I was heavily influenced by senior developers who strongly opposed the use of null. They were very adamant about avoiding it.

In situations where we're deserializing data into an array or objects, I believe an empty form serves as a better alternative to null because we have explicit knowledge of the expected data type we are trying to generate. This doesn't necessarily mean I disagree with the earlier discussion about the meaning and potential value of null I just think in this case we know what datatype we are after ([] or {}), and if one wasn't provided, return an empty container.

cartermak commented 9 months ago

@goetzrrGit That's fair; in this case, I'd argue that avoiding usage of null would need to be solved upstream in the mission model and the best behavior of the deserialization logic would be to preserve the value instead of mutating to another empty container, but we can revisit the solution when we have time to apply a more holistic philosophy to null in the Aerie value schema/mappers and Clipper's mission model.

For now, this is still a high-priority blocking issue and I'm happy with any solution that doesn't break when running command expansion on a simulation dataset with these null values.

parkerabercrombie commented 9 months ago

In general, I agree with @goetzrrGit that an empty object is a cleaner "no data" value for a list or map. But in this case we are blocked on using our mission model, I would vote for an expedient fix (and maybe move to something more elegant in future).

goetzrrGit commented 9 months ago

@parkerabercrombie and @cartermak Dan, Matt, and I will have an internal meeting today to discuss this further. For the most part, we don't want to block you so we are leaning toward using null for now.

cartermak commented 9 months ago

@goetzrrGit thanks for the update. To be clear, the design solution won't impact us either way -- we don't expand any of these problem activities; the problem is just that we can't expand other activities if these problem activities are in the plan. We'll take whatever is fastest and most palatable to implement right now.

dandelany commented 8 months ago

Discussed this with @goetzrrGit and @mattdailis today and I think we agree this is a good change for the sake of consistency until we figure out long-term null handling. A few things still outstanding:

we have to open a new PR since, unfortunately, our automated checks/tests fail to run when opened from a fork (instead of an internal branch)
We decided it's best if all types are consistent in how null is handled - ie. they should all pass null through, not just Series and Struct.

@goetzrrGit will open a new PR with these changes.

cartermak commented 8 months ago

@dandelany Thank you for the update! That all sounds good to me.

NASA-AMMOS / aerie