Improve Output cell - Githubissues

LSViana commented 9 months ago

Improve issues mentioned at #44:

Add the Type property to the Output cell
Possibly remove nullable return types and check type before reading the value

LSViana commented 9 months ago

@SebastianStehle I pushed this PR to add the Output.Type property.

This partially solves your requests on #44 while we can't agree if it makes sense to remove the nullable return types from Output cells.

LSViana commented 9 months ago

This also relates to #48.

SebastianStehle commented 9 months ago

I like that you are working on that, but you have to make a decision how to move forward. if we both of us are working on the same stuff it only ends with doing the same things twice and frustration. Right now I am super confused, because you also create issues for things that are solved in my other PR. If you only want to accept small PRs or whatever it is up to you but I don't know what to expect right now.

LSViana commented 9 months ago

Hi, @SebastianStehle!

I'm splitting up the changes in your PR because, as I mentioned before, I want to make sure things are developed in a manageable way. It's not possible to move forward with good order with a PR that changes so much with so little discussion.

My main issue with your PR is how it changes the Output cell properties. I do agree with the addition of a Type property because it works well to only read the correct type through P/Invoke and avoid the overhead of extra native calls.

However, I don't like the idea of adding extra checks to throw exceptions when reading a property that's not the correct type. My approach isn't great because it lacks the Type property, but it was simple and had a very thin wrapper around yffi (which was my original intention with this library but I see you're open to do more "heavy lifting").

The approach you proposed builds on top of this thin wrapper but throws exceptions when reading properties that shouldn't be read based on Type. But it feels like a "mixed" approach that's not too thin but still very "raw".

So what if we go with a 3rd approach that could make us both happier and would be also more ergonomic for the clients of the library? It could be something like:

Keep the Type property to ensure we can know the type inside of a cell beforehand (it can be useful in certain scenarios);
Remove the properties to read different types (like String, Double, etc.)
Create a new abstract class Output<T> with a T Value getter property
Then we add methods like bool TryGetString(out Output<String> output) to read values from Output 4.1. We'd create internal implementations of Output<T> for each type, like OutputString and they'd implement the Value getter property with the exact code that's inside each property in the Output cell today

This way will be more aligned with the built-in .NET classes and we'll reduce the possibility of writing code that's valid during compilation but will inevitably fail during runtime.

Would this approach be a better solution from your point of view?

SebastianStehle commented 9 months ago

Which .NET classes do you use as a reference? I see a lot of parallels with stuff like System.Text.Json, see https://github.com/dotnet/runtime/blob/main/src/libraries/System.Text.Json/src/System/Text/Json/Document/JsonElement.cs.

I don't think that I am doing any lifting. There is no extra cost.

But the first question is: What is an output for you? For me it is more like a container than a value. Probably because rust does not have inheritance and it is easier to handle it like that. If it would be a value than the type system of yrs would probably look like this:

Type
   * Unit // Not sure about the right term here.
       * Number
       * ....
       * JsonArray
   * Branch
       * Map

The differentiation would be that a branch can have partial updates and a unit is only handled as an atomic value.

If you would build inheritance like in .NET, than a Map should probably also be an output? Or why are the several kind of containers? I am also not sure if you need the internal implementations of the output. But this has to do how you resolve values. To be on the safe side, I think you should resolve the values of an output when it is received. Otherwise you have to think about the lifetime of an output. Is it still valid to access the value of an output outside of the scope like an event or transaction? What happens if something has been disposed in the meantime in rust? I am still not 100% sure about that, but depending on that you either need a factory or subclasses.

As long as we do not use null for error handling and we can write small linq expressions I am fine. Therefore I don't want to have TryXXX only.

LSViana commented 9 months ago

Hi, @SebastianStehle!

About this:

Which .NET classes do you use as a reference? I see a lot of parallels with stuff like System.Text.Json (...)

I am taking those as references too but I must say I don't have large experience with those classes because I usually used them to get instances of the actual classes I needed.

But the first question is: What is an output for you? For me, it is more like a container than a value.

For me it's also a container, I never intended to use it as a value. It's very noticeable there's an intermediary layer when reading values out of Map or Array instances, for example.

As long as we do not use null for error handling and we can write small linq expressions I am fine. Therefore I don't want to have TryXXX only.

Got it! So what's preventing you from writing LINQ expressions using the approach that is a mix of having the Type property and nullable results with the Value property? This is the current status of the library.

If we can't really find an agreement, I believe I'm fine changing the Output cell to execute type checks before executing the native calls but I think it's unnecessary complexity for the library.

SebastianStehle commented 9 months ago

Got it! So what's preventing you from writing LINQ expressions using the approach that is a mix of having the Type property and nullable results with the Value property? This is the current status of the library.

If we can't really find an agreement, I believe I'm fine changing the Output cell to execute type checks before executing the native calls but I think it's unnecessary complexity for the library.

I can only repeat my argument, why I don't like null and I like exceptions.

Null is not an error, so we should not handle it like that.
It is very common in .NET to have a "exception path" and a "try path". e.g. int.Parse, int.TryParse. So when I trust my data I can use int.Parse and when it does not work in a very rare case out I get a meaningful exception.
I actually struggled with the problem in the tests. Because I got like 20 NullReferenceException in one of the test runs and I had no idea what is actually going on. I would not write productive code like this, but I have seen enough developers just adding "!" or "?" everywhere.
If I want to write good production tests I would also throw exceptions here if my assumption that something is a string is not correct. So it does not really make my code easier to use null.

I was thinking that we could also get rid of the Output at all. I think it is only a container that has no real value in .NET. But I am not sure if it would make the client code easier. Something like

class YValue {
   public YValue Value { get; }

class YSimpleValue<T> : YValue

class YLong : YSimpleValue<long>
class YDouble : YSimpleValue<double>

class YJsonArray : IReadonlyDictionary<string, YValue> {}

class Branch : YValue

We don't have to use the Y prefix, I just do not have a better idea for long and so on now. For json types this could also replace inputs but it does not work for immutable types like Map and Array, so it is probably not worth it.

References:

LSViana commented 8 months ago

Hi, @SebastianStehle!

Since #55 was merged and we settled on a new API for the Output cell, can we close this issue now?

y-crdt / ydotnet

Improve Output cell #52