Open berezovskyi opened 1 month ago
I think this is a great idea and moves from a generative to an interpretive approach. Marshaling and unmarshaling do take time, but are often at the endpoints of GET and PUT. An interpretive approach makes these data transformations incremental and per-use. This can have performance implications that should be considered.
Jim, thank you. This is exactly the plan: to study (1) how far we can move along the generative-interpretive continuum in a statically typed language (C# in our case) and (2) where the greatest developer benefit lies.
Having advanced code generators at our disposal (in .NET, we can use the Roslyn compiler bits to generate sources on the fly during development, plus you can declare a class to be `partial`) allows us to add Turtle files to the repo but access C# classes and interfaces with zero dynamic overhead at runtime. Same performance as if someone had carefully hand-rolled that code.
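A minimal sketch of the `partial` split mentioned above, assuming a hypothetical `Requirement` class. The second half stands in for what a source generator might emit from a Turtle shape; it is written by hand here, and the class name, members, and layout are illustrative assumptions rather than actual OSLC4Net output.

```csharp
#nullable enable
using System;

// Hand-written half: domain logic lives here and is never overwritten.
public partial class Requirement
{
    public bool HasTitle => !string.IsNullOrWhiteSpace(Title);
}

// Hypothetical generated half: what a source generator could emit from a shape.
// It is ordinary compiled C#, so property access needs no reflection at runtime.
public partial class Requirement
{
    public const string TypeUri = "http://open-services.net/ns/rm#Requirement";

    public string Title { get; set; } = "";

    public string? Description { get; set; }
}
```

Both declarations compile into a single class, so the hand-written `HasTitle` can use the generated `Title` property directly.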
On the other hand, having access to reflection, dynamic proxies, and even dynamic dispatch (Java has it too, e.g. for use from Groovy, although Groovy is not the fastest of the bunch) allows us to go to the far ends of the interpretive galaxy, depending on how much performance we are willing to trade. In general, the OSLC code I have seen so far was quite inefficient: it used synchronous network calls and did not keep connections alive or set up thread pools correctly. When I wrote an async client in Kotlin for the RefImpl that reached a quite modest 250 rps (though the server code was not migrated to async), the idea of giving even such modest power to Lyo users was met with some worry about what the load would mean for the OSLC servers (providers). Thus, I am inclined to believe that in the OSLC space (and the broader space of enterprise integration) we have quite a bit of performance to trade off if the baseline is set very high (100k+ rps for servers like Kestrel; Eclipse Vert.x for Java/JVM easily starts off at 10k+) and we trade off from there onwards.
Both Lyo and OSLC4NET (incorrectly) assume that every (OSLC) RDF resource has a "primary" RDF type, which has an associated shape. This allows the (un)marshaller to associate a POJO/POCO with this shape and (un)marshal the RDF resource onto a given class. All other `rdf:type` values are collected in an array.
Of course, this is wrong: it is an opportunistic reduction that forces the graph-based RDF model peg, where properties do not belong to classes, into the OO-world hole, where properties belong to classes and dynamic multiple inheritance is not a thing. This impedance mismatch needs to be addressed to be able to work properly with the larger world of Linked Data applications outside OSLC.
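The mismatch described above can be illustrated with a small sketch. It is a hypothetical simplification, not the actual Lyo or OSLC4Net (un)marshalling code: `TypeResolution`, the view classes, and the registry from `rdf:type` URIs to CLR types are all assumptions made for illustration. Because an RDF graph is a set of triples, which `rdf:type` comes "first" depends on serialization order, so a lookup keyed on one type can flip between classes for the same resource.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed views over the same underlying RDF resource.
public sealed class RequirementView { }
public sealed class ChangeRequestView { }

public static class TypeResolution
{
    // "Primary type" reduction: only the first rdf:type decides the class.
    public static Type? ByPrimaryType(
        IReadOnlyList<string> rdfTypes,
        IReadOnlyDictionary<string, Type> registry)
        => rdfTypes.Count > 0 && registry.TryGetValue(rdfTypes[0], out var clr)
            ? clr
            : null;

    // Set-based alternative: every registered rdf:type yields an applicable view,
    // independent of the order in which the types were serialized.
    public static IReadOnlyList<Type> ByAllTypes(
        IEnumerable<string> rdfTypes,
        IReadOnlyDictionary<string, Type> registry)
        => rdfTypes
            .Where(t => registry.ContainsKey(t))
            .Select(t => registry[t])
            .Distinct()
            .ToList();
}
```

For a resource typed as both `oslc_rm:Requirement` and `oslc_cm:ChangeRequest`, `ByPrimaryType` returns a different class depending on which type is listed first, while `ByAllTypes` returns both views either way.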
One way to deal with this is not to use POCOs/POJOs at all. Unfortunately, this only works well for languages that are not statically typed (there are good RDF libs for Ruby, JS, Elixir, Python; the story with `dynamic` in C# needs to be evaluated, see ExpandoObject in the std lib). Approaches where data is decoupled from logic work best (Prolog, Clojure). In statically typed languages, the amount of boilerplate required is not insignificant and, most of all, using the RDF data without an abstraction often means losing type safety. For C#/Java, we can try the following abstraction: `IExtendedResource`.
Additionally, the nullable reference type reflection API in .NET 6+ (https://stackoverflow.com/questions/58453972/how-to-use-net-reflection-to-check-for-nullable-reference-type) allows us to eliminate the `OslcOccurs` attribute (https://github.com/OSLC/oslc4net/blob/main/OSLC4Net_SDK/OSLC4Net.Core/Attribute/OslcOccurs.cs) from most declarations (i.e. a `string` prop would be `ExactlyOne`, `string?` would be `ZeroOrOne`, `IReadOnlyCollection<string>?` would be `ZeroOrMany`, but `IReadOnlyCollection<string>` can be `OneOrMany` or `ZeroOrMany`; it is probably good to default to `OneOrMany` but allow `ZeroOrMany` via an attribute).
? https://en.wikipedia.org/wiki/Dominator_(graph_theory)
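The nullability-to-cardinality mapping above could be sketched roughly as follows, using `NullabilityInfoContext` from `System.Reflection` (available since .NET 6). The `Occurs` enum here is a local stand-in for the OSLC4Net one, and `AllowEmptyAttribute`, `ChangeRequest`, and `OccursInference` are hypothetical names introduced only for this illustration. The sketch assumes nullable annotations are enabled; in a nullable-oblivious context the state is reported as unknown and this code would treat the property as non-nullable.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Reflection;

// Local stand-in for the cardinality values named in the discussion.
public enum Occurs { ExactlyOne, ZeroOrOne, ZeroOrMany, OneOrMany }

// Hypothetical opt-out marker: lets a non-nullable collection be empty.
[AttributeUsage(AttributeTargets.Property)]
public sealed class AllowEmptyAttribute : Attribute { }

// Hypothetical resource class covering each case from the mapping.
public sealed class ChangeRequest
{
    public string Title { get; set; } = "";
    public string? Description { get; set; }
    public IReadOnlyCollection<string>? Subjects { get; set; }
    public IReadOnlyCollection<string> Creators { get; set; } = Array.Empty<string>();

    [AllowEmpty]
    public IReadOnlyCollection<string> Contributors { get; set; } = Array.Empty<string>();
}

public static class OccursInference
{
    public static Occurs Infer(PropertyInfo property)
    {
        // Reads the compiler-emitted nullable annotations for this property.
        NullabilityInfo info = new NullabilityInfoContext().Create(property);
        bool nullable = info.ReadState == NullabilityState.Nullable;

        Type type = property.PropertyType;
        bool isCollection = type.IsGenericType
            && type.GetGenericTypeDefinition() == typeof(IReadOnlyCollection<>);

        if (!isCollection)
        {
            return nullable ? Occurs.ZeroOrOne : Occurs.ExactlyOne;
        }

        // Nullable collection or explicit opt-out: zero values are acceptable.
        if (nullable || property.IsDefined(typeof(AllowEmptyAttribute), inherit: false))
        {
            return Occurs.ZeroOrMany;
        }

        // Default for a non-nullable collection, as suggested above.
        return Occurs.OneOrMany;
    }
}
```

With these definitions, `OccursInference.Infer(typeof(ChangeRequest).GetProperty("Creators")!)` yields `OneOrMany`, while the `[AllowEmpty]`-marked `Contributors` yields `ZeroOrMany`. Only `IReadOnlyCollection<T>` itself is recognized as a collection in this sketch; other collection types would need extra handling.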