comunica / sparqlee

⚙️ SPARQL expression evaluator library - Moved to @comunica/expression-evaluator
https://www.npmjs.com/package/sparqlee
14 stars 7 forks source link

Suport overriding functions #87

Closed jitsedesmet closed 1 year ago

jitsedesmet commented 3 years ago

Just like we recently added support for extension functions #16 we should also add support for overriding existing/ default named functions.

jitsedesmet commented 3 years ago

I have an idea on how this needs to work:

To the config you provide a JS object with the mapping named function -> overloadArguments -> function implementation. We could add this overload to the OverloadTree of that function. Having this behavior would make it extremely easy for users using there own types to add overloads for the named function which are mostly/ all casting functions. This is really good because it would provide an easy way to add casting to your own types. We know this in many OOP languages as well with the almost basic toString function.

The caviat in this implementation is that right now the OverloadTrees are global. This behavior requires them to be linked/ a field in an evaluator. In other words, each evaluator would need his own set of OverloadTrees. I don't think it would be to bad to have this and I'm pretty certain we also need this for #109 .

Implementing this should not drop performance. I think of implementing a hybrid solution. We keep the global OverloadTrees since we don't want to copy them for every evaluator when it's not needed. I would only make a (deep) copy of the trees when overriding functions are provided to the config of the evaluator.

jitsedesmet commented 3 years ago

Another important note on this. This would mean all override functions are sync! At least as of right now.

rubensworks commented 3 years ago

As this should be seen as a nice-to-have feature, and will probably be used only very rarely, I would not overcomplicate things. I would suggest to just do the same as extensionFunctionCreator, and handle it similarly.

jitsedesmet commented 3 years ago

I think there's a big difference to be made. The extension functions had no overloading. Named functions however, do have overloading. For this overloading we need the OverloadTree.

If you're concerned about this issue getting really big or something, I don't think it will. I checked and I think the addition of these copies will be quite straight forward. Like I said, I think we need the copies for #109 . After we can make the copies, implementation of the additional overrides for named operators which is what this issue actually is, will be trivial.

jitsedesmet commented 3 years ago

So if you say that you like the access point I suggest. (Adding the object named function -> string[] -> function implementation) I will get this done pretty soon.

rubensworks commented 3 years ago

The extension functions had no overloading.

But they could have, right?

To avoid confusion, I would make sure extension functions and overrides are aligned.

If you think effort for all of this is low, go ahead :-)

jitsedesmet commented 3 years ago

Well, pretty soon, I would first need to resolve #109 because this would require open world types within the OverloadTree.

But they could have, right?

yes and no. The sparql specs clearly state extension function are in the form of RDF.Term[] => RDF.Term. A user could look at the type of these terms and implement some kind of overloading. He will not be able to use the OverloadTree we use of course.

I suggest a different approach for the default named operators. When we ask the user for an RDF.Term[] => RDF.Term function to override the default behavior the user will probably spend some time on trying to implement things sparqlee already can do:

  1. Get overloads to work,
  2. Implement the default behavior of the named function
  3. Implement the overload he/ she wants to add or change.

I suggest we don't make it that hard on the user. And just ask what overload he/ she wants to add/ change.

My suggestion would need to wait until after #109 . The suggestion with RDF.Term[] => RDF.Term could be implemented right now. What do you think? In the second suggestion extension functions and overrides will not be aligned.

rubensworks commented 3 years ago

The sparql specs clearly state extension function are in the form of RDF.Term[] => RDF.Term.

Are you sure? Where is this mentioned? AFAIK, the spec says nothing about how these functions should be configured in the query engine.

I suggest we don't make it that hard on the user. And just ask what overload he/ she wants to add/ change.

Fully agree. But the same reasoning can be applied to the extension functions (at least for points 1 and 3).

My suggestion would need to wait until after #109 .

👌

jitsedesmet commented 3 years ago

Well the specs don't say how to implement this, true. They do say: https://www.w3.org/TR/sparql11-query/#extensionFunctions

An extension function takes some number of RDF terms as arguments and returns an RDF term.

We could provide a builder that creates an overloaded function that also handles type substitution/ promotion. This builder could be asked for inside the evaluator (thinking about #109 for this already). I'm not really sure how I feel about exposing the builder from functions/Helpers.ts. This builder can be seen as a core part of sparqlee. Do we want to expose that?

Anyhow, all this about overloading types we don't know, will only get clear after implementing #109 . I think we should cycle back to the overloading of known named operators and extension function after that issue.

rubensworks commented 3 years ago

Anyhow, all this about overloading types we don't know, will only get clear after implementing #109 . I think we should cycle back to the overloading of known named operators and extension function after that issue.

Sounds good.

jitsedesmet commented 3 years ago

Update:

I talked with @rubensworks about the types we could use to provide an easy way of implementing overloaded functions. He had the idea of the following types:

type AsyncExtensionFunction = (args: RDF.Term[]) => Promise<RDF.Term>;
type AsyncExtensionFunctionOverloaded = { types: string[]; impl: AsyncExtensionFunction }[];
type AsyncExtensionFunctionCreator = (functionNamedNode: RDF.NamedNode) =>
AsyncExtensionFunction | AsyncExtensionFunctionOverloaded | undefined;

This is a backwards compatible solution that looks good and doesn't require us to give insights in sparqlees core.

We have a big issue when trying to implement this. It requires putting an async function in the OverloadTree. This is a problem that should be solved when we fix #113 since it's the same problem.

jitsedesmet commented 1 year ago

Now that https://github.com/comunica/sparqlee/issues/131 is closed, we will adapt to work more like the Comunica architecture. Our plan is to have every function/ oveloadtree be its own actor. Different actor implementations will result in the behaviour this issue wanted to accomplish.