comunica / sparqlee

⚙️ SPARQL expression evaluator library - Moved to @comunica/expression-evaluator
https://www.npmjs.com/package/sparqlee
14 stars 7 forks source link

AgragateEvaluator uses SyncEvaluator #86

Closed jitsedesmet closed 3 years ago

jitsedesmet commented 3 years ago

@wschella I was wondering why AgragateEvaluator uses SyncEvaluatorConfig. I was trying to make sure the custom-function implementation for #16 was usable in a bigger project like comunica. Within this project almost everything is async, so it wouldn't make any sense to ask a developer for sync extension functions. However, the project uses sparqlee's AggragateEvaloator which is build apon the SyncEvaluator. Maybe we should make an AsyncAggragateEvaloator for completeness?

wschella commented 3 years ago

I think this was originally done because expressions in an aggregate expressions are a lot simpler, e.g. you can not nest EXISTS in there, i.e. only RegularFunctions, not SpecialFunctions. But now I'm not so sure. I can not immediately find it in the spec. So could definitely make sense.

Be aware, aggregation is very statefull. There is some shenanigans in the AggregateEvaluator to provide a nice API (replacing functions on the first call), both outwards and inwards. Take care when implementing an async alternative.

The BaseAggregators should be able to be reused.

jitsedesmet commented 3 years ago

Thank you for the clarification.

jitsedesmet commented 3 years ago

It is clear that an AsyncAggregateEvaluator should have some locking or transaction mechanism in place to handle certain scenarios.

Within the BaseAggregators I see two aggregators that might have unexpected async behavior. This being GroupConcat and Sample, GroupConcat because the aggregator is defined by an order. Sample forms a problem to because it will return the first term. I don't know how we can handle the unexpected behavior of GroupConcat. Either we throw an error if we notice multiple puts are called at the same time or, we expect the user to await every put.

To handle all these problems we could also expect the user to use await between every call to the AyncAggregateEvaluator but this raises the question on why to make one in the first place?

I'm open to suggestions on how to tackle these problems.

rubensworks commented 3 years ago

Result should be async

👍 Since everything else in Comunica is async, this shouldn't be a problem.

Either we throw an error if we notice multiple puts are called at the same time

We definitely do not want this

we expect the user to await every put.

Sounds reasonable to me

this raises the question on why to make one in the first place

Not sure what you mean by this

jitsedesmet commented 3 years ago

So, we expect a user using the aggregates GroupConcat and Sample are responsible for using await between every put so the behavior of these aggregators is as expected? Than all that's left is finding a way to use transactions for the mentioned functions.

rubensworks commented 3 years ago

for using await between every put

No, I'd make put async, and await it when it's invoked.

jitsedesmet commented 3 years ago

I don't get that. Assuming we use a transaction system on put so only one put is can be executed at a time, a user writing

await Promise.all([aggreg.put(a), aggreg.put(b), aggreg.put(c)]);
const res = await aggreg.result();

where aggreg is a GroupConcat migt expect a, b, c. But, since it is not clear what will exectute first it might be b, a, c. So than if a user wants a, b, c. We can expect him to write

aggreg.put(a);
aggreg.put(b);
aggreg.put(c);

right?

rubensworks commented 3 years ago

I think this will be required:

await aggreg.put(a);
await aggreg.put(b);
await aggreg.put(c);
jitsedesmet commented 3 years ago

Yes, of course, I forgot the awaits.

jitsedesmet commented 3 years ago

After further research about how JS works and finding out that it doesn't have the concurrency problems other languages have. I need to revise my previous message.

jitsedesmet commented 3 years ago

This issue has been handled in the pull requests. I will close it.