Closed syphax-bouazzouni closed 2 months ago
Hi @mdorf, I think you are the one working on this part of the project. Could you please give me your opinion about this?
Hi @mdorf, I think you are the one working on this part of the project. Could you please give me your opinion about this?
@syphax-bouazzouni, the solution appears to be sound based on the description. But, honestly, it's a bit difficult for me to provide a constructive feedback based on the number of commits. I need to understand how the original issue affected our system and how the solution addresses it. I wonder if this is a good topic to review during our face-to-face workshop in Montpellier in September.
Also, we need to make sure that ncbo/ontologies_linked_data and ncbo/ontologies_api tests pass with AllegroGraph with this change in place.
Also, we need to make sure that ncbo/ontologies_linked_data and ncbo/ontologies_api tests pass with AllegroGraph with this change in place.
I do agree in principle to this. However, I don't think this should be a stopper (or slower) for not taking the change for 4store. This issue (of the cartesian product) has been raised 5 years ago with @vemonet and we have here the opportunity to fix this.
Basically we expect this to fix the call : https://data.bioontology.org/submissions and possibly even https://data.bioontology.org/submissions?display=all
@syphax-bouazzouni, the solution appears to be sound based on the description. But, honestly, it's a bit difficult for me to provide a constructive feedback based on the number of commits. I need to understand how the original issue affected our system and how the solution addresses it. I wonder if this is a good topic to review during our face-to-face workshop in Montpellier in September.
Yeah, I understand. For this kind of matter, I am totally available to do specific meetings to speak about it. But yes we could also speak about it during the Ontoportal meeting, even though during that meeting I would prefer personally to speak more in a larger overview of our state and the improvements that we can do.
Also, we need to make sure that ncbo/ontologies_linked_data and ncbo/ontologies_api tests pass with AllegroGraph with this change in place.
For AllegroGraph, we have not yet set up an environment to test it, but naively I will say that it should also work for it because its basic SPARQL queries that didn't change a lot from before.
I do agree in principle to this. However, I don't think this should be a stopper (or slower) for not taking the change for 4store. This issue (of the cartesian product) has been raised 5 years ago with @vemonet and we have here the opportunity to fix this.
@jonquet for now the code for 4store and AllegroGraph are the same, so we can't isolate this improvement for only 4store
Cc @vemonet with who we originally found the bug! And congrats to @syphax-bouazzouni for such an investigation and fix.
Wow nice to see it finally got resolved for everyone! Good job @syphax-bouazzouni, that was a consequent task
Preamble
This PR includes this
And require this
Starting issues
The generated SPARQL query to fetch data is not optimized because in the select clause we print all the attributes which causes the result to have a lot of rows (the number of rows is equal to the multiplication of the number of results of each attribute)
For example, if we have for a resource (e.g a Class) 3 attributes (e.g prefLabel, synonyms, definitions) with the following values: prefLabel = prefLabel synonyms = [syn_1, syn_2, syn_3] definitions = [def_1, def_2]
We will get the following request results :
| prefLabel | syn_1 | def_1 | | prefLabel | syn_2 | def_1 | | prefLabel | syn_3 | def_1 | | prefLabel | syn_1 | def_2 | | prefLabel | syn_2 | def_2 | | prefLabel | syn_3 | def_2 |
With this new implementation, we will have the following results prefLabel syn_1 syn_2 syn_3 def_1 def_2
Solution
We will transform all the requests to
And for inverse attributes, we will use the following request
Solution issue
Theoretically, it should work, but the goo test_inverse.rb doesn’t pass because of a bug of 4store, where the ?attributeproperty was always returning
resource not found
To reproduce the bug
Alternative and implemented solution for now
We will use UNION with binds for each attribute of a resource, as so
Include/general query
Aggregate query
unchanged
BNODE query
unchanged
FILTER query (
test/test_where#test_filter
)ORDER BY query (TODO)
unmapped query (
test/test_where#test_filter
)before
new it is the same as the general/includes a query
Equivalent predicates case
see `def self.expand_equivalent_predicates(query, eq_p)`
Before the filter was in the OPTIONAL clause
Now we add a UNION clause for each equivalent predicate
Use case
With the use case of AGRVOC (2GB of triples) :