Open Yotlan opened 1 year ago
To avoid this error, we can simply add another condition to the following conditional statement:
elif not(lowSelectivityLeft) and lowSelectivityRight and (not(isinstance(l, TreePlan)) or not(l.operator.__class__.__name__ == "NestedHashJoinFilter" )) and (not(isinstance(r,TreePlan)) or not(r.operator.__class__.__name__ == "Xgjoin" or r.operator.__class__.__name__ == "NestedHashJoinFilter")):
For example, the HashJoin operator should not enter this branch, so we can add these conditions:
not(l.operator.__class__.__name__ == "HashJoin") and not(r.operator.__class__.__name__ == "HashJoin")
These checks apply to the left and right operators respectively. In the case of HashJoin, once these conditions are added to the conditional statement, all the queries that previously returned an AttributeError time out instead (because ANAPSID now does its work, constructing all the joins and merging the intermediate results).
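The class-name checks can also be collected in one helper instead of being repeated inline. This is only a minimal sketch under stated assumptions: the classes below are empty stubs standing in for ANAPSID's real ones, `operator_name` and `passes_operator_checks` are hypothetical names, and the selectivity flags are left out.

```python
class HashJoin:               # stub standing in for the ANAPSID operator
    pass

class Xgjoin:                 # stub
    pass

class NestedHashJoinFilter:   # stub
    pass

class TreePlan:               # stub: only carries the physical operator
    def __init__(self, operator):
        self.operator = operator

def operator_name(node):
    """Class name of node.operator, or None when node has no operator."""
    op = getattr(node, "operator", None)
    return None if op is None else op.__class__.__name__

def passes_operator_checks(l, r):
    """Hypothetical guard: reject the operators excluded from this branch."""
    blocked_left = {"NestedHashJoinFilter", "HashJoin"}
    blocked_right = {"Xgjoin", "NestedHashJoinFilter", "HashJoin"}
    return (operator_name(l) not in blocked_left
            and operator_name(r) not in blocked_right)
```

Unlike the inline condition, this version looks at any node that exposes an `operator` attribute rather than testing `isinstance(..., TreePlan)` first, so it is an approximation of the original check, not a drop-in replacement.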
To avoid the AttributeError related to IndependantOperator, it's important to note that IndependantOperator has no operator member. So we need to add a condition to the first conditional statement, which is currently:
if not(lowSelectivityLeft) and lowSelectivityRight and not(isinstance(r, TreePlan)):
The following condition:
not(l.__class__.__name__ == "IndependantOperator") and not(r.__class__.__name__ == "IndependantOperator")
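Why the class-name test has to come before any `.operator` access can be shown with a small sketch. The classes here are stubs written for illustration (not ANAPSID's real ones), and `is_independent` and `safe_operator_name` are hypothetical helpers.

```python
class IndependantOperator:    # stub: deliberately has no `operator` attribute
    pass

class HashJoin:               # stub
    pass

class TreePlan:               # stub: only carries the physical operator
    def __init__(self, operator):
        self.operator = operator

def is_independent(node):
    """Same style of check as in the thread: compare the class name."""
    return node.__class__.__name__ == "IndependantOperator"

def safe_operator_name(node):
    """Test the node type first, so `.operator` is never read on a leaf."""
    if is_independent(node):
        return None
    return node.operator.__class__.__name__

leaf = IndependantOperator()
try:
    leaf.operator             # unguarded access, as in the failing branch
    unguarded_failed = False
except AttributeError:
    unguarded_failed = True
```

Because `and` short-circuits in Python, putting the `IndependantOperator` test first in the condition means the later `l.operator` and `r.operator` comparisons are skipped for such nodes.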
In this query:
SELECT DISTINCT ?property ?hasValue ?isValueOf WHERE {
<http://www.vendor6.fr/Offer886> <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/product> ?product .
{ <http://www.vendor6.fr/Offer886> ?property ?hasValue }
UNION
{ ?isValueOf ?property <http://www.vendor6.fr/Offer886> }
}
the first triple pattern was a by-product of the query instantiation process and should not be there after the ?offer variable has been injected. This triple asks for all ?product values that are offered by Offer886, and since ?product is not a join variable, it will produce a Cartesian product with the other tps.
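To illustrate why a non-join variable leads to a Cartesian product, here is a toy sketch. The `join` function is a naive nested-loop join written for this example (it is not how ANAPSID joins), and the bindings are made-up values.

```python
from itertools import product

def join(left, right):
    """Naive nested-loop join over two lists of variable bindings."""
    out = []
    for a, b in product(left, right):
        shared = a.keys() & b.keys()
        # With no shared variables this test is vacuously true for every pair.
        if all(a[v] == b[v] for v in shared):
            out.append({**a, **b})
    return out

# Toy bindings: ?product shares no variable with ?property / ?hasValue.
products = [{"product": "P1"}, {"product": "P2"}]
props = [
    {"property": "p", "hasValue": "x"},
    {"property": "q", "hasValue": "y"},
    {"property": "r", "hasValue": "z"},
]

rows = join(products, props)
print(len(rows))  # 6, i.e. 2 x 3: every pair is kept
```

When the two sides do share a variable, the same function keeps only the pairs that agree on it.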
Can you try removing this tp from your test query and see if it still works?
When we launch the following query (q11 without the first triple pattern), it works:
SELECT DISTINCT ?property ?hasValue ?isValueOf WHERE {
{ <http://www.vendor6.fr/Offer886> ?property ?hasValue }
UNION
{ ?isValueOf ?property <http://www.vendor6.fr/Offer886> }
}
It gives the following results:
{'property': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'hasValue': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/Offer', 'isValueOf': ''}
{'property': 'http://www.w3.org/2002/07/owl#sameAs', 'hasValue': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/Offer886', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/deliveryDays', 'hasValue': '4^^<http://www.w3.org/2001/XMLSchema#integer>', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/offerWebpage', 'hasValue': "entomology Heliopolis comportment's rosebushes twentieth's Reba Americanization's poetesses Shintos", 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/price', 'hasValue': '4065.84^^<http://www.w3.org/2001/XMLSchema#double>', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/product', 'hasValue': 'http://www.vendor6.fr/Product55489', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/publishDate', 'hasValue': '2008-02-24T00:00:00^^<http://www.w3.org/2001/XMLSchema#dateTime>', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/validFrom', 'hasValue': '2008-02-20T00:00:00^^<http://www.w3.org/2001/XMLSchema#dateTime>', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/validTo', 'hasValue': '2008-05-31T00:00:00^^<http://www.w3.org/2001/XMLSchema#dateTime>', 'isValueOf': ''}
{'property': 'http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/vendor', 'hasValue': 'http://www.vendor6.fr/Vendor0', 'isValueOf': ''}
I have another hypothesis: other engines returned results for this query, so I think this also reveals a weakness in ANAPSID, namely that it cannot handle Cartesian products.
Could you test with this query?
SELECT * WHERE {
<http://www.vendor6.fr/Offer886> <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/product> ?product1 .
<http://www.vendor6.fr/Offer886> <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/product> ?product2 .
}
This query returns the following result:
{'product1': 'http://www.vendor6.fr/Product55489', 'product2': 'http://www.vendor6.fr/Product55489'}
The Cartesian product is not the problem: I tried removing the UNION structure and it works. You can run the rest of the workload without q11 for now.
What do we want?
Sometimes ANAPSID fails with an AttributeError when we launch queries such as q11, and we want to avoid this error.
What happens?
As mentioned above, when the plan contains an unwanted operator, ANAPSID fails with an AttributeError.
Where?
In the method includePhysicalOperatorJoin in Plan.py, there is the following condition:
In the second elif of this conditional statement, we can have operators such as HashJoin. But HashJoin has no vars_left and should not enter this branch; it belongs in the last case. Moreover, we can have an IndependantOperator, which should never enter this branch because ANAPSID handles its case after this large conditional statement.
How to reproduce?
You can launch any q11 query instance, for example the following query:
And you'll see the AttributeError.