GDD-Nantes / FedShop

Code for FedShop: The Federated Shop Benchmark
GNU General Public License v3.0

Join Order Optimization for q05 #18

mhoangvslev closed this issue 1 year ago

mhoangvslev commented 1 year ago

Given the query below:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?product ?localProductLabel
WHERE { 
    ?localProduct rdfs:label ?localProductLabel .

    ?localProduct bsbm:productFeature ?localProdFeature .
    ?localProduct bsbm:productPropertyNumeric1 ?simProperty1 .
    ?localProduct bsbm:productPropertyNumeric2 ?simProperty2 .    

    ?localProduct owl:sameAs ?product .
    ?localProdFeature owl:sameAs ?prodFeature .

    ?localProductXYZ bsbm:productFeature ?localProdFeatureXYZ .
    ?localProductXYZ bsbm:productPropertyNumeric1 ?origProperty1 .
    ?localProductXYZ bsbm:productPropertyNumeric2 ?origProperty2 .

    # const ?ProductXYZ
    ?localProductXYZ owl:sameAs ?ProductXYZ .
    ?localProdFeatureXYZ owl:sameAs ?prodFeature .

    FILTER(?ProductXYZ != ?product)
    # Values are pre-determined because we knew the boundaries from the normal distribution
    FILTER(?simProperty1 < (?origProperty1 + 20) && ?simProperty1 > (?origProperty1 - 20))
    FILTER(?simProperty2 < (?origProperty2 + 70) && ?simProperty2 > (?origProperty2 - 70))
}
ORDER BY ?localProductLabel
LIMIT 5

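This query takes far too long to evaluate, even on batch 0. The patterns on ?localProduct and those on ?localProductXYZ are connected only through ?prodFeature and the FILTERs, so a poor join order builds a very large intermediate result before anything selective is applied. Some join order optimization is needed here.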
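To illustrate why the order matters, here is a toy sketch in Python (not FedShop code). It evaluates a simplified q05 with a naive left-to-right nested-loop join over made-up triples and records how many intermediate bindings exist after each pattern. The data, the helper names, and the assumption that ?ProductXYZ is replaced by a constant (here global:p0) are illustrative only.

```python
# Toy data (hypothetical): 30 products sharing 5 features, each with a global alias.
def build_triples(n_products=30, n_features=5):
    triples = []
    for i in range(n_products):
        local = f"local:p{i}"
        triples.append((local, "owl:sameAs", f"global:p{i}"))
        triples.append((local, "bsbm:productFeature", f"local:f{i % n_features}"))
    for j in range(n_features):
        triples.append((f"local:f{j}", "owl:sameAs", f"global:f{j}"))
    return triples


def match(pattern, triple, binding):
    """Return the binding extended so that pattern matches triple, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it against the existing binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def evaluate(patterns, triples):
    """Join patterns strictly left to right; return (solutions, sizes per step)."""
    bindings, sizes = [{}], []
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
        sizes.append(len(bindings))
    return bindings, sizes


TRIPLES = build_triples()

# Same relative order as the original query: candidate product first, and the
# reference product's feature pattern before its constant-bound sameAs pattern.
as_written = [
    ("?lp", "bsbm:productFeature", "?lf"),
    ("?lp", "owl:sameAs", "?product"),
    ("?lf", "owl:sameAs", "?feature"),
    ("?lpx", "bsbm:productFeature", "?lfx"),
    ("?lpx", "owl:sameAs", "global:p0"),
    ("?lfx", "owl:sameAs", "?feature"),
]

# Reordered: start from the pattern that contains the constant.
anchored = [
    ("?lpx", "owl:sameAs", "global:p0"),
    ("?lpx", "bsbm:productFeature", "?lfx"),
    ("?lfx", "owl:sameAs", "?feature"),
    ("?lf", "owl:sameAs", "?feature"),
    ("?lp", "bsbm:productFeature", "?lf"),
    ("?lp", "owl:sameAs", "?product"),
]

sol_written, sizes_written = evaluate(as_written, TRIPLES)
sol_anchored, sizes_anchored = evaluate(anchored, TRIPLES)
print(sizes_written)   # [30, 30, 30, 900, 30, 6]
print(sizes_anchored)  # [1, 1, 1, 1, 6, 6]
```

Both orders return the same 6 solutions, but the first one passes through a 900-row cross product while the second never holds more than 6 bindings. On real data the gap grows with the number of products, which is consistent with the behaviour reported above.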

mhoangvslev commented 1 year ago

TODO: optimised queries

Known strategies that work:

q05: The consumer has found a product that fulfills his requirements. He now wants to find products with similar features.

DEFINE sql:select-option "order"

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?product ?localProductLabel
WHERE {
    {
        # const ?ProductXYZ
        ?localProductXYZ owl:sameAs ?ProductXYZ  .
        ?localProductXYZ bsbm:productFeature ?localProdFeatureXYZ . 
        ?localProdFeatureXYZ owl:sameAs ?prodFeature .
        ?localProductXYZ bsbm:productPropertyNumeric1 ?origProperty1  .
        ?localProductXYZ bsbm:productPropertyNumeric2 ?origProperty2  .
    } .

    {
        ?localProduct owl:sameAs ?product  .
        ?localProduct rdfs:label ?localProductLabel  .
        ?localProduct bsbm:productFeature ?localProdFeature  .
        ?localProdFeature owl:sameAs ?prodFeature .
        ?localProduct bsbm:productPropertyNumeric1 ?simProperty1  .
        ?localProduct bsbm:productPropertyNumeric2 ?simProperty2  .
    } .

    # Applied at the outer level: ?ProductXYZ is bound by the first group and,
    # under standard SPARQL scoping, is not visible to a FILTER inside the second group
    FILTER (?ProductXYZ != ?product)
    # Values are pre-determined because we knew the boundaries from the normal distribution
    FILTER(?simProperty1 < (?origProperty1 + 20) && ?simProperty1 > (?origProperty1 - 20))
    FILTER(?simProperty2 < (?origProperty2 + 70) && ?simProperty2 > (?origProperty2 - 70))

}
# ORDER BY ?localProductLabel
LIMIT 5
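
The DEFINE sql:select-option "order" line is a Virtuoso-specific pragma that asks the engine to join the patterns in the order they are written, so other engines are not expected to understand it. Below is a minimal Python sketch of adding it programmatically before a query is sent. The function names and the endpoint URL are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Virtuoso-only pragma: keep the join order as written in the query text.
VIRTUOSO_PRAGMA = 'DEFINE sql:select-option "order"'


def pin_join_order(query: str) -> str:
    """Prepend the pragma unless the query already starts with it."""
    stripped = query.lstrip()
    if stripped.startswith(VIRTUOSO_PRAGMA):
        return stripped
    return f"{VIRTUOSO_PRAGMA}\n{stripped}"


def build_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL protocol POST request."""
    body = urlencode({"query": pin_join_order(query)}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
    )


# Hypothetical endpoint; adjust to the actual Virtuoso instance.
request = build_request(
    "http://localhost:8890/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
)
pinned = pin_join_order("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
```

Passing the result to urllib.request.urlopen would send it. Because the pragma pins the order, the triple patterns have to be written from most to least selective by hand, as in the optimised q05 above.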