hobbit-project / platform

HOBBIT benchmarking platform
GNU General Public License v2.0

Debugging result models #255

Open smirnp opened 6 years ago

smirnp commented 6 years ago

Hi!

Is there a fast way to check the correctness of the result model for a particular benchmark? I keep getting the same error (the query is shown below), and the only way I have to check it is through manual runs in the GUI. Could the check be done via unit tests or in some other way? Thanks!

2018-04-03 13:13:15,121 INFO [org.hobbit.storage.service.StorageService] - <Received a request to call the SPARQL Endpoint at http://vos:8890/sparql-auth and execute the following query:
PREFIX  hobbit: <http://w3id.org/hobbit/vocab#>
PREFIX  ex:   <http://example.org/>
PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX  owl:  <http://www.w3.org/2002/07/owl#>
PREFIX  gerbil: <http://w3id.org/gerbil/vocab#>
PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
WITH <http://hobbit.org/graphs/PublicResults>
INSERT {
   <http://project-hobbit.eu/sml-benchmark-v2/meanErrorMin> rdf:type hobbit:KPI .
   <http://project-hobbit.eu/sml-benchmark-v2/meanErrorMin> rdfs:label "Mean error, minutes (Q2.A)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/meanErrorMin> rdfs:range xsd:double .
   <http://project-hobbit.eu/sml-benchmark-v2/totalRank> rdf:type hobbit:KPI .
   <http://project-hobbit.eu/sml-benchmark-v2/totalRank> rdfs:label "Total rank"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/totalRank> rdfs:range xsd:double .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorLimit> rdf:type hobbit:ConfigurableParameter .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorLimit> rdfs:label "Tuples limit (0 is unlimited)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorLimit> rdfs:range xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorLimit> hobbit:defaultValue "0"^^xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/system-adapter2> rdf:type hobbit:SystemInstance .
   <http://project-hobbit.eu/sml-benchmark-v2/system-adapter2> rdfs:label "Dummy system2 for SMLv2 (DEBS 2018) benchmark"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/system-adapter2> rdfs:comment "Dummy system2 for SMLv2 (DEBS 2018) benchmark (http://project-hobbit.eu/sml-benchmark-v2/system-adapter2 smirnp/sml-benchmark-v2/system-adapter)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/system-adapter2> hobbit:implementsAPI <http://project-hobbit.eu/sml-benchmark-v2/API> .
   _:b0 rdf:type hobbit:Experiment .
   _:b0 <http://project-hobbit.eu/sml-benchmark-v2/averageEarlynessRate> 0.0E0 .
   _:b0 <http://project-hobbit.eu/sml-benchmark-v2/averageLatencyMs> "7"^^xsd:long .
   _:b0 <http://project-hobbit.eu/sml-benchmark-v2/evaluatedPairsCount> "49959"^^xsd:int .
   _:b0 <http://project-hobbit.eu/sml-benchmark-v2/meanErrorMin> -1.0E0 .
   _:b0 <http://project-hobbit.eu/sml-benchmark-v2/systemWorkingTimeSeconds> 2.0E1 .
   _:b0 hobbit:involvesBenchmark "http://example.com/benchmark1" .
   _:b0 hobbit:involvesSystemInstance "http://example.com/system1" .
   <http://w3id.org/hobbit/experiments#1522761033233> rdf:type hobbit:Experiment .
   <http://w3id.org/hobbit/experiments#1522761033233> <http://project-hobbit.eu/sml-benchmark-v2/generatorLimit> "50000"^^xsd:int .
   <http://w3id.org/hobbit/experiments#1522761033233> <http://project-hobbit.eu/sml-benchmark-v2/generatorTimeoutMin> "10"^^xsd:int .
   <http://w3id.org/hobbit/experiments#1522761033233> <http://project-hobbit.eu/sml-benchmark-v2/queryType> "1"^^xsd:int .
   <http://w3id.org/hobbit/experiments#1522761033233> hobbit:involvesBenchmark <http://project-hobbit.eu/sml-benchmark-v2/benchmark> .
   <http://w3id.org/hobbit/experiments#1522761033233> hobbit:involvesSystemInstance <http://project-hobbit.eu/sml-benchmark-v2/system-adapter2> .
   <http://w3id.org/hobbit/experiments#1522761033233> hobbit:hobbitPlatformVersion "2.0.2"@en .
   <http://w3id.org/hobbit/experiments#1522761033233> hobbit:startTime "2018-04-03T13:10:39.851Z"^^xsd:dateTime .
   _:b1 rdf:type hobbit:Experiment .
   <http://project-hobbit.eu/sml-benchmark-v2/averageLatencyMs> rdf:type hobbit:KPI .
   <http://project-hobbit.eu/sml-benchmark-v2/averageLatencyMs> rdfs:label "Average latency, ms"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/averageLatencyMs> rdfs:range xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/averageEarlynessRate> rdf:type hobbit:KPI .
   <http://project-hobbit.eu/sml-benchmark-v2/averageEarlynessRate> rdfs:label "Average earlyness rate (Q1.A)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/averageEarlynessRate> rdfs:range xsd:double .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:measuresKPI <http://project-hobbit.eu/sml-benchmark-v2/averageLatencyMs> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> rdfs:label "DEBS GC 2018 Benchmark"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> rdf:type hobbit:Benchmark .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:hasParameter <http://project-hobbit.eu/sml-benchmark-v2/generatorLimit> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:hasParameter <http://project-hobbit.eu/sml-benchmark-v2/generatorTimeoutMin> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:measuresKPI <http://project-hobbit.eu/sml-benchmark-v2/evaluatedPairsCount> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:measuresKPI <http://project-hobbit.eu/sml-benchmark-v2/averageEarlynessRate> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:measuresKPI <http://project-hobbit.eu/sml-benchmark-v2/meanErrorMin> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:version "v1.0"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:hasAPI <http://project-hobbit.eu/sml-benchmark-v2/API> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:hasParameter <http://project-hobbit.eu/sml-benchmark-v2/queryType> .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> rdfs:comment "Stream machine learning benchmark v2 for the DEBS GC 2018--"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/benchmark> hobbit:measuresKPI <http://project-hobbit.eu/sml-benchmark-v2/systemWorkingTimeSeconds> .
   <http://project-hobbit.eu/sml-benchmark-v2/queryType> rdf:type hobbit:ConfigurableParameter .
   <http://project-hobbit.eu/sml-benchmark-v2/queryType> rdf:type hobbit:ForwardedParameter .
   <http://project-hobbit.eu/sml-benchmark-v2/queryType> rdfs:label "QueryType (1 - arrival port names, 2 - port names,arrival timestamps)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/queryType> rdfs:label "Query type (1 - arrival port names, 2 - port names,arrival timestamps)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/queryType> rdfs:range xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/queryType> hobbit:defaultValue "1"^^xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorTimeoutMin> rdf:type hobbit:ConfigurableParameter .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorTimeoutMin> rdfs:label "Generator timeout, min (0 is unlimited)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorTimeoutMin> rdfs:range xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/generatorTimeoutMin> hobbit:defaultValue "0"^^xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/evaluatedPairsCount> rdf:type hobbit:KPI .
   <http://project-hobbit.eu/sml-benchmark-v2/evaluatedPairsCount> rdfs:label "Pairs evaluated"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/evaluatedPairsCount> rdfs:range xsd:int .
   <http://project-hobbit.eu/sml-benchmark-v2/systemWorkingTimeSeconds> rdf:type hobbit:KPI .
   <http://project-hobbit.eu/sml-benchmark-v2/systemWorkingTimeSeconds> rdfs:label "System working time, seconds (B)"@en .
   <http://project-hobbit.eu/sml-benchmark-v2/systemWorkingTimeSeconds> rdfs:range xsd:int .
 } WHERE   {} >

MichaelRoeder commented 6 years ago

No, there is no real way to check a result model. What exactly do you want to check?

By the way, the model you provided above seems to be faulty. It comprises 3 experiments: http://w3id.org/hobbit/experiments#1522761033233, _:b0 and _:b1. The latter two are blank nodes and should not be used for experiments. I assume that the piece of code that stores the KPI values in the result model is not correct and generates these blank nodes instead of using the experiment resource. Do you have that piece of code available on GitHub?
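A quick way to spot this kind of problem locally is to look for blank nodes typed as hobbit:Experiment. The following is only a minimal sketch: it assumes the result model has already been read into plain (subject, predicate, object) string tuples, with blank nodes written as `_:label`, and the function name `blank_node_experiments` is hypothetical rather than part of the HOBBIT SDK.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
HOBBIT_EXPERIMENT = "http://w3id.org/hobbit/vocab#Experiment"


def blank_node_experiments(triples):
    """Return the blank-node subjects that are typed as hobbit:Experiment."""
    return sorted({
        s for s, p, o in triples
        if p == RDF_TYPE and o == HOBBIT_EXPERIMENT and s.startswith("_:")
    })


# The three experiment declarations taken from the logged query above.
triples = [
    ("_:b0", RDF_TYPE, HOBBIT_EXPERIMENT),
    ("_:b1", RDF_TYPE, HOBBIT_EXPERIMENT),
    ("http://w3id.org/hobbit/experiments#1522761033233", RDF_TYPE, HOBBIT_EXPERIMENT),
]

print(blank_node_experiments(triples))  # ['_:b0', '_:b1']
```

An empty result means every experiment in the model is a named resource.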

smirnp commented 6 years ago

Thank you for the help! I have found the cause of my blank nodes: the experiment URI was missing in the EvalModule, which I call directly from the EvalStorage (to avoid an unnecessary stream) without calling its init() method.

I wanted to find a way to automatically validate the result model against the benchmark model described in benchmark.ttl. While the SDK solves most of the development problems locally, the final integration after uploading to the platform (validity of the result model, its compatibility with benchmark.ttl) is still a pain, even for me :)

MichaelRoeder commented 6 years ago

The only validation that I can think of is the following:

  1. Execute the benchmark in the SDK using a random HOBBIT ID and receive the result model from the benchmark
  2. Load the KPIs and parameters from the benchmark.ttl file
  3. Generate the experiment URI based on the random HOBBIT ID
  4. Check that for every KPI k and the experiment resource e, there is at least one triple e k o, where o is of the type defined for k.
  5. Check that the resource e has no additional triples attached that do not represent a KPI or a parameter.
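Steps 3 to 5 could be sketched roughly as follows. This is not an SDK feature, only an illustration under several assumptions: the result model is given as (subject, predicate, value, datatype) tuples, the KPI ranges and parameters have already been extracted from benchmark.ttl, the experiment URI is formed by appending the HOBBIT ID to the namespace seen in the logged query, and the function `validate_result_model`, its `allowed` argument and the `undeclaredProperty` sample are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
# Assumption: namespace taken from the experiment URI in the logged query.
EXPERIMENT_NS = "http://w3id.org/hobbit/experiments#"


def validate_result_model(triples, hobbit_id, kpi_ranges, parameters, allowed=()):
    """Return a list of problems found for the experiment resource.

    triples:    iterable of (subject, predicate, value, datatype) tuples
    kpi_ranges: dict mapping each KPI URI to its expected datatype URI
    parameters: set of parameter URIs declared in benchmark.ttl
    allowed:    extra predicates (e.g. rdf:type) that should not be reported
    """
    experiment = EXPERIMENT_NS + hobbit_id  # step 3
    attached = [(p, dt) for s, p, _, dt in triples if s == experiment]
    problems = []
    # Step 4: every KPI needs at least one value of the declared type.
    for kpi, expected in kpi_ranges.items():
        datatypes = [dt for p, dt in attached if p == kpi]
        if not datatypes:
            problems.append(f"missing KPI: {kpi}")
        elif expected not in datatypes:
            problems.append(
                f"wrong type for {kpi}: expected {expected}, got {datatypes[0]}"
            )
    # Step 5: nothing else may be attached to the experiment resource.
    known = set(kpi_ranges) | set(parameters) | set(allowed)
    for p in sorted({p for p, _ in attached} - known):
        problems.append(f"unexpected property: {p}")
    return problems


BASE = "http://project-hobbit.eu/sml-benchmark-v2/"
exp = EXPERIMENT_NS + "1522761033233"
sample = [
    (exp, BASE + "averageLatencyMs", "7", XSD + "long"),
    (exp, BASE + "generatorLimit", "50000", XSD + "int"),
    (exp, BASE + "undeclaredProperty", "x", XSD + "string"),  # made up
]
problems = validate_result_model(
    sample,
    "1522761033233",
    kpi_ranges={
        BASE + "averageLatencyMs": XSD + "int",
        BASE + "meanErrorMin": XSD + "double",
    },
    parameters={BASE + "generatorLimit"},
)
for problem in problems:
    print(problem)
```

For this sample the check reports three problems: a type mismatch for averageLatencyMs (xsd:long instead of xsd:int), a missing meanErrorMin value, and the undeclared property. In a real run, properties the platform adds itself (such as rdf:type or hobbit:startTime) would need to be passed via `allowed`.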

What do you think?