What were you doing?
I'm attempting to add triples to a TripleRush instance, and either the additions don't happen correctly or the queries fail to run correctly on the data. See the code below to reproduce the scenario.
What did you expect to happen?
I expect to count 25700 unique triples after loading university 0 of LUBM-1.
What actually happened?
The count is sometimes smaller than 25700.
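Since the failure is intermittent, it may help to quantify how often it occurs and how many triples go missing. The following is a minimal sketch of a tally helper; `CountTally`, `tally`, and `report` are hypothetical names and not part of TripleRush. It only assumes a function that performs one full load and returns the observed count.

```scala
// Hypothetical helper, not part of TripleRush: tallies the results of repeated runs.
object CountTally {

  /** Runs `loadAndCount` `runs` times and returns how often each count was observed. */
  def tally(runs: Int)(loadAndCount: () => Int): Map[Int, Int] = {
    val observed = Seq.fill(runs)(loadAndCount())
    observed.groupBy(identity).map { case (count, hits) => count -> hits.size }
  }

  /** Formats the tally in ascending order, marking every count that differs from `expected`. */
  def report(tallied: Map[Int, Int], expected: Int): Seq[String] =
    tallied.toSeq.sortBy(_._1).map {
      case (count, hits) =>
        val marker = if (count == expected) "ok" else s"missing ${expected - count}"
        s"$count triples: $hits run(s) [$marker]"
    }
}
```

For this issue, `loadAndCount` would presumably create a fresh TripleRush instance, load the file as in the reproduction code, return `tr.resultIteratorForQuery(Seq(TriplePattern(-1, -2, -3))).size`, and shut the instance down again, with `expected` set to 25700.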
What are possible reasons that might have caused this unexpected behavior?
Query execution does not correctly count the number of triples that are in the store
The index is sometimes not updated correctly during the edge/vertex additions triggered by a triple addition
Blocking triple additions sometimes return before the index has actually been updated
What possible causes are you investigating? Tip: prioritize causes that are easy to check and ones that are likely correct.
[ ] Query execution does not correctly count the number of triples that are in the store
I think this is really it:
When counting with println(tr.countVerticesByType), the number of PIndex vertices seems to match, but the number of retrieved bindings is too low (this is on the triple-addition-diagnostics branch).
I initially thought I could exclude this possibility, because the error only appeared with one kind of edge addition, and query execution is supposed to be the same no matter how a triple was added.
[ ] The index is sometimes not updated correctly during the edge/vertex additions triggered by a triple addition
[ ] Blocking triple additions sometimes return before the index has actually been updated
The error persists even if I wait for 10 seconds before the query is triggered. So either this is not the cause, the update takes longer than those extra 10 seconds, or some complex interaction with an unflushed message bus is involved.
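One way to tell "the update is merely slow" apart from "the update never arrives" might be to sample the count repeatedly instead of waiting once. This is a sketch under assumptions: `CountPoller` and `poll` are hypothetical names, not TripleRush API, and the count function is passed in by the caller.

```scala
import scala.annotation.tailrec

// Hypothetical helper, not part of TripleRush.
object CountPoller {

  /**
   * Re-evaluates `currentCount` every `intervalMs` milliseconds until it
   * returns `expected` or `timeoutMs` has elapsed. Returns all observed
   * (elapsed milliseconds, count) samples in the order they were taken.
   */
  def poll(expected: Int, timeoutMs: Long, intervalMs: Long)(currentCount: () => Int): Vector[(Long, Int)] = {
    val start = System.nanoTime()
    def elapsedMs: Long = (System.nanoTime() - start) / 1000000L

    @tailrec
    def loop(samples: Vector[(Long, Int)]): Vector[(Long, Int)] = {
      val sample = (elapsedMs, currentCount())
      val updated = samples :+ sample
      if (sample._2 == expected || elapsedMs >= timeoutMs) updated
      else {
        Thread.sleep(intervalMs)
        loop(updated)
      }
    }

    loop(Vector.empty)
  }
}
```

If the sampled count keeps growing toward 25700, additions are still being applied after the blocking call returned. If it stays flat below 25700 for the whole timeout, a delayed index update becomes a less likely explanation.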
Code to reproduce the issue:
package com.signalcollect.triplerush.loading

import java.io.File

import org.scalatest.{ Finders, FlatSpec }
import org.scalatest.concurrent.ScalaFutures

import com.signalcollect.triplerush.{ TriplePattern, TripleRush }
import com.signalcollect.triplerush.dictionary.HashDictionary
import com.signalcollect.util.TestAnnouncements

class BlockingAdditionsSpec extends FlatSpec with TestAnnouncements with ScalaFutures {

  "Blocking additions" should "correctly load triples from a file" in {
    //fastStart = true,
    val tr = TripleRush(dictionary = new HashDictionary())
    //val tr = TripleRush(dictionary = new ModularDictionary())
    try {
      val filePath = s".${File.separator}lubm${File.separator}university0_0.nt"
      println(s"Loading file $filePath ...")
      //tr.loadFromFile(filePath)
      val howMany = 25700
      //.take(howMany)
      tr.prepareExecution
      tr.addTriples(TripleIterator(filePath), blocking = true)
      tr.awaitIdle
      println(tr.dictionary)
      //tr.awaitIdle()
      //val count = tr.resultIteratorForQuery(Seq(TriplePattern(-1, -2, -3))).size
      //      assert(count == 25700)
      //      val countOptionFuture = tr.executeCountingQuery(Seq(TriplePattern(-1, -2, -3)))
      //      whenReady(countOptionFuture) { countOption =>
      //        assert(countOption == Some(howMany))
      //      }
      println(tr.resultIteratorForQuery(Seq(TriplePattern(-1, -2, -3))).size)
      println(tr.countVerticesByType)
      println(tr.edgesPerIndexType)
    } finally {
      tr.shutdown
    }
  }

}