semsol / arc2

ARC RDF Classes for PHP

Multiple LOAD statements - some data not returned in subsequent SELECT queries #122

Open cbinding opened 5 years ago

cbinding commented 5 years ago

After multiple LOAD statements, all of the data is loaded into the triple store, but subsequent SELECT queries return only a subset of it. To test:

File graph1.rdf:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
    <skos:Concept rdf:about="http://tempuri/1"/>
</rdf:RDF>

File graph2.rdf:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
    <skos:Concept rdf:about="http://tempuri/2"/>
</rdf:RDF>

Test:

$store = ARC2::getStore([...]);
$store->query ('LOAD <http://path/to/graph1.rdf>');
$store->query ('LOAD <http://path/to/graph2.rdf>');

# test1 - confirms the 2 triples from the loaded files have been imported
$rs = $store->query ('SELECT * WHERE { ?s ?p ?o }');
echo count($rs['result']['rows']) . " triples present\n";

# test2 - expected 2 results, but only 1 is returned
$rs = $store->query ('SELECT * WHERE { ?uri a <http://www.w3.org/2004/02/skos/core#Concept> }');
echo count($rs['result']['rows']);
# print_r($rs); shows that the single row is the item from the first loaded file (http://tempuri/1)

This issue is related to issue #114 - the fix is in ARC2_StoreLoadQueryHandler.php (line 228):

# old: 
if (false !== empty($binaryValue)) {
# new:
if (false == empty($binaryValue)) {

This code looks up the ID of a previously inserted value, but the old condition holds only when $binaryValue is empty, which inverts the intended check, so the ID is never found. URI values may therefore be inserted multiple times with different IDs, which leads to the inconsistencies observed: loaded data is present in the triple store but is not returned by subsequent queries.
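
To make the difference between the two conditions concrete, here is a minimal standalone sketch. The helper names oldCondition and newCondition are hypothetical and exist only for this illustration; they are not part of ARC2, and the sample values are assumptions. The sketch only evaluates the two expressions quoted above and does not touch the store.

```php
<?php
// Hypothetical helpers that isolate the two conditions from this issue.
// They are not ARC2 functions; they only wrap the quoted expressions.

function oldCondition($binaryValue): bool
{
    // empty() returns a bool, so this is true only when the value IS empty.
    return false !== empty($binaryValue);
}

function newCondition($binaryValue): bool
{
    // True only when the value is NOT empty, i.e. there is something to look up.
    return false == empty($binaryValue);
}

// Compare both conditions for an empty value and for a sample URI.
foreach (['', 'http://tempuri/1'] as $value) {
    printf(
        "value=%s old=%s new=%s\n",
        var_export($value, true),
        var_export(oldCondition($value), true),
        var_export(newCondition($value), true)
    );
}
```

For a non-empty value such as a URI, the old expression evaluates to false and the new one to true, so only the new condition lets the ID lookup proceed. The new condition is equivalent to the more common spelling !empty($binaryValue).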