AKSW / Erfurt

PHP5 / Zend based Semantic Web API for Social Semantic Software

Implement Stardog as Backend Adapter #122

Closed simeonackermann closed 8 years ago

simeonackermann commented 8 years ago

This implements Stardog as a new backend adapter, originally inspired by @Matthimatiker in #86.

It uses Guzzle 5.3 as the HTTP client to talk to Stardog via its HTTP interface.

All operations used with Virtuoso (querying the store, adding and removing statements) should be supported. If versioning is enabled (the default), an additional SQL store (ZendDB) is required.

The implemented integration test (make test-integration-stardog) completes successfully.

Stardog can be installed as a Docker container from https://github.com/Dockerizing/Stardog

A config.ini for Erfurt with Stardog should look like this:

store.backend = stardog
versioning = true

; Stardog backend
store.stardog.base_url   = http://localhost:5820
store.stardog.username   = admin
store.stardog.password   = admin
store.stardog.database   = erfurt

; ZendDB
store.zenddb.dbname   = erfurt
store.zenddb.username = root
store.zenddb.password = password
store.zenddb.dbtype   = mysql
store.zenddb.host     = localhost
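
As a rough illustration of how these dotted keys group together, here is a minimal PHP sketch. It uses PHP's built-in parse_ini_string plus a hypothetical helper, expandDottedKeys, which is not part of Erfurt. Erfurt reads its configuration through Zend's config objects, so treat this only as an approximation of the resulting structure:

```php
<?php
// Hypothetical helper (not part of Erfurt): expands flat dotted INI keys
// such as "store.stardog.database" into nested arrays.
function expandDottedKeys(array $flat)
{
    $nested = array();
    foreach ($flat as $key => $value) {
        $ref = &$nested;
        foreach (explode('.', $key) as $part) {
            if (!isset($ref[$part]) || !is_array($ref[$part])) {
                $ref[$part] = array();
            }
            $ref = &$ref[$part];
        }
        $ref = $value; // overwrite the placeholder with the actual value
        unset($ref);   // break the reference before the next key
    }
    return $nested;
}

$ini = <<<'INI'
store.backend = stardog
versioning = true

; Stardog backend
store.stardog.base_url = http://localhost:5820
store.stardog.username = admin
store.stardog.password = admin
store.stardog.database = erfurt
INI;

// parse_ini_string() returns the dotted keys as flat strings.
$config  = expandDottedKeys(parse_ini_string($ini));
$stardog = $config['store']['stardog'];

echo $stardog['base_url'], '/', $stardog['database'], PHP_EOL;
```

The point is only that everything under store.stardog ends up as one array, which is what an adapter would typically receive as its options.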

k00ni commented 8 years ago

The Guzzle installation seems to fail on Travis. Can you please look into that, @simeonackermann?

simeonackermann commented 8 years ago

Guzzle requires PHP >= 5.5.0, but Travis runs the build with PHP 5.4. Could the Travis configuration be updated?

k00ni commented 8 years ago

Maybe we could use Guzzle 5.3.1 for now? It supports PHP 5.4.

simeonackermann commented 8 years ago

Unfortunately it's not compatible with the current implementation; I have to change some methods first...

simeonackermann commented 8 years ago

Could you please recheck Travis?

white-gecko commented 8 years ago

Travis didn't build this pull request; maybe you have to resolve the merge conflict first, for instance by rebasing on the current develop branch.

simeonackermann commented 8 years ago

OK, thanks for the hint. Travis now builds successfully :)

k00ni commented 8 years ago

@simeonackermann: Please fix the conflicts (again) so that we can merge your code.

pfrischmuth commented 8 years ago

Can you please rebase your commits on the current develop branch instead of merging it in? At least we want to get rid of the "Merge..." commits, but maybe you can also rework your commits into logical chunks?

Use git rebase -i develop for this purpose (assuming that your local develop branch is up to date with the AKSW/Erfurt develop branch).

pfrischmuth commented 8 years ago

Please take the following actions:

Required for merge

Optional

k00ni commented 8 years ago

@simeonackermann: Please fix the points mentioned by @pfrischmuth before you start researching the elevator task.

simeonackermann commented 8 years ago

@pfrischmuth: The integration test is mainly taken from the Virtuoso integration test, and PHP_CodeSniffer flags overly long lines such as:

$value    = <<<EOT
Over the past 3 years, the semantic web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges in the area of intelligent information management: the exploitation of the Web as a platform for data and information integration in addition to document search. To translate this initial success into a world-scale disruptive reality, encompassing the Web 2.0 world and enterprise data alike, the following research challenges need to be addressed: improve coherence and quality of data published on the Web, close the performance gap between relational and RDF data management, establish trust on the Linked Data Web and generally lower the entrance barrier for data publishers and users. With partners among those who initiated and strongly supported the Linked Open Data initiative, the LOD2 project aims at tackling these challenges by developing:
<ol>
<li>enterprise-ready tools and methodologies for exposing and managing very large amounts of structured information on the Data Web,</li>
<li>a testbed and bootstrap network of high-quality multi-domain, multi-lingual ontologies from sources such as Wikipedia and OpenStreetMap.</li>
<li>algorithms based on machine learning for automatically interlinking and fusing data from the Web.</li>
<li>standards and methods for reliably tracking provenance, ensuring privacy and data security as well as for assessing the quality of information.</li>
<li>adaptive tools for searching, browsing, and authoring of Linked Data.</li>
</ol>
We will integrate and syndicate linked data with large-scale, existing applications and showcase the benefits in the three application scenarios of media & publishing, corporate data intranets and eGovernment. The resulting tools, methods and data sets have the potential to change the Web as we know it today.
EOT;

A workaround might be:

$value    = 'Over the past 3 years, the semantic web activity has gained momentum with the widespread '.
            'publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical '.
            'research idea into a very promising candidate for addressing one of the biggest challenges in the area '.
            'of intelligent information management: the exploitation of the Web as a platform for data and '.
            'information integration in addition to document search. To translate this initial success into a '.
            'world-scale disruptive reality, encompassing the Web 2.0 world and enterprise data alike, the following '.
            'research challenges need to be addressed: improve coherence and quality of data published on the Web, '.
            'close the performance gap between relational and RDF data management, establish trust on the Linked Data '.
            'Web and generally lower the entrance barrier for data publishers and users. With partners among those '.
            'who initiated and strongly supported the Linked Open Data initiative, the LOD2 project aims at tackling '.
            'these challenges by developing:'.PHP_EOL.
            '<ol>'.PHP_EOL.
            '<li>enterprise-ready tools and methodologies for exposing and managing very large amounts of structured '.
                'information on the Data Web,</li>'.PHP_EOL.
            '<li>a testbed and bootstrap network of high-quality multi-domain, multi-lingual ontologies from sources '.
                'such as Wikipedia and OpenStreetMap.</li>'.PHP_EOL.
            '<li>algorithms based on machine learning for automatically interlinking and fusing data from the '.
                'Web.</li>'.PHP_EOL.
            '<li>standards and methods for reliably tracking provenance, ensuring privacy and data security as well '.
                'as for assessing the quality of information.</li>'.PHP_EOL.
            '<li>adaptive tools for searching, browsing, and authoring of Linked Data.</li>'.PHP_EOL.
            '</ol>'.PHP_EOL.
            'We will integrate and syndicate linked data with large-scale, existing applications and showcase the '.
            'benefits in the three application scenarios of media & publishing, corporate data intranets and '.
            'eGovernment. The resulting tools, methods and data sets have the potential to change the Web as we '.
            'know it today.';

But I think this is not really better to read...

k00ni commented 8 years ago

But I think this is not really better to read...

Maybe not better, but now it's aligned with the coding standard. And if you want to read it, you can do so within a fixed screen width (max. 120 characters?). In the older version you have to scroll far to the right, which is very annoying.
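
For a quick check outside the sniffer, a minimal sketch like the following lists lines above a given limit. The helper name findLongLines and the 120-character default are assumptions for illustration; the project's actual limit is whatever its coding standard configures:

```php
<?php
// Hypothetical helper, not part of Erfurt or PHP_CodeSniffer.
// Returns an array mapping 1-based line numbers to their length
// for every line longer than $limit.
// Note: strlen() counts bytes, so multi-byte characters count more than once.
function findLongLines($source, $limit = 120)
{
    $hits = array();
    foreach (preg_split('/\R/', $source) as $index => $line) {
        $length = strlen($line);
        if ($length > $limit) {
            $hits[$index + 1] = $length;
        }
    }
    return $hits;
}

$sample = "short line\n" . str_repeat('x', 130) . "\nanother short line";
print_r(findLongLines($sample));
```

Run against the heredoc version of the test fixture, this would flag the long paragraph lines; the concatenated version stays under the limit.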

pfrischmuth commented 8 years ago

Final required actions before merge

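// Guzzle 6 classes used below.
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Psr7\Response;

public function setUp()
{
    // Queue canned responses so the test never talks to a real Stardog server.
    $mock = new MockHandler([
        new Response(200, ['X-Foo' => 'Bar']),
        new Response(202, ['Content-Length' => 0]),
        new RequestException("Error Communicating with Server", new Request('GET', 'test'))
    ]);
    $handler = HandlerStack::create($mock);

    // Assumes the ApiClient passes the 'handler' option through to Guzzle.
    $config = $this->getTestConfig()->store->stardog->toArray();
    $config['handler'] = $handler;

    $this->fixture = new Erfurt_Store_Adapter_Stardog_ApiClient($config);
}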

simeonackermann commented 8 years ago

Hm, my integration tests complete OK with: Tests: 68, Assertions: 219, Skipped: 16, and no errors...

simeonackermann commented 8 years ago

Could you please post the test error log?

pfrischmuth commented 8 years ago

When I run the tests, the output is not deterministic... the number of errors/failures differs from run to run.

I run Stardog and MySQL in Docker containers and run the tests with Docker as well.

The command is:

docker run --rm -v `pwd`:/var/www -e EF_STORE_ADAPTER=stardog --link ontowiki-devenv-stardog-test:stardog-test --link ontowiki-devenv-mysql-test:mysql-test --net devenv_default ontowiki-devenv/phpserver /bin/sh -c 'cd /var/www && php vendor/bin/phpunit --testsuite "Erfurt Integration Tests"'

The output is:

PHPUnit 4.5.1 by Sebastian Bergmann and contributors.

Configuration read from /var/www/phpunit.xml

........S...............ES.S..SSSS.E.S...SSSSSSS..S.SS.E......... 65 / 68 ( 95%)
...

Time: 32.08 seconds, Memory: 25.75MB

There were 3 errors:

1) Erfurt_Rdf_ModelIntegrationTest::testSetGetOption
Erfurt_Store_Exception: Failed creating the model.

/var/www/library/Erfurt/Store.php:1155
/var/www/tests/integration/Erfurt/Rdf/ModelIntegrationTest.php:126

2) Erfurt_Store_Adapter_SparqlIntegrationTest::testSparqlWithDbPediaEndpoint
Erfurt_Store_Adapter_Exception: SPARQL Error with query: SELECT DISTINCT ?g { GRAPH ?g { ?s ?p ?o . } }

/var/www/library/Erfurt/Store/Adapter/Stardog/ApiClient.php:193
/var/www/library/Erfurt/Store/Adapter/Stardog.php:341
/var/www/library/Erfurt/Store/Adapter/Stardog.php:356
/var/www/library/Erfurt/Store.php:1393
/var/www/library/Erfurt/Store.php:1039
/var/www/library/Erfurt/Store.php:1166
/var/www/library/Erfurt/Store.php:428
/var/www/library/Erfurt/Store.php:1046
/var/www/library/Erfurt/App.php:547
/var/www/library/Erfurt/Store.php:758
/var/www/tests/unit/Erfurt/TestCase.php:32

3) Erfurt_StoreIntegrationTest::testSparqlQueryWithSpecialCharUriIssue579
Erfurt_Store_Adapter_Exception: SPARQL Error with query: DROP SILENT GRAPH <http://ns.ontowiki.net/SysOnt/>

/var/www/library/Erfurt/Store/Adapter/Stardog/ApiClient.php:193
/var/www/library/Erfurt/Store/Adapter/Stardog.php:254
/var/www/library/Erfurt/Store.php:745
/var/www/tests/unit/Erfurt/TestCase.php:33

--

There were 18 skipped tests:

1) Erfurt_AppIntegrationTest::testGetStoreWithCleanDatabase
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/unit/Erfurt/TestCase.php:118
/var/www/tests/integration/Erfurt/AppIntegrationTest.php:113

2) Erfurt_Rdf_ModelIntegrationTest::testIsEditableWithZendDbAndAnonymousUserIssue774
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/Rdf/ModelIntegrationTest.php:143

3) Erfurt_Rdf_ResourceIntegrationTest::testGetQualifiedName
An Erfurt_Store_Exception occurred when establishing a connection: SPARQL Error with query: SELECT DISTINCT ?g { GRAPH ?g { ?s ?p ?o . } }

/var/www/tests/unit/Erfurt/TestCase.php:102
/var/www/tests/integration/Erfurt/Rdf/ResourceIntegrationTest.php:30

4) Erfurt_Sparql_EngineDbIntegrationTest::testOdFmiLimitQueryWithZendDbIssue782WithLimit
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/Sparql/EngineDbIntegrationTest.php:13

5) Erfurt_Sparql_EngineDbIntegrationTest::testOdFmiLimitQueryWithZendDbIssue782WithoutLimit
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/Sparql/EngineDbIntegrationTest.php:60

6) Erfurt_Store_Adapter_EfZendDbIntegrationTest::testSparqlQuery
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/Store/Adapter/EfZendDbIntegrationTest.php:8

7) Erfurt_Store_Adapter_EfZendDbIntegrationTest::testSerialization
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/Store/Adapter/EfZendDbIntegrationTest.php:8

8) Erfurt_Store_Adapter_StardogIntegrationTest::testAddStatementWithUriObject
An Erfurt_Store_Exception occurred when establishing a connection: SPARQL Error with query: SELECT DISTINCT ?g { GRAPH ?g { ?s ?p ?o . } }

/var/www/tests/unit/Erfurt/TestCase.php:102
/var/www/tests/unit/Erfurt/TestCase.php:177
/var/www/tests/integration/Erfurt/Store/Adapter/StardogIntegrationTest.php:18

9) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testInstantiation
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

10) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testListTables
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

11) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testAddStatementWithUriObject
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

12) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testAddStatementsWithLiteralObject
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

13) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testBuildLiteralString
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

14) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testBuildTripleString
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

15) Erfurt_Store_Adapter_VirtuosoIntegrationTest::testImportRdfWithUrlAndRdfXml302After303GithubOntoWikiIssue101
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:154
/var/www/tests/integration/Erfurt/Store/Adapter/VirtuosoIntegrationTest.php:18

16) Erfurt_StoreIntegrationTest::testCheckSetupWithZendDb
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/unit/Erfurt/TestCase.php:118
/var/www/tests/integration/Erfurt/StoreIntegrationTest.php:111

17) Erfurt_StoreIntegrationTest::testSparqlQueryWithCountQueryAndEmptyResultIssue174
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/StoreIntegrationTest.php:140

18) Erfurt_StoreIntegrationTest::testSparqlQueryWithCountAndFromIssue174
Skipped since other backend is under test.

/var/www/tests/unit/Erfurt/TestCase.php:164
/var/www/tests/integration/Erfurt/StoreIntegrationTest.php:154

FAILURES!
Tests: 68, Assertions: 211, Errors: 3, Skipped: 18.

pfrischmuth commented 8 years ago

simeonackermann commented 8 years ago

The Psr7 method is from a previous Guzzle version and needs to be removed. But even without it I get an error like:

SPARQL Error: HTTP/1.1 406 Not Acceptable Content-Length: 0 with query: 
SELECT ?parent ?child FROM <http://example.org/> 
WHERE { ?child <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?parent. 
OPTIONAL {?child <http://ns.ontowiki.net/SysOnt/order> ?order} 
FILTER ( ?parent IN (<http://rdfs.org/sioc/types#Comment>) ) } ORDER BY ASC(?order) 
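
A 406 generally means the server could not produce a response in any format named by the request's Accept header. Whether that was the cause here is not established in this thread, but as a hedged illustration, a SELECT request that explicitly asks for SPARQL JSON results could be assembled like this. The helper name buildSparqlSelectRequest and the /{database}/query path are assumptions, not the adapter's actual code:

```php
<?php
// Hypothetical helper for illustration; not the adapter's actual code.
// Assumes Stardog answers SPARQL queries at {base_url}/{database}/query.
function buildSparqlSelectRequest($baseUrl, $database, $query)
{
    return array(
        'method'  => 'GET',
        'url'     => rtrim($baseUrl, '/') . '/' . rawurlencode($database) . '/query'
                   . '?' . http_build_query(array('query' => $query)),
        'headers' => array(
            // Standard media type for SPARQL SELECT/ASK results as JSON.
            'Accept' => 'application/sparql-results+json',
        ),
    );
}

$request = buildSparqlSelectRequest(
    'http://localhost:5820',
    'erfurt',
    'SELECT ?s WHERE { ?s ?p ?o } LIMIT 1'
);

echo $request['method'], ' ', $request['url'], PHP_EOL;
echo 'Accept: ', $request['headers']['Accept'], PHP_EOL;
```

The returned array only describes the request; it can be handed to whichever HTTP client is in use.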

simeonackermann commented 8 years ago

The last commit fixes the undefined function, but I still have some problems with OntoWiki. After selecting a knowledge base, the screen goes blank with an error: Error on bootstrapping application: Zend_Session::start() - /var/www/vendor/zendframework/zendframework1/library/Zend/Session.php(Line:482): Error #2 session_start(): Failed to decode session object. Session has been destroyed

pfrischmuth commented 8 years ago

With your latest fix I can run OntoWiki with Stardog backend without errors. Have you tried to run it with a fresh browser session?

When I try to execute the SPARQL query SELECT * FROM <http://ontowiki.local/dataset/> WHERE { ?s ?p ?o } on a model, however, I get zero results. When I run the same query directly in the Stardog console, I get two triples (rdf:type and rdfs:label). To reproduce this, just create a new empty knowledge base and add a label. When I test the same with Virtuoso, I get the two triples as expected.

simeonackermann commented 8 years ago

Besides the wrong SPARQL results, I still had the issue with the destroyed Zend session (I tried a lot to fix it but didn't find the cause). I only figured out that something in Guzzle caused my issue, so I replaced Guzzle with Zend_Http_Client. Now everything works fine for me.

pfrischmuth commented 8 years ago
pfrischmuth commented 8 years ago

LGTM

k00ni commented 8 years ago

Thanks guys! :+1: