fcrepo4-labs / fcrepo-api-x

Fedora API Extension Framework
Apache License 2.0

Indexing Skolems (without Service bindings)? #105

Open christopher-johnson opened 7 years ago

christopher-johnson commented 7 years ago

I have just started looking into API-X, so this may have a simple solution. My implementation needs to have the Skolem IRIs indexed, because I am using them to create lists.

I do not require them to have service bindings, because they are just glue to build JSON-LD @lists that are subsequently constructed from a triplestore query.
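To show why the list nodes themselves must be in the index, here is a minimal sketch in Python. It assumes the triplestore query returns plain (subject, predicate, object) tuples and that each list cell is a Skolem IRI carrying rdf:first and rdf:rest; the `example.org` IRIs, the `read_list` name, and the canvas values are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"


def read_list(triples, head):
    """Walk an rdf:first/rdf:rest chain from `head`; return items in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        # A missing cell (for example, an unindexed Skolem) or a cycle
        # means the list cannot be rebuilt.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"broken list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Hypothetical Skolem IRIs standing in for the list cells.
GENID = "http://example.org/genid/"
TRIPLES = [
    (GENID + "b", RDF_REST, RDF_NIL),
    (GENID + "a", RDF_FIRST, "canvas-1"),
    (GENID + "b", RDF_FIRST, "canvas-2"),
    (GENID + "a", RDF_REST, GENID + "b"),
]

ORDERED = read_list(TRIPLES, GENID + "a")
```

If the triples for any cell are absent from the index, the walk fails at that cell, which is the situation described here.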

I assume that the ontology for API-X either identifies only "http://fedora.info/definitions/v4/repository#Resource" or filters out all other objects, because "http://fedora.info/definitions/v4/repository#Skolem" is definitely not indexed.

EDIT: An alternative workaround that I explored yesterday is to run the toolbox indexer and the apix-indexer separately (using different db graphs). I built a docker image for the toolbox indexer yesterday, based mostly on your image, which was quite a puzzle. The shared dependency is that they would need to use the same Fuseki instance. I had to put Fuseki on 3030 both internally and externally, and your demo has it on 8080 internally ... Of course, this is a small detail that applies only to the demo.

I really want to use API-X in the framework, so I will continue to work on figuring a way to do this.

emetsger commented 7 years ago

Hi @christopher-johnson,

You may know that the API-X demo includes a minimal index for service discovery at http://localhost:3030/fuseki/service-index (if using docker-machine you'll need to substitute in the IP of your docker machine). The index is included in the demo for a heretofore unpublished exercise (we are planning to create a second demo this quarter which would include additional exercises around this index), but the index isn't used by API-X itself. In fact, API-X doesn't rely on any external indexes to function.

So for those reasons, the demo doesn't include a full index of what is in the repository.

> EDIT: Alternative work around that I explored yesterday is possibly to implement the toolbox indexer and the apix-indexer separately (using different db graphs).

You and I were probably trying to do the same thing yesterday :)

My suggestion would be to integrate fcrepo-indexing-triplestore into the indexing docker container, and update the fuseki container with an additional dataset. As you mention, this is quite a puzzle to the uninitiated, and to the initiated as well ;)

Unfortunately, this approach probably will not give you what you want, because I don't believe that fcrepo-indexing-triplestore indexes Skolems.

I think you would have to modify fcrepo-indexing-triplestore, or otherwise write your own indexer, in order to have Skolems indexed in the triplestore.

If it would help you to have updated fuseki and indexing docker containers with fcrepo-indexing-triplestore working out of the box, I can probably tweak what I have and push updated docker images. You would then only need to implement the actual indexing code and install it into the indexing container.

christopher-johnson commented 7 years ago

@emetsger thanks for the insight. I have been digging into API-X a bit more, and it is clearer how the service loader uses OWL rules to bind to classes. So I assume that if no service binds to the #Skolem class, then it would not be API-X scoped.

I have noticed that the UpdateListener is only processing messages that are typed #Resource. I suspect this ultimately controls how API-X services are notified and what messages they can consume.
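As a rough illustration of that suspicion (not the actual UpdateListener code), type-based filtering could look like the Python sketch below. The message shape (a dict with a `types` list), the function names, and the sample paths are all hypothetical; only the two type IRIs come from this thread.

```python
FEDORA = "http://fedora.info/definitions/v4/repository#"
RESOURCE = FEDORA + "Resource"
SKOLEM = FEDORA + "Skolem"


def accepts(message_types, accepted_types):
    """True if the message carries at least one accepted RDF type."""
    return any(t in accepted_types for t in message_types)


def filter_messages(messages, accepted_types):
    """Keep only the messages a type-restricted listener would process."""
    return [m for m in messages if accepts(m["types"], accepted_types)]


# Hypothetical messages: one typed #Resource, one typed only #Skolem.
MESSAGES = [
    {"id": "/rest/object", "types": [RESOURCE]},
    {"id": "/rest/skolem-node", "types": [SKOLEM]},
]

RESOURCE_ONLY = filter_messages(MESSAGES, {RESOURCE})
UNFILTERED = filter_messages(MESSAGES, {RESOURCE, SKOLEM})
```

Under these assumptions, a listener restricted to #Resource never sees the Skolem message, while one that also accepts #Skolem sees both.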

fcrepo-indexing-triplestore does actually index Skolems, because it depends on the fcrepo-service-activemq service directly. I think that it just gets everything that is in the broker:topic:fedora input stream from the embedded ActiveMQ.

One thing that I still would like to do is to implement a binary serialization service, and I want to do this with API-X. There should be no conflict, I guess (I still need to test it), if fcrepo-indexing-triplestore also gets API-X service update messages, because "more is more" (as is often said about triplestores). My main concern is that I am also able to query for unbound classes, so I need these messages to arrive at the indexing service unfiltered.

BTW, thanks for your excellent work. I have tested a forked implementation of the fcrepo, karaf, fuseki, fcrepo-indexing-triplestore, apix composition, without issues. I will use it for a Fedora / IIIF integration demo (for the Pandora Framework workshop in Bern on the 13th ...)

christopher-johnson commented 7 years ago

FYI: I have dug into this some more and there seems to be some implementation confusion on whether or not FEDORA_SKOLEM events are (intentionally?) filtered by default. They appear in a topic but not in a queue...

I have filed an issue here: https://jira.duraspace.org/browse/FCREPO-2405