RockefellerArchiveCenter / pyfc4

Python client for Fedora Commons 4
MIT License

new namespaces on resource creation #73

Closed ghukill closed 7 years ago

ghukill commented 7 years ago

Namespaces that are new to the repository are only added correctly when they go through resource.update().

For example, if a new resource is instantiated and a triple containing a namespace new to the repository is added before create is called, that namespace is not added correctly.

This happens if:

But it does not happen if the resource has already been created and update is run, even when the update adds a new namespace. This suggests the problem lies in .create, which does not correctly pass the new namespace along.

The only successful way to introduce a new namespace appears to be through a PATCH request, as is done with resource.update()?
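For reference, here is a minimal sketch of what such a PATCH body might look like: a SPARQL Update that declares the new prefix and uses it in an inserted triple. The helper name build_sparql_update, the foo prefix, and the example.org namespace are made up for illustration; this is not pyfc4's actual code.

```python
def build_sparql_update(prefixes, triples):
    """Build a SPARQL Update body (hypothetical helper, not part of pyfc4).

    prefixes: dict mapping prefix -> namespace URI
    triples:  iterable of (subject, predicate, object) strings,
              already written in SPARQL term syntax
    """
    # One PREFIX declaration per namespace, sorted for stable output
    prefix_lines = [
        "PREFIX {}: <{}>".format(prefix, uri)
        for prefix, uri in sorted(prefixes.items())
    ]
    # One line per triple inside the INSERT clause
    triple_lines = ["  {} {} {} .".format(s, p, o) for s, p, o in triples]
    return "\n".join(prefix_lines + ["INSERT {"] + triple_lines + ["}", "WHERE { }"])


# "<>" refers to the resource the request is sent to
body = build_sparql_update(
    {"foo": "http://example.org/foo#"},
    [("<>", "foo:bar", '"baz"')],
)
print(body)
```

A body like this would be sent with Content-Type application/sparql-update, which is what Fedora 4 expects for PATCH.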

ghukill commented 7 years ago

More fuel for the fire: it's not tenable to add triples to a resource before creation, so this might all be moot.

Tried instantiating a resource, adding a triple, then creating without specifying a URI (thereby issuing a POST request), which failed with the following error:

ERROR 22:45:43.016 (PersistingRdfStreamConsumer) http://localhost:8080/rest/ is not in the topic of this RDF, which is http://localhost:8080/rest/a8/9f/a1/19/a89fa119-0b79-4ad5-aaab-6c827ee53296.

Which is now unsurprising, as this resource was attempting to write triples for a URI it did not yet know. So, perhaps it makes sense to restrict add_triple unless resource.exists is True? This is set during creation and retrieval. The downside would be resources created with auto_refresh off, as such a resource would not yet know that it exists.

Perhaps work to catch this error and inform, but not prevent?
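A rough sketch of the "inform, but not prevent" option, using a stand-in class rather than pyfc4's real Resource. SketchResource and its attributes are hypothetical; only the names exists and add_triple come from the discussion above.

```python
import warnings


class SketchResource:
    """Stand-in for a resource object; not pyfc4's actual class."""

    def __init__(self, uri=None):
        self.uri = uri
        self.exists = False  # assumed to be set True on creation or retrieval
        self.triples = []

    def add_triple(self, predicate, obj):
        # Warn, but still record the triple, when the resource
        # has not been created in the repository yet
        if not self.exists:
            warnings.warn(
                "add_triple called before the resource exists; "
                "new namespaces may not be registered correctly",
                UserWarning,
                stacklevel=2,
            )
        self.triples.append((predicate, obj))


resource = SketchResource()
resource.add_triple("foo:bar", "baz")  # emits a UserWarning, triple is still kept
```

Warning instead of raising keeps the auto_refresh-off case usable, at the cost of letting the bad request go through.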

ghukill commented 7 years ago

Might be a little trickier, as DirectContainer and IndirectContainer both create triples on creation, outside of controlled channels like resource.update.

For example, in an empty repository that has no pcdm or ore namespace prefixes, creating a Direct or Indirect container automatically creates triples that use those prefixes. If the prefixes have not already been added through update, the result is bad prefixes.

These are not quite "bad" in the same way: instead of ns001, they use the entire URI, like http://www.openarchives.org/ore/terms/.

ghukill commented 7 years ago

One two-prong approach might work:

  1. when a repository instance is started, it checks for prefixes it has as part of repo.context, and issues an update to the root node to propagate those namespaces throughout the repository
  2. prevent add_triple from working before resource creation, or inform as mentioned above

This would add those namespaces to the repository before any Direct or Indirect containers could do so automatically, and would also prevent resources that had not been created (which we've determined is not a good idea) from adding triples.
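For the first prong, the check at startup could be as simple as comparing the local context against whatever the repository already reports. A minimal sketch, assuming both are available as plain prefix-to-URI dicts; missing_prefixes and the registered dict are hypothetical, and how the result gets sent to the root node is left out.

```python
def missing_prefixes(context, registered):
    """Return the entries of `context` whose namespace URI the repository
    does not yet know about (hypothetical helper, not part of pyfc4).

    context:    dict of prefix -> namespace URI, e.g. from repo.context
    registered: dict of prefix -> namespace URI already in the repository
    """
    known_uris = set(registered.values())
    return {
        prefix: uri
        for prefix, uri in context.items()
        if uri not in known_uris
    }


context = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "pcdm": "http://pcdm.org/models#",
    "ore": "http://www.openarchives.org/ore/terms/",
}
registered = {"dc": "http://purl.org/dc/elements/1.1/"}

to_propagate = missing_prefixes(context, registered)
```

Here to_propagate would hold only the pcdm and ore entries, which are the ones that would need to go into the update against the root node.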

ghukill commented 7 years ago

https://groups.google.com/forum/#!topic/fedora-community/Q38vsSiyyu8

ghukill commented 7 years ago

See forum post above: fixed in Fedora 4.8. Continuing to use 4.7 for testing, as this behavior has not yet affected tests, but will update the Docker image for Travis if need be, and certainly when 4.8 is released.