Switch to URL path-based identifiers.

fcrepo / fcrepo-specification-atomic-operations

Fedora Specification of Batch Atomic Operations

Apache License 2.0

Switch to URL path-based identifiers. #8

Open peichman-umd opened 5 years ago

peichman-umd commented 5 years ago

This is a very rough first draft of switching from the header-based transaction identifiers to scoping transactions via URL paths. This approach is similar to the way the Fedora 4 and 5 Java Modeshape-based implementations work.

bbpennel commented 5 years ago

Using a path segment to identify the transaction the client wants to interact with potentially places a lot more burden on the client. It needs to understand that the URL it creates is not the permanent URL of the resource, and either the client or Fedora needs to be able to resolve RDF references to resources within a transaction back to their non-transaction URLs at commit time.
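To illustrate the kind of rewriting this implies, here is a minimal sketch of mapping a transaction-scoped URL back to its permanent form. It assumes a Fedora 4/5-style layout in which the transaction appears as a single `tx:<id>` path segment; the function name `strip_transaction` and the example URLs are hypothetical, not taken from any Fedora client.

```python
import re

# Assumption: the transaction shows up as one whole path segment of the
# form "tx:<id>". The lookahead keeps the match from ending mid-segment.
TX_SEGMENT = re.compile(r"/tx:[^/]+(?=/|$)")


def strip_transaction(uri: str) -> str:
    """Return uri with its first transaction path segment removed.

    URIs that contain no transaction segment are returned unchanged.
    """
    return TX_SEGMENT.sub("", uri, count=1)
```

For example, `strip_transaction("http://localhost:8080/rest/tx:abc123/obj/1")` gives `"http://localhost:8080/rest/obj/1"`. A client would have to apply something like this to every subject and object URI in a graph before treating it as permanent, which is the burden described above.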

I'm still thinking through the value of path-based transactions, but I don't think it requires path-based transaction identifiers. A particular transaction ID could still apply only to a particular path within Fedora while being sent as a header. My experience with the Fedora 4/5 transaction model and cleaning up identifiers in transactions makes me inclined towards a header approach.

peichman-umd commented 5 years ago

@bbpennel I definitely want to omit the transaction identifier from the URIs in any graphs returned while in a transaction. I agree that wrangling those dual URIs on the client side is a pain (see for instance what we do in our batch loading client: https://github.com/umd-lib/plastron/blob/2.2.0/plastron/pcdm.py#L201-L215).

I'll take a crack at moving the transaction identifier into the headers.

peichman-umd commented 5 years ago

@bbpennel I've updated to remove the txn ID from the resource URIs. I'm retaining a transaction URI as the Atomic-ID header and the URI for manipulating the transaction itself.
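As a rough sketch of what this looks like from the client side, the request below targets the resource's permanent URI and carries the transaction URI in the `Atomic-ID` header. The helper name `build_update_request`, the example URIs, and the choice of `PUT` with a Turtle body are assumptions for illustration only; the request is built but never sent.

```python
from urllib.request import Request


def build_update_request(resource_uri: str, transaction_uri: str, body: bytes) -> Request:
    """Build (but do not send) an update scoped to a transaction.

    The resource URI stays the permanent one; only the Atomic-ID header
    ties the request to the transaction.
    """
    return Request(
        resource_uri,
        data=body,
        method="PUT",
        headers={
            # urllib stores header names capitalized ("Atomic-id");
            # HTTP header names are case-insensitive, so this is harmless.
            "Atomic-ID": transaction_uri,
            "Content-Type": "text/turtle",
        },
    )


# Hypothetical URIs, for illustration only.
req = build_update_request(
    "http://localhost:8080/rest/obj/1",
    "http://localhost:8080/rest/transactions/abc123",
    b"<> <http://purl.org/dc/terms/title> 'Example' .",
)
```

Because the transaction never appears in the resource URI, graphs returned inside the transaction need no rewriting at commit time.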

whikloj commented 5 years ago

Just wondering about this:

If the client sending the current request is not authorized to access the transaction identified by the Atomic-ID header, the server MUST NOT update any resources, and MUST respond with a 404 Not Found HTTP status.

What does it mean to be not authorized to access a transaction?

peichman-umd commented 5 years ago

@whikloj I think that is a holdover in my mind from Fedora 4, where (I believe) there was a notion that only the client who initiated a transaction could update it. I'm fine with striking that, and going to a model of using the transaction URI as the only "credential" needed to access the transaction.

If so, I might add some guidance that transaction URIs should not be easily guessable...
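One way a server might satisfy that guidance is to mint transaction URIs from a cryptographically secure random token. This is only a sketch under that assumption: the function name `new_transaction_uri` and the base URI are hypothetical, and the draft does not prescribe any particular scheme.

```python
import secrets


def new_transaction_uri(base_uri: str) -> str:
    """Return a transaction URI ending in a hard-to-guess random token.

    secrets.token_urlsafe(32) draws 32 random bytes from the OS CSPRNG
    and encodes them as URL-safe base64 text.
    """
    token = secrets.token_urlsafe(32)
    return f"{base_uri.rstrip('/')}/{token}"


# Hypothetical base URI for the transaction endpoint.
tx_uri = new_transaction_uri("http://localhost:8080/rest/transactions/")
```

Sequential counters or timestamps would be easy to enumerate, which is why a random token is used here.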

whikloj commented 5 years ago

@peichman-umd I'd agree with not having a separate authorization mechanism for transactions. You should have WebACLs on your objects so that people can't edit them regardless of being in or outside a transaction.