Closed obo20 closed 3 years ago
This is intentional – see explainers:
@lidel Is there a reason we need to be persisting the "source" providers that the user passed in as part of that pin object's model?
I was under the assumption that users would simply pass these in, and the pinning service would use them to fetch the data. It makes sense to provide back the multiaddresses of the nodes that the pinning service has pinned the content to, as the user will want to reference them when retrieving the data again in the future.
However, I'm not seeing a value in making the source multiaddresses part of the object's returned model as once the content has been retrieved they serve no further use.
@obo20 well, in the current spec we persist entire Pin
object and providers
are part of it (can be empty tho).
Not doing so complicates the spec (how to tell which fields sent in POST /pins
are persisted and which ones are ephemeral?).
Use cases:
PinStatus.status = failed
.Pin.providers
is useful for retrying (knowing what failed, and submitting alternatives via POST /pins/{id}
@lidel would you be open to having unique names for the two different providers arrays?
If I as an implementer of the spec had confusion between the two, I imagine there might be a lot of confusion from those looking to work with the spec.
Perhaps PinStatus.providers
should become PinStatus.delegates
? We already use the delegate concept in IPFS, so this could be a clearer naming scheme.
would you be open to having unique names for the two different providers arrays?
Yes, I think better naming could make things more intuitive, and clarify the difference between "optional source multiaddrs" and "mandatory ones delegated by pinning service". Ideally, I would remove "provider" term from both fields, because those would not be the only providers, just a subset.
Naming things is hard.. how about optional Pin.sources
(origins
?) and mandatory PinStatus.delegates
? (pinners
?)
how about optional Pin.sources (origins?) and mandatory PinStatus.delegates? (pinners?)
I think Pin.providers
makes sense to leave since these are "known" providers, sources makes it more confusing to me. But yes to the rest.
I like origins
it signifies to me where the content originally came from. I would vote for Pin.originNodes
or Pin.origins
for the source nodes. I'm less sure what to name the nodes the pinning service is running / pinning to.
Ok, two weeks later we don't have any movement here, nor alternatives other than ones already proposed in this issue.
Let's make a quick temperature check before making the final PR:
Pin.providers
? https://www.strawpoll.me/20752027PinStatus.providers
? https://www.strawpoll.me/20752034Depending on the results will open PR with the change on Thursday (or not).
Update: results: https://github.com/ipfs/pinning-services-api-spec/pull/50#issue-467397320
@lidel one thing I want to clarify here as well. With the provided meta
attribute. Is this purely intended to be utilized for custom key value attributes that users provide? Or will there ever be "custom" functionality that pinning service providers can use that field for.
I ask because I noticed in the PinStatus meta
description it describes that attribute as being used for custom service provider info. I want to make sure we're not expecting the same of the values users provide for the meta
attribute when pinning content.
If we want to allow custom pinning service provider functionality, I would like to suggest we keep the meta
parameter as purely key values and we also create an options
parameter when uploading for the custom functionality. (an example of this custom functionality would be our users providing a custom pinning policy
during uploads that determines what regions their content is stored in.
Is this purely intended to be utilized for custom key value attributes that users provide? Or will there ever be "custom" functionality that pinning service providers can use that field for.
@obo20 kinda both!
Pin.meta
is provided by the user and is persisted as-is with the pin object.
Pinning service might provide custom functionality based on values in Pin.meta
, and we want those values to be persisted because user may want to inspect it later, or even change it via POST /pins/{id}
For example: I create pin with Pin.meta[regions] = [US]
and pinning service assigns my pin to specific region based on that metadata. Some time later I want to keep pinned data in both US and EU, so I send modify request to POST /pins/{id}
with updated pin object that contains Pin.meta[regions] = [US, EU]
, and pinning service takes care of mirroring the data between regions.
@lidel My biggest concern here is that we're trying to do too much in one place.
While I appreciate wanting to be flexible with the spec, having users provide both "metadata" and "actionable data" in the same place with no standards seems dangerous to me.
For example, let's pretend that Pinata didn't have multiple regions functionality from the start and User A stored all of their cids with Pin.meta[regions] = [US, EU]
because that's where they're storing content on their own nodes.
If we (or any other pinning service) ever added in regions functionality and decided to use the same format as User A for our instructions, all of a sudden the metadata that User A is passing in is going to get analyzed and interpreted as an instruction to pin to multiple Pinata regions while User A may not be intending for that to happen. They may have just wanted those keyvalues to be used for classifying / querying their data internally.
@obo20 I am not sold on the idea we need an overhead of a separate field for vendor-specific actionable metadata.
The scenario you described could be avoided by pinata using pinata_
prefix in actionable key name (pinata_region
), effectively removing risk of accidental collisions and is easier to reason about than the need to understand difference between meta[foo]
and options[foo]
@lidel do you have any pre-existing literature you're referencing for this design choice?
We modeled ours after Stripe, which is purely key/values used for querying, but nothing actionable https://stripe.com/docs/api/metadata
From reading through the AWS s3 metadata functionality it appears there's a few "System-defined object metadata" key / values that can have outside effects on how things work. The catch here is that these are all strictly controlled as amazon is the only entity working in this ecosystem and thus they have full control. https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-metadata
@obo20 the only reason for this design choice was to keep API surface as small as possible
while enabling users and vendors to come up with useful conventions on top of a simple-but-flexible meta
.
My worry is that if we have separate Pin.metadata
and Pin.options
, and both are optional maps persisted along with pin object, what will keep people from misusing Pin.options
?
People will have to read docs to understand the purpose of each, and there always be someone who just use whichever they want because there is no validation. We could add distinct Pin.options
if it had some kind of validation, but that requires us to hardcode specific flags, such as region
, in the spec itself. I think it's too early for solidifying that type of thing, especially before Filecoin launch.
I don't believe the risk of "newly added actionable key clashing with user-defined one" is worth complicating the spec at this stage, especially when vendor_
prefix to actionable keys solves it.
Going back to your original question from https://github.com/ipfs/pinning-services-api-spec/issues/41#issue-668082164, do we want to disambiguate PinStatus.meta
and Pin.meta
?
Or is solving *.providers
in #50 enough?
@lidel My thoughts were that the options
object wouldn't be persisted. Basically I would like to have a way for users to pass in instructions that don't need to be persisted as key/values.
While our database is set up to store metadata as key / value pairs for simple querying purposes, unfortunately additional features such as regions functionality are often too complex to fit into a simple key / value paradigm and attempting to persist them as such would be problematic. For reference, here's an example "pin_policy" that a user would pass in:
customPinPolicy: {
regions: [
{
id: 'FRA1',
desiredReplicationCount: 1
},
{
id: 'NYC1',
desiredReplicationCount: 2
}
]
}
We couldn't fit something like that into one key/value as we're storing the values as string (Which I agree with. Relational databases are not very efficient when querying JSON sub-objects)
This leaves us with a few options:
vendor_
prefix (which seems against what you're wanting for the meta
object)vendor_
prefix in our database in the same way we do the rest of the metadata in addition to acting on it instructionally. A big problem with this approach is querying the data. It becomes really hard/impossible to allow for more nuanced queries when things are simply stored as a string in a key/value format. Neither of these seem like particularly elegant solutions which is why I was proposing an options
object (could be named something else) that doesn't need to be persisted.
To answer your other question about the meta
duplication, I think we might want to name this something else. From what I'm understanding, the Pin.meta
should always be something the user sets, while the PinStatus.meta
can be arbitrary information returned from the Pinning service? It might be good to make that distinction somehow. The best I can come up with is something like PinStatus.info
, so I'm open to suggestions here.
Neither of these seem like particularly elegant solutions which is why I was proposing an options object (could be named something else) that doesn't need to be persisted.
I've spent some time thinking about it, and ended up with opinion that the idea that some fields won't be persisted produces undesired side effects. It makes it harder for client to inspect a pin via generic API to tell if/which any vendor-specific settings were applied to a pin. In practice, it would mean that each vendor has to provide a separate, vendor-specific API for those custom features, which defeat the purpose of generic API a bit.
We couldn't fit something like that into one key/value as we're storing the values as string
When it comes to vendor-specific features, I'd like us to go with text-based notation for now. Not only because of implementation simplicity, also security, as string-based key-value map removes surface for potential object serialization abuse.
How about something compact:
Pin.meta[pinata_policy] = "FRA1:1,NYC1:2"
or more verbose:
Pin.meta[pinata_policy_FRA1] = "replication:1"
Pin.meta[pinata_policy_NYC1] = "replication:2"
?
[..] unfortunately additional features such as regions functionality are often too complex to fit into a simple key / value paradigm and attempting to persist them as such would be problematic. [..] A big problem with this approach is querying the data. It becomes really hard/impossible to allow for more nuanced queries when things are simply stored as a string in a key/value format.
I feel like I am missing full context here, but the way I see it, pinning service receives metadata in Pin.meta
as flat map of strings. You are free to persist it in any way you want, store actionable keys in separate tables, if that makes sense performance-wise etc. Pin
object is effectively immutable (modification operation is only an alias for delete+create), so there is no need for synchronization, all you need to do is to ensure records are removed on Pin
deletion.
I like the idea of PinStatus.info
, to avoid bikeshed I've opened https://github.com/ipfs/pinning-services-api-spec/pull/51 :-)
Some notes from the call me and @obo20 (thanks!) had yesterday:
options
, and Pin.meta
will remain as-isPin.meta[vendor_foobar]
convention for now. In the future we may have multiple vendors proposing mutually agreed change to this spec (if it proves to be not enough), but it is too early to tell atm.Pin.meta[vendor_foobar]
convention, but it is not a priority nor a blocker for MVP in WebUI.
Currently the![image](https://user-images.githubusercontent.com/7811297/88843518-df56b780-d1a6-11ea-9fd0-f63fb170e376.png)
GET/{CID}
endpoint returns an object that looks like this:The response returns a
pin
object that has values(providers
andmeta
) that are duplicated in the root return object as well.Is this intentional or an accident?