Open madpah opened 2 days ago
Yeah this is a good discussion. So here's how I envision the workflow:
1. POST against /v1/product to create a product with relevant data (such as sku, barcode etc)
2. GET to /v1/product with the query parameter barcode=xyz
3. take the leaf_references from the result
4. GET against /leaf/<leaf reference>
5. go traverse further
While this might not be an easily 'browseable' way to discover the artifacts, it would be relatively straightforward to do programmatically. The key here is that we need to keep it universal. That's why I would suggest that the product-identifier is a server-side generated UUID, and everything else is metadata. That way you can discover a product using multiple approaches (e.g. purl, sku etc).
For instance, the above workflow works equally well with, say, a purl, where you just use purl=zyz.
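To make the workflow above a bit more concrete, here is a rough sketch using a toy in-memory stand-in for the server. The class name InMemoryTea, its method names and the record layout are made up for illustration and are not part of the API draft; only the endpoints named in the comments come from this thread.

```python
import uuid


class InMemoryTea:
    """Toy stand-in for a TEA server (hypothetical, not the real API)."""

    def __init__(self):
        self.products = {}  # product UUID -> product record
        self.leaves = {}    # leaf reference -> leaf record

    def post_product(self, metadata, leaf_references=()):
        # POST /v1/product: the server, not the client, picks the UUID.
        product_id = str(uuid.uuid4())
        self.products[product_id] = {
            "product_identifier": product_id,
            "metadata": dict(metadata),
            "leaf_references": list(leaf_references),
        }
        return self.products[product_id]

    def get_products(self, **query):
        # GET /v1/product?barcode=xyz (or ?purl=..., ?sku=...):
        # every query parameter must match the stored metadata.
        return [
            product
            for product in self.products.values()
            if all(product["metadata"].get(k) == v for k, v in query.items())
        ]

    def get_leaf(self, leaf_reference):
        # GET /leaf/<leaf reference>
        return self.leaves[leaf_reference]


server = InMemoryTea()
server.leaves["leaf-1"] = {"name": "release 1.0"}
created = server.post_product(
    {"sku": "ABC-1", "barcode": "xyz", "purl": "pkg:generic/example@1.0"},
    leaf_references=["leaf-1"],
)

# The same product is reachable through either piece of metadata.
by_barcode = server.get_products(barcode="xyz")
by_purl = server.get_products(purl="pkg:generic/example@1.0")
leaf = server.get_leaf(by_barcode[0]["leaf_references"][0])
```

The point of the sketch is only that the UUID is an internal handle, while barcode, sku and purl are interchangeable ways in.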
I think from the consumer side, they have a TEI and just want to reference that to find the product. Having the client understand different TEI types makes it too complex. Basically just treat the stuff after the host name as a string. The type is for syntax checking when needed and helpful.
The question is really where the TEI originates. In the current API spec it's a server-side generated UUID. In your vision, is the TEI provided by the user when the product is created?
Let's be clear: a TEI is the entire URN - urn:tei:cyclonedx.org:SHA256:fd44efd601f651c8865acf0dfeacb0df19a2b50ec69ead0262096fd2f67197b9
So the product-identifier in question would be fd44efd601f651c8865acf0dfeacb0df19a2b50ec69ead0262096fd2f67197b9
For the GET /product, from a Consumer's perspective, they can easily supply either the full TEI or the product-identifier.
Now, the TEI Specification does define a type for the product-identifier (SHA256 in the above example), so it would be possible for the Consumer to specify the type and the product-identifier...
So IMO, we have 3 options:
GET /product?product-identifier=<product-identifier>
GET /product?type=<type>&product-identifier=<product-identifier>
GET /product?tei=<tei>
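For comparison, here is a small sketch of how a client could derive all three request forms from one TEI. It assumes the TEI is laid out exactly like the example above (urn:tei:<host>:<type>:<product-identifier>); the helper name split_tei is made up for this illustration.

```python
from urllib.parse import urlencode


def split_tei(tei):
    """Split a TEI shaped like urn:tei:<host>:<type>:<product-identifier>.

    The layout is an assumption taken from the example in this thread.
    """
    parts = tei.split(":", 4)
    if len(parts) != 5 or parts[0] != "urn" or parts[1] != "tei":
        raise ValueError(f"not a TEI URN: {tei!r}")
    _, _, host, tei_type, product_identifier = parts
    return host, tei_type, product_identifier


tei = (
    "urn:tei:cyclonedx.org:SHA256:"
    "fd44efd601f651c8865acf0dfeacb0df19a2b50ec69ead0262096fd2f67197b9"
)
host, tei_type, product_identifier = split_tei(tei)

# Option 1: product-identifier only
option_1 = "/product?" + urlencode({"product-identifier": product_identifier})
# Option 2: type plus product-identifier
option_2 = "/product?" + urlencode(
    {"type": tei_type, "product-identifier": product_identifier}
)
# Option 3: the whole TEI; urlencode percent-encodes the colons
option_3 = "/product?" + urlencode({"tei": tei})
```

Only option 3 lets the client stay ignorant of the TEI's internal structure; options 1 and 2 require it to parse the URN first.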
I honestly missed the document you're referring to @madpah. Good pointer.
Having said that, I'm really confused by the SHA256 here. Is this a SHA of a collection object (e.g. a given SBOM)?
Other than that, I don't have a strong opinion on whether we use GET parameters or paths (e.g. /product/?product-identifier=foo or /product/foo). I suppose going for a GET parameter based approach does provide more flexibility.
I would prefer to use path segments instead of GET parameters, so a TEA server might be implemented using a static website.
If you have only a few releases of an Open Source product, you might generate the data once and store it in a Git repo.
GET /product/<type>:<product-identifier> sounds better to me.
Edit: Replaced <type>/<product-identifier> with <type>:<product-identifier>.
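As a rough illustration of the static-website idea, the sketch below writes one JSON file per product so that GET /product/<type>:<product-identifier> maps directly onto a file path. The function name, the JSON document and the directory layout are assumptions for the example. Note that a colon is not a valid filename character on Windows, so a real generator might need a different on-disk encoding.

```python
import json
import tempfile
from pathlib import Path


def write_static_product(root, tei_type, product_identifier, document):
    """Write one product document where a static host can serve it.

    Hypothetical layout: <root>/product/<type>:<product-identifier>
    """
    target = Path(root) / "product" / f"{tei_type}:{product_identifier}"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(document, indent=2), encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as site_root:
    path = write_static_product(
        site_root,
        "SHA256",
        "fd44efd601f651c8865acf0dfeacb0df19a2b50ec69ead0262096fd2f67197b9",
        {"name": "example product"},
    )
    # URL path a static web server would expose for this file
    relative_url = "/" + path.relative_to(site_root).as_posix()
    stored = json.loads(path.read_text(encoding="utf-8"))
```

The generated tree could then be committed to a Git repo and served as-is, with no query handling on the server side.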
I like the idea of a static version. That's a good argument for paths to me.
Do you see any usage for the TYPE in the API? A single product can have many TEIs both of the same type and different types. I think we should handle them as opaque strings, just identifiers, in the API.
The question is really where the TEI originates. In the current API spec it's a server-side generated UUID. In your vision, is the TEI provided by the user when the product is created?
Remember that a product can have multiple TEIs. The UUID of the product item is an identifier that is used after TEI resolution. Don't mix the UUID tei with the TEI in the product object. It can be the same, but doesn't have to be.
The TEI is either created somewhere else and registered in the TEA database or created in the TEA service. I suspect it to be both. If I want to use my own article numbers in my ordering system, those IDs are created in that system. We likely want an API endpoint to manage the TEIs for a given product.
Do you see any usage for the TYPE in the API? A single product can have many TEIs both of the same type and different types. I think we should handle them as opaque strings, just identifiers, in the API.
I edited the comment to prepend the type to the <product-identifier>.
The question is really where the TEI originates. In the current API spec it's a server-side generated UUID. In your vision, is the TEI provided by the user when the product is created?
Remember that a product can have multiple TEIs. The UUID of the product item is an identifier that is used after TEI resolution. Don't mix the UUID tei with the TEI in the product object. It can be the same, but doesn't have to be.
The TEI is either created somewhere else and registered in the TEA database or created in the TEA service. I suspect it to be both. If I want to use my own article numbers in my ordering system, those IDs are created in that system. We likely want an API endpoint to manage the TEIs for a given product.
It would certainly be possible to allow for the UUID to be provided by the client, as long as there are server-side checks to ensure it's unique.
My suggestion would be that the UUID is always generated server-side, and then you can attach metadata to this. For instance, you might want to have both a SKU and barcode, as well as a UUID. This is how the current API draft has been designed:
In the real world, I don't think the UUID will be exposed to end users very frequently. You'd probably want to use one of the many other supported, more user-friendly ways to identify your product (sku, barcode, purl etc).
I think you are mixing the UUID for the product index object and the one used in the TEI. Those are two different entities (but can be the same). I don't want to limit a manufacturer to use UUID from another system in the TEI.
I still think the way you have added stuff like "barcode" etc is wrong and not extensible. It needs to be simplified.
Do you see any usage for the TYPE in the API? A single product can have many TEIs both of the same type and different types. I think we should handle them as opaque strings, just identifiers, in the API.
See comment above, but in the current implementation, the various TYPEs are implemented as metadata (e.g. you can query with /product?barcode=foobar)
I still think that would lead to a massive API that always changes. We have to find another model for the metadata and not have it in the structure like that. Maybe key/value list if you persist in being able to query on PURL or other type values.
I think you are mixing the UUID for the product index object and the one used in the TEI. Those are two different entities (but can be the same). I don't want to limit a manufacturer to use UUID from another system in the TEI.
Sure, we can make this user provided as an option -- but it still needs to be enforced to be unique.
I still think that would lead to a massive API that always changes. We have to find another model for the metadata and not have it in the structure like that. Maybe key/value list if you persist in being able to query on PURL or other type values.
Aren't these keys derived from the tei-types anyway and thus predefined?
The UUID generated by the system for the product object needs to be enforced to be unique in the system, like the leaf, collection and other objects. But a single product can have multiple TEIs with different UUIDs. That's another namespace. We should not mix them.
If you create a field called "bar code" you are assuming exactly one bar code per product, which I think is wrong. And you have to keep defining new fields for every single type we define. I would like as much as possible to decouple defining TEI Types from the API. From the API point of view it's basically a string, an identifier without any other meaning.
If you want to be able to provide a lookup I suggest we create a key-value pair array with a key of "type" and then the "tei" as a value without parsing it further. The query would be ?type=hash&value=sha256:234234234234234234 and similar queries then. That would make the types transparent in the code, but still reachable for queries.
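A minimal sketch of that key-value idea, to show how it keeps the TEI types out of the schema. The class and field names (Product, ProductIndex, product_uuid, teis) are invented for illustration and do not come from the API draft. The product object's own UUID and the TEIs live in separate namespaces, and TEI values are compared as opaque strings.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    # Server-generated UUID of the product object itself.
    product_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Opaque (type, value) pairs; several entries may share a type.
    teis: list[tuple[str, str]] = field(default_factory=list)


class ProductIndex:
    def __init__(self):
        self._by_uuid = {}

    def add(self, product):
        # Uniqueness is enforced for the product UUID only.
        if product.product_uuid in self._by_uuid:
            raise ValueError("product UUID must be unique")
        self._by_uuid[product.product_uuid] = product
        return product

    def get(self, product_uuid):
        # Lookup by the product object's UUID.
        return self._by_uuid.get(product_uuid)

    def query(self, tei_type, value):
        # ?type=<type>&value=<value>: exact match, value is never parsed.
        return [p for p in self._by_uuid.values() if (tei_type, value) in p.teis]


index = ProductIndex()
widget = index.add(
    Product(
        "widget",
        teis=[
            ("hash", "sha256:234234234234234234"),
            ("barcode", "111"),
            ("barcode", "222"),
            ("UUID", "5df41881-3aed-3515-88a7-2f4a814cf09e"),
        ],
    )
)

by_hash = index.query("hash", "sha256:234234234234234234")
by_second_barcode = index.query("barcode", "222")
# A TEI of type UUID is not the product object's UUID.
not_found = index.get("5df41881-3aed-3515-88a7-2f4a814cf09e")
```

Adding a new TEI type here means adding a new string in the data, not a new field or query parameter in the API.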
I just realized, we are talking about two different endpoints here:
- /product/{tea_product_identifier}: this one should probably accept the whole TEI URN as {tea_product_identifier} and it returns a single product,
- /product: since this endpoint returns the TEA Index, query parameters should probably be used. We can propose some query parameters to refine the query, but we shouldn't require TEA server implementers to support any parameter. Worst case scenario you get the entire index, which is perfectly fine for minimalistic TEA servers with a couple of products.
We have to separate a query for the product object UUID from a TEI of type UUID with a specific value. Those are two different things.
- /product/<tea_product_identifier> asks for the product with a specific UUID in the system
- /product/?type=UUID&value=UUIDv3:5df41881-3aed-3515-88a7-2f4a814cf09e queries about a TEI of type UUID
If you create a field called "bar code" you are assuming exactly one bar code per product, which I think is wrong.
That's easily rectified by turning it into a dict. I have no objection to that.
And you have to keep defining new fields for every single type we define. I would like as much as possible to decouple defining TEI Types from the API. From the API point of view it's basically a string, an identifier without any other meaning.
Yes, but isn't that a fairly standard way to do it? Just like CycloneDX, each version of TEA would have a specification that the API version would implement. You'd just have to version the API to align with the TEA standard.
If you want to be able to provide a lookup I suggest we create a key-value pair array with a key of "type" and then the "tei" as a value without parsing it further. The query would be ?type=hash&value=sha256:234234234234234234 and similar queries then. That would make the types transparent in the code, but still reachable for queries.
This doesn't really align with modern RESTful API design though and to me is a less clean implementation.
We could go down the whole GraphQL route, but that would require a pretty big overhaul.
How would a query based on type and value look in a modern API? There are certainly solutions to query for key/value pairs that would work here too.
We can't release a new ECMA TEA version for each type that we add. That's not doable. We really need to try to decouple TEI types from the API. PURL is struggling with this as well and has decoupled PURL Core from PURL types for the same reason: not having to upgrade PURL Core for every new type invented.
From https://github.com/CycloneDX/transparency-exchange-api/pull/77/files#r1850138397
@madpah:
@vpetersson:
@madpah: