Closed vasco-santos closed 11 months ago
@Gozala I added the intermediary queues resources as we discussed yesterday. Looking at it now, I think we are adding a lot of extra complexity that we could likely not have.
filecoin/add
(with: did:key:zSpace
) can return the receipt with out { status: queued }
with effect pointing to filecoin/add
(with: did:web:web3.storage
). As we were doing in previous version, we create delegation and stick in first receipt the CID. Then when we dequeue, we create invocation and write receipt for it with result of queue consumer.
I think having the queue abstraction brings more complexity and in practise we do not get much from it. We only have 2 results to return, first is queued
and second is accepted
or error
. If we would have the 3 invocations for such flow, we end up with
filecoin/add
(with: did:key:zSpace
), puts requested operation queued to be executed, creates delegation filecoin/add
(with: did:key:zFilecoinQueue
) and writes receipt with delegation as effect together with out { status: queued }
.filecoin/add
(with: did:web:web3.storage
)The second receipt does not even has an output, and seems to introduce more complexity then real benefit. I know the idea would be to perform verification only when did is queue, but creating this exception for this queue does not seem so critical at spec level. This logic applies to all other queues. In short, I think we should drop this abstraction. It could work if we wanted to illustrate things getting out and into the queue again, but that will increase a lot number of receipts.
Let me know if you agree we should remove that intermediary step
@Gozala thanks, for the feedback ❤️ I already adopted majority of it, leaving the rest for needed extra discussion
Looks like you see an implementation detail in different angle than me.
Storefront
or even Agent itself, could perform a filecoin/add
directly to a SP, leaving the in between pieces out of the equation without extra changes.Aggregator
to FilecoinEngine
(or FilecoinAgent
)agent
to Storefront
from filecoin/add
to piece/add
- They have a Filecoin piece, and want to add itStorefront
to FilecoinEngine
from aggregate/add
to filecoin/add
- Storefront just cares about things landing into Filecoin with SLAs agreed with FilecoinEngine (via group)This one is appealing option to not need extra invocation field. We can do it, if you feel strongly about it. Let me leave some reasoning that made me go that way that would be a drawback of moving into that direction:
tenantId
for the deal, which in my head is the storefront DID (or a mapping of it that spade-proxy
can do). By merging them together we would need to have knowledge of splitting that string on the other end of the pipeline, which does not seem so desirable to me.group
as actually the only thing we need to derive the grouping ID for queue messages. If it is a DID that would be fine, if we intend to use it as something like free
only we definitely need to merge things.Looks like you see an implementation detail in different angle than me.
I think you may have misunderstood some of what I was getting at. Let my try it differently, I think aggregator is kind of proxy like an attorney. It acts on behalf of the storefront / broker tenant. It becomes even more clear when you consider signing pipeline, when aggregator submits a deal to the broker eventually broker will call the storefront and not broker to sign it, which is why I think it is crucial that relationship is between storefront <-> broker and there is no
flowchart LR
storefront --> aggregator
aggregator --> broker
broker --> storefront
What I mean in practice is that storefront registers with a broker (shares it's DID basically) as opposed to both storefront and aggregator registering with a broker.
You are correct that it would imply bit more configuring of aggregator, namely storefront would need to delegate { can: "deal/*", with: "did:web:web3.storage" }
to the aggregator, which can simply be a env variable. I know it feels like an extra hoop, but I think we'll have to get better at doing this somehow as we keep seeing this pattern over and over and can't keep working around it forever.
Also note that we can even avoid having to provision aggregator with delegation and instead include it the proofs of every piece/add
delegation that way aggregator will not require provisioning.
I do not think folding aggregator is better option, it just tangles more things while I'd like to further detangle them.
- Dealer requires tenantId for the deal, which in my head is the storefront DID (or a mapping of it that spade-proxy can do). By merging them together we would need to have knowledge of splitting that string on the other end of the pipeline, which does not seem so desirable to me.
This is indeed a drawback. Maybe we can meet in the middle ? I basically don't want a storefront to set that field because it can impersonate (intentionally or due to bug) another tenant, unless we check that it is the same as with
in which case why would you even set the field.
That said we could instead have remove (ignore) tenant
in invocation from storefront and then copy with
when propagating that request. Only drawback with this was here that we basically overload capability by a with
field, which tends to be symptom of the bad design. Perhaps idea of using same can field for both enqueue and dequeue is flawed and we should simply create a separate capability.
I don't favor strongly either of the three options here, so I think you should decide. If you do keep tenant
field, I would encourage to make it clear that for invocation with storefront is obsolete.
- I was seeing the group as actually the only thing we need to derive the grouping ID for queue messages. If it is a DID that would be fine, if we intend to use it as something like free only we definitely need to merge things.
My point is that could lead to say web3.storage pieces going through nft.storage deal pipeline because uniqueness across uncoordinated storefronts can not be achieved. Which is why I suggest to achieve it effectively by ${tenant}/${group}
that way colliding group names across tenants would not lead to invalid behavior.
metrics are easier to compute to calculate numbers of each group/storefront if things don't mutate
I'm not sure I understand this point. What I was suggesting is to make tenant
implicitly equal to with
field on invocations from storefront.
After writing all this I'm starting to be slightly in favor of just having a separate capabilities as only drawback with them is having to come up with naming. In fact even the handlers in your implementation would be more cleaner as they won't have two separate code paths.
I think the best solution to align without getting into complexity of delegation is to just rename the aggregator and the invocations (again…).
I do not think renaming would address my concern, which is broker/dealer been aware of another party. Triangular relationship is usually introduces coordination complexity
flowchart LR
storefront --> aggregator
aggregator --> broker
broker --> storefront
In my experience it's always best to have linear relationship which is either through routing
flowchart LR
storefront <--> aggregator
aggregator <--> broker
or encapsulation
flowchart LR
storefront <-- aggregator --> broker
@Gozala regarding point 2, I think we will need to revert to having 2 capabilities again. I can do it tomorrow, I can suggest we call them x/add and x/queue where /queue has the with of the issuer that we can set as storefront in nb of /queue that is self issued by service
@Gozala regarding point 1, there are still some details that I do not understand where you stand. I understand the benefits of the solution you suggest here. And there are definitely some. However, I feel there are still some gaps.
Considering storefront
delegates to aggregator
capability to add aggregate to dealer
,
storefront
to also delegate capability to check chain-tracker
or things start to get inconsistent where there are other "three actors jumps". This feels that things will just get more and more complicated for each Storefront.aggregator
(or better filecoin-engine
) to be a service that storefront uses to put pieces into filecoin under a SLA, it would mean filecoin-engine
would to for storefront
what storefront
does to users of web3.storage. Facilitate making things available in SPs. In other words, storefront does not care on all the details of aggregation, SPs matching and so on, just about making things at some point available in SPs. By delegating, storefront
is now aware of all the details behind putting something in SPs (which will change in the future).filecoin-engine
as a service that will guarantee things make it to Filecoin with some SLAs for their customers (storefronts).Also not delegating here, does not stricly mean that storefront can't delegate to sign, which in my view makes sense here. This is actually another reason that Aggregator is probably not the best naming.
@Gozala regarding point 1, there are still some details that I do not understand where you stand. I understand the benefits of the solution you suggest here. And there are definitely some. However, I feel there are still some gaps.
I'll use name "agency" which is I got from Riba in the past to refer to currently "aggregator" and perhaps "filecoin-engine" in the future, because I think it conveys the mental model better.
Considering
storefront
delegates toaggregator
capability to add aggregate todealer
,
- who is the actor responsible for keeping track of deal state in the end?
Agency takes pieces and is responsible for getting it into filecoin deal(s). This implies that it receives pieces from Storefront and either reports a problem (if it is faulty) or reports success providing inclusion proof and deal info.
All of the state management including retries is taken care of by Agency.
Agency may expose intermediate states through chained effect receipts, however how much of it is exposed vs kept off receipt chain is up to what would make most sense from operational standpoint. Exposing more would improve system introspection at the cost of an overhead, so we'll have to find a right balance between two and possibly adjust it over time.
For Agency to perform this task it will need signed permission from the Storefront client (Similar to how attorney can represent a client after they signed a contract). In our case I think that boils down to
With above capabilities Agency should be able to take care of getting a piece(s) into the filecoin. Once that's done it's job is complete and takes no responsibility for tracking deal renewals or reporting them. At this point storefront can hire "Chain Tracker" to get updates on deal updates (push model) or alternatively use "Chain Tracker" to query aggregate deal state (pull model).
- If we delegate, we either need to have
storefront
to also delegate capability to checkchain-tracker
or things start to get inconsistent where there are other "three actors jumps". This feels that things will just get more and more complicated for each Storefront.
I'm not sure sure if above makes it bit more clear on this side. In my head how tracker works is not entirely clear yet. It is also not clear to me if querying tracker strictly requires permissions given that it is essentially an index over the public chain. Still if Agency needs to interact with tracker that does not seem problematic to me, it can interact with various actors to get it's job done.
Real question here seems to be should Agency query the tracker on behalf of the Storefront or should it query it on it's own behalf. I think either option is fine here because unlike the case with Dealer it is not important to capture that operation occurs on behalf of the Storefront.
Yet if we decide that such query should happen on behalf of the Storefront that should not be a big deal either, Storefront would just need to add another capability to the above set.
- if we do not delegate, and see the
aggregator
(or betterfilecoin-engine
) to be a service that storefront uses to put pieces into filecoin under a SLA, it would meanfilecoin-engine
would to forstorefront
whatstorefront
does to users of web3.storage. Facilitate making things available in SPs. In other words, storefront does not care on all the details of aggregation, SPs matching and so on, just about making things at some point available in SPs.
I do not understand why you think delegation here changes this. I also see the same picture it just fact of registering Storefront with Agency is signing a delegation (signing a contract with an agency).
By delegating,
storefront
is now aware of all the details behind putting something in SPs (which will change in the future).
You are correct to point out that Storefront becomes aware that Agency will interact with a Dealer on their behalf. It is a tradeoff in comparison to black box model where Storefront knows nothing about what Agency does to get things into Filecoin. Yet, given that Sterofront still needs to sign deals and make pieces readable it is impossible to achieve black box model, which is why I prefer transparent model over one that is neither.
I also want to point out that we're defining protocol with capabilities so even if in the future we change Agency pipeline so it no longer needs to work with Dealers we could still use same protocol meaning hand set of signed capabilities to the Dealer to do their job.
- who keeps all the persistent state of aggregates and inclusion?
I think we have covered this to some degree. I imagine that invocation (by Storefront) that adds a piece either has an effect that succeeds once piece makes into a filecoin deal, providing details about inclusion proof etc.. or fails if problem is discovered with a piece.
Agency may retry (several times) same piece if aggregate failed due to errors not caused by it but thous details do not need to be captured in the invocation trace.
If is a delegation of capability it also feels weird for me that then is not the storefront who keeps all the state (and therefore responsible for keeping the aggregate and inclusion mappings)
- who is going to trigger alerts?
will everything handle in the hands of the storefront? I see
filecoin-engine
as a service that will guarantee things make it to Filecoin with some SLAs for their customers (storefronts).
I do too
Also not delegating here, does not stricly mean that storefront can't delegate to sign, which in my view makes sense here. This is actually another reason that Aggregator is probably not the best naming.
đź‘Ť
Based on my previous proposal to @Gozala that @Gozala covered in https://github.com/web3-storage/RFC/pull/2
Changes the aggregation spec introducing new capabilities and a new role. Main reasoning for this change is the new distinction between Storefront and Aggregator (previously had same identity), where a Storefront (web3.storage, nftstorage) will receive/compute Filecoin pieces (from received CAR files) and offer them to an Aggregator (w3filecoin). The aggregator's role is to put multiple pieces together to offer a large piece (aggregate) to a Broker (spade-proxy).
Per the above, the new invocations added are described in https://www.notion.so/2023-07-26-aggregation-protocol-contract-sign-and-report-API-12b4a7f0e92f4b789c331322e2629817
Renamed spec from aggregation to w3 filecoin given the context increased from aggregation to accommodate new requirements
Note that
list
, will be added later on