application-research / autoretrieve

A server to make GraphSync data accessible on IPFS

Autoretrieve Publish API #96

Closed hannahhoward closed 2 years ago

hannahhoward commented 2 years ago

What

In addition to metrics, deployments of autoretrieve may want to record and track more specific data about each retrieval, for more detailed querying or diagnostics.

With this in mind, I'd like to propose that an autoretrieve deployment can be set up with a "publish URL". The URL must point to a service running a REST API, which autoretrieve calls to publish stats about its retrievals.

The proposed resources are:

RetrievalAttempt
{
   id: UUID
   cid: CID
   stage: string
   errorMessage: string
   autoretrieveInstance: string
   logs: []string
   startedAt: datetime
}

Endpoint:

PUT /retrieval_attempt/~uuid~
JSON Body: RetrievalAttempt
Success: 200 OK
Fail: 400 Bad Request

---

ProviderRetrieval
{
   peerID: peerID
   retrievalUUID: UUID
   stage: string
   errorMessage: string
   logs: []string
   startedAt: datetime
}

Endpoints:

PUT /retrieval_attempts/~uuid~/providers/~peerID~
JSON Body: ProviderRetrieval
Success: 200 OK
Fail: 400 Bad Request

Tracking in autoretrieve:

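// internalRetrievalAttempt is the in-memory state for one retrieval attempt:
// the published RetrievalAttempt plus the per-provider state beneath it.
type internalRetrievalAttempt struct {
   RetrievalAttempt
   lk sync.Mutex // guards the embedded RetrievalAttempt and providers
   providers map[peer.ID]*ProviderRetrieval
}

// existing Filecoin Retriever struct
type Retriever struct {
   // ... existing fields
   retrievalStatesLk sync.RWMutex // guards retrievalStates
   retrievalStates map[UUID]*internalRetrievalAttempt
}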
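
To make the locking concrete, here is a minimal sketch of how these two structs could be used: the RWMutex on the Retriever guards the map of attempts, and each attempt's own mutex guards its provider state. The resource types are trimmed down, string keys stand in for UUID and peer.ID, and the attempt and setProviderStage methods are hypothetical, so treat this as one possible shape and not the actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// Trimmed-down stand-ins for the proposed resources (assumption: only the
// fields needed to show the locking pattern).
type RetrievalAttempt struct{ ID, Stage string }
type ProviderRetrieval struct{ PeerID, Stage string }

type internalRetrievalAttempt struct {
	RetrievalAttempt
	lk        sync.Mutex
	providers map[string]*ProviderRetrieval // keyed by peer ID string here
}

type Retriever struct {
	retrievalStatesLk sync.RWMutex
	retrievalStates   map[string]*internalRetrievalAttempt // keyed by UUID string here
}

// attempt returns the tracked state for id, creating it on first use.
func (r *Retriever) attempt(id string) *internalRetrievalAttempt {
	// Fast path: a read lock is enough when the attempt already exists.
	r.retrievalStatesLk.RLock()
	a, ok := r.retrievalStates[id]
	r.retrievalStatesLk.RUnlock()
	if ok {
		return a
	}
	r.retrievalStatesLk.Lock()
	defer r.retrievalStatesLk.Unlock()
	// Re-check: another goroutine may have created it between the two locks.
	if a, ok := r.retrievalStates[id]; ok {
		return a
	}
	if r.retrievalStates == nil {
		r.retrievalStates = make(map[string]*internalRetrievalAttempt)
	}
	a = &internalRetrievalAttempt{
		RetrievalAttempt: RetrievalAttempt{ID: id},
		providers:        make(map[string]*ProviderRetrieval),
	}
	r.retrievalStates[id] = a
	return a
}

// setProviderStage updates one provider under the per-attempt lock and
// returns a copy that could then be published without holding the lock.
func (r *Retriever) setProviderStage(id, peerID, stage string) ProviderRetrieval {
	a := r.attempt(id)
	a.lk.Lock()
	defer a.lk.Unlock()
	p, ok := a.providers[peerID]
	if !ok {
		p = &ProviderRetrieval{PeerID: peerID}
		a.providers[peerID] = p
	}
	p.Stage = stage
	return *p
}

func main() {
	r := &Retriever{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			r.setProviderStage("example-uuid", fmt.Sprintf("peer-%d", i), "query")
		}(i)
	}
	wg.Wait()
	fmt.Println("providers tracked:", len(r.attempt("example-uuid").providers))
}
```

Returning a copy from setProviderStage is a deliberate choice in this sketch: the HTTP call to the publish URL can then happen outside the lock, so a slow stats service would not block other updates to the same attempt.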

willscott commented 2 years ago

notes from sync:

hannahhoward commented 2 years ago

updated based on feedback in external meeting