richardrodgers opened this issue 7 years ago
I'm not sure I understand the server part in the middle. Is it continually subscribing/unsubscribing from the transient queue until it reads the package-completed message from the packager?
Not continually - it will only subscribe to/read the queue when it receives a request for the package from the original client of the API (i.e. the miner who requested the dump). If the client never checks back in, the transient queue is never read by the server, and it eventually gets 'gc'd'. I'm trying to avoid a lot of background tasks, polling, etc.
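A minimal sketch of that read-on-demand behavior, assuming a stomp.py connection and JSON status messages (the function name, queue naming, and timeout are my assumptions, not part of the proposal):

```python
import json
import threading

import stomp

class StatusListener(stomp.ConnectionListener):
    """Captures at most one status message from the transient queue."""
    def __init__(self):
        self.message = None
        self.received = threading.Event()

    def on_message(self, frame):  # stomp.py 8.x listener signature
        self.message = json.loads(frame.body)
        self.received.set()

def check_package_status(conn, docset_id, timeout=2.0):
    """Run only when the API client polls for its package: subscribe to
    the transient queue, wait briefly for a message, then unsubscribe.
    No background tasks or polling; an unclaimed queue is never read."""
    listener = StatusListener()
    conn.set_listener("status", listener)
    sub_id = f"status-{docset_id}"
    conn.subscribe(destination=f"/queue/package/{docset_id}",
                   id=sub_id, ack="auto")
    listener.received.wait(timeout)
    conn.unsubscribe(sub_id)
    return listener.message  # None if the packager hasn't reported yet
```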
By the way, the example includes data like packager time estimates that I'm not expecting we will do initially (not sure how we'd calculate them) - it just suggests the kind of info that could be passed back to the client (other examples would be % complete, etc.)
Okay, yeah, makes sense. We could probably make a reasonable estimate just based on the number of items in the docset.
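For instance, a crude estimate could just scale an average per-item packaging cost by the docset's item count (the constant below is a placeholder, not a measured value):

```python
SECONDS_PER_ITEM = 0.5  # placeholder average, to be tuned from real runs

def estimated_package_seconds(item_count: int) -> float:
    """Rough packaging-time estimate from the number of items in the docset."""
    return item_count * SECONDS_PER_ITEM
```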
Proposal for connecting the API server with the back-end worker that creates docset packages/archives. Goals include avoiding state that is maintained by multiple systems and requires synchronization. The software components will hereafter be called the 'server' (API server) and the 'packager'.
All workflow communication between the server and the packager will occur over message queues (the packager can and will call the server via its standard API for other information) - no new API endpoints will be defined. A mix of permanent and transient queues will manage the communication: the server will write to permanent queues and read from transient queues, and the packager will do the reverse.
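To make the direction of that convention concrete, here is a sketch of the packager's side using stomp.py; the queue names, message fields, and helper are assumptions:

```python
import json

import stomp

def build_archive(docset_id):
    """Packaging work, elided here."""
    return f"/var/packages/docset-{docset_id}.zip"  # hypothetical path

def handle_package_request(conn: stomp.Connection, frame):
    """Packager side of the convention: it reads requests from the
    permanent queue the server writes to, and writes status back to the
    per-request transient queue the server reads from."""
    request = json.loads(frame.body)  # assumed message shape
    docset_id = request["docset"]
    location = build_archive(docset_id)
    conn.send(destination=f"/queue/package/{docset_id}",  # transient
              body=json.dumps({"status": "completed",
                               "location": location}),
              content_type="application/json")
```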
Here is a sample flow (using STOMPish messages):
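Sketching one possible shape for that flow (queue names follow the convention above; the message bodies and fields are assumptions):

```
1. Server -> permanent request queue (docset 5 requested by a client):

   SEND
   destination:/queue/packages
   content-type:application/json

   {"docset": 5}
   ^@

2. Packager, subscribed to /queue/packages, builds the archive, then
   reports on the transient per-request queue:

   SEND
   destination:/queue/package/5
   content-type:application/json

   {"status": "completed", "percent": 100, "location": "..."}
   ^@

3. Server reads /queue/package/5 only when the client checks back in,
   and returns the package info via the normal API response.
```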
The /queue/package/5 queue is transient, and may be removed by queue configuration (auto-deleted after a time) or other means. Another possible permanent queue would be /queue/docsets/delete, where the server could initiate deletion (for storage conservation) of packages known to have been delivered, but this is TBD (we may want to allow multiple downloads).
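If that delete queue is adopted, the server-initiated message might be as simple as the following (illustrative only, since this part is TBD):

```
SEND
destination:/queue/docsets/delete
content-type:application/json

{"docset": 5}
^@
```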