vkuznet / transfer2go

Distributed, loosely coupled agent-based transferring system
MIT License

Switch to pull model #13

Closed vkuznet closed 7 years ago

vkuznet commented 7 years ago

Currently we implement the push approach:

The way it works now in my prototype is the following: a client sends a request to an agent to transfer dataset /a/b/c to site X. The agent first checks whether it has this dataset; if so, it initiates the transfer by pushing the data from itself to site X. If the agent does not have the dataset, it broadcasts the request to all known agents. The agent that has it replies, and the request is delegated to that agent, which then pushes the data from itself to site X.
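The delegation step in the push flow can be sketched roughly as follows. This is a minimal illustration, not the prototype's code: the `Agent` type, its fields, and `ResolveSource` are hypothetical names, and the broadcast is modeled as a simple loop over known peers rather than real network calls.

```go
package main

import (
	"errors"
	"fmt"
)

// Agent is a hypothetical stand-in for a transfer agent; the field names
// are illustrative and not taken from the project.
type Agent struct {
	Site     string
	Datasets map[string]bool
	Peers    []*Agent
}

// Has reports whether this agent holds the dataset locally.
func (a *Agent) Has(dataset string) bool { return a.Datasets[dataset] }

// ResolveSource returns the agent that should push the dataset: the
// receiving agent itself if it holds the data, otherwise the first
// known peer that reports having it.
func (a *Agent) ResolveSource(dataset string) (*Agent, error) {
	if a.Has(dataset) {
		return a, nil
	}
	// Stand-in for the broadcast to all known agents.
	for _, p := range a.Peers {
		if p.Has(dataset) {
			return p, nil
		}
	}
	return nil, errors.New("no known agent holds " + dataset)
}

func main() {
	a := &Agent{Site: "A", Datasets: map[string]bool{}}
	b := &Agent{Site: "B", Datasets: map[string]bool{"/a/b/c": true}}
	a.Peers = []*Agent{b}

	src, err := a.ResolveSource("/a/b/c")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("agent at site %s pushes /a/b/c to site X\n", src.Site)
}
```

Note that the destination site has no say in this flow, which is the weakness the pull model is meant to address.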

This has some flaws, e.g. a site can go down, undergo maintenance, or run out of disk space. Therefore we need to explore, develop and eventually switch to a pull model.

Sites today have complete control over the agent that puts data into their site. This is a design choice that was made in order to put the responsibility for transfers onto the site ops team. E.g. the site can turn off their agent when they have problems with storage. They can throttle it if there are issues. They can stop the agent if they lose disk and thus run out of space, or run out of space for some other reason. In the pull model, the request lands at the site which requests the data, and that site fetches it from the original site. In the example above, we'll redirect the request to the agent sitting on site X, and it will download dataset /a/b/c from whichever site holds a copy.

rishiloyola commented 7 years ago

[Attached image: screen shot 2017-04-12 at 1 33 15 am]

Implementation details: First, the client selects an agent which has the complete requested dataset. Having chosen an agent, it creates the transferRequest and passes it to the request manager of site B on the /manager endpoint. The request manager stores the request in the pool. It approves requests from the pool if site conditions are good (no disk issues) and may disapprove a request if the site needs time to handle its own issues.

Instead of designing a new endpoint to pull the data, the request manager passes the transferRequest to the selected agent on the /request endpoint. After getting the request from site B, that agent pushes the data to the upload endpoint.

The request manager approves a request based on two parameters: time and data size. The manager approves the request only if the site has enough storage capacity.
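A rough sketch of the pool-and-approve logic could look like the following. All type and field names here (`TransferRequest`, `RequestManager`, `FreeBytes`, `Healthy`) are assumptions made for illustration. Only the data-size and site-health checks are modeled; the time parameter is left out because its meaning is not spelled out here.

```go
package main

import "fmt"

// TransferRequest is a hypothetical shape for the transferRequest
// discussed above; the real structure may differ.
type TransferRequest struct {
	Dataset  string
	SrcAgent string
	Size     int64 // dataset size in bytes
}

// RequestManager keeps pending requests in a pool and approves them
// only when the site is healthy and has room for the data.
type RequestManager struct {
	FreeBytes int64 // assumed to be reported by the site's storage
	Healthy   bool  // false while the site handles its own issues
	pool      []TransferRequest
}

// Add stores a request in the pool.
func (m *RequestManager) Add(r TransferRequest) { m.pool = append(m.pool, r) }

// Pending returns the number of requests still waiting in the pool.
func (m *RequestManager) Pending() int { return len(m.pool) }

// Approve returns the requests that can proceed now and keeps the
// rest in the pool for a later pass.
func (m *RequestManager) Approve() []TransferRequest {
	var approved, pending []TransferRequest
	for _, r := range m.pool {
		if m.Healthy && r.Size <= m.FreeBytes {
			m.FreeBytes -= r.Size // reserve space for this transfer
			approved = append(approved, r)
		} else {
			pending = append(pending, r)
		}
	}
	m.pool = pending
	return approved
}

func main() {
	m := &RequestManager{FreeBytes: 100, Healthy: true}
	m.Add(TransferRequest{Dataset: "/a/b/c", SrcAgent: "siteA", Size: 60})
	m.Add(TransferRequest{Dataset: "/d/e/f", SrcAgent: "siteA", Size: 50})

	for _, r := range m.Approve() {
		// In the proposed design, this is where the request would be
		// forwarded to the source agent's /request endpoint.
		fmt.Println("approved:", r.Dataset)
	}
	fmt.Println("still pending:", m.Pending())
}
```

Reserving space at approval time keeps two large requests from both being approved against the same free space.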

vkuznet commented 7 years ago

Rishi, good but please outline where things can break :)

E.g. what if the admin of the site which is supposed to push the data decides at that moment to shut down the site?

I don't think we need such complexity for delegation.

Best, Valentin.

rishiloyola commented 7 years ago

If the transfer fails, we will create a new instance of the request and send it to the mentioned agent again. If the same request fails more than three times, we will throw an error and stop the process.

vkuznet commented 7 years ago

Correct, but don't hard-code the number 3; it should be configurable.
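One way to keep the limit configurable is to pass it in through a config struct, as in this minimal sketch. The `Config` type, the `MaxAttempts` field, and `TransferWithRetry` are hypothetical names; here the limit is read as the total number of attempts, and `errSiteDown` only simulates a failure.

```go
package main

import (
	"errors"
	"fmt"
)

// Config holds the retry limit; in a real agent it would presumably be
// loaded from the agent's configuration file (an assumption).
type Config struct {
	MaxAttempts int
}

// errSiteDown simulates a failed transfer in this example.
var errSiteDown = errors.New("source site unavailable")

// TransferWithRetry calls send until it succeeds or cfg.MaxAttempts
// attempts have been made. It returns the number of attempts used.
func TransferWithRetry(cfg Config, send func(attempt int) error) (int, error) {
	if cfg.MaxAttempts < 1 {
		return 0, errors.New("MaxAttempts must be at least 1")
	}
	var err error
	for attempt := 1; attempt <= cfg.MaxAttempts; attempt++ {
		// Each attempt stands for a fresh instance of the request.
		if err = send(attempt); err == nil {
			return attempt, nil
		}
	}
	return cfg.MaxAttempts, fmt.Errorf("transfer failed after %d attempts: %w", cfg.MaxAttempts, err)
}

func main() {
	cfg := Config{MaxAttempts: 3}
	attempts, err := TransferWithRetry(cfg, func(attempt int) error {
		if attempt < 3 {
			return errSiteDown // fail the first two attempts
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "error:", err)
}
```

Wrapping the last error with `%w` lets the caller still inspect the underlying cause with `errors.Is`.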
