dmwm / PHEDEX

CMS data-placement suite

FilePump can mark as 'expired' transfers that are still in progress #882

Open ericvaandering opened 10 years ago


Original Savannah ticket 93621 reported by None on Thu Apr 12 05:30:50 2012.

Hi,

This is a follow-up to this incident report:

https://savannah.cern.ch/support/?127757

The FilePump agent doesn't check whether a transfer task is already in progress (e.g. submitted to FTS) before marking it as expired. This is most likely to happen for long-running transfers (large files, or transfers that wait a long time in the FTS queue). If the expired transfer is immediately rerouted by FileRouter, the FileDownload agent at the destination site can pick up the new task and try to execute it while the previous transfer of the same file is still in progress. Normally this simply causes both transfers to fail, but in the case reported in the ticket the transfer was incorrectly marked as successful, leading to a storage inconsistency at the destination.
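
To illustrate the missing check, here is a minimal sketch in Python (not the actual FilePump code, which is part of the Perl PhEDEx agents). The names `TaskState`, `TransferTask`, `time_expire` and `select_expirable` are hypothetical and do not reflect the real PhEDEx schema. The idea is only that a task past its expiration time should be expired if it has not yet been handed to the transfer backend:

```python
from dataclasses import dataclass
from enum import Enum


class TaskState(Enum):
    # Hypothetical states; the real PhEDEx task bookkeeping differs.
    PENDING = "pending"          # created, not yet handed to a transfer backend
    IN_PROGRESS = "in_progress"  # e.g. submitted to FTS, outcome not yet known
    DONE = "done"                # outcome already reported


@dataclass
class TransferTask:
    task_id: int
    time_expire: float  # epoch seconds after which the task may be expired
    state: TaskState


def select_expirable(tasks, now):
    """Return tasks past their expiration time that are safe to expire.

    Tasks that are IN_PROGRESS are skipped even if their expiration time
    has passed, so they cannot be rerouted while a transfer is running.
    """
    return [
        t for t in tasks
        if t.time_expire <= now and t.state is TaskState.PENDING
    ]
```

With a check along these lines, a task that is still queued in FTS would stay untouched until its outcome is known, at the cost of needing a separate timeout for transfers that never report back.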

Note on current handling of expiration times:

The handling of expiration times should probably be simplified and made more consistent across the different agents.

Cheers Nicolo'