dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

webdav: use transfermanager+id to identify TPC transfer #7555

Closed: kofemann closed 5 months ago

kofemann commented 5 months ago

Motivation: The WebDAV door tracks TPC transfers by the transfer IDs that RTM generates. Because RTM derives each transfer ID from the current timestamp, in deployments where multiple RTMs are running, two transfers that start at the same point in time (with millisecond precision) can receive the same ID, and the door then loses track of one of them. As soon as the first transfer completes, the second becomes an orphan:

Perf Marker
    Timestamp: 1713527368
    State: Running
    State description: Mover created
    Stripe Index: 0
    Stripe Start Time: 1713527348
    Stripe Last Transferred: 1713527348
    Stripe Transfer Time: 19
    Stripe Bytes Transferred: 0
    Stripe Status: RUNNING
    Total Stripe Count: 1
    RemoteConnections: tcp:127.0.0.1:9000
End
Perf Marker
    Timestamp: 1713527373
    State: Unknown transfer
    State description: Unknown transfer
    Stripe Index: 0
    Total Stripe Count: 1
End

Modification: Update RemoteTransferHandler to use transfermanager+id as the transfer identity, which avoids this ambiguity.

Result: Transfer IDs no longer collide when multiple RTMs are running.
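
The collision and the fix can be illustrated with a minimal, self-contained sketch. This is not the actual RemoteTransferHandler code: the `TransferKey` class, the method names, and the manager names `rtm-1` and `rtm-2` are hypothetical and chosen only for illustration. It assumes the door keeps its in-flight transfers in a map, and compares keying that map by ID alone against keying it by transfer manager plus ID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical composite key: the transfer manager that issued the id, plus the id.
final class TransferKey {
    private final String transferManager;
    private final long id;

    TransferKey(String transferManager, long id) {
        this.transferManager = transferManager;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TransferKey)) return false;
        TransferKey other = (TransferKey) o;
        return id == other.id && Objects.equals(transferManager, other.transferManager);
    }

    @Override
    public int hashCode() {
        return Objects.hash(transferManager, id);
    }
}

class TransferRegistryDemo {
    // Keyed by id only: equal ids from different managers overwrite each other.
    static int trackById(String[] managers, long[] ids) {
        Map<Long, String> transfers = new HashMap<>();
        for (int i = 0; i < ids.length; i++) {
            transfers.put(ids[i], managers[i]);
        }
        return transfers.size();
    }

    // Keyed by manager + id: equal ids from different managers stay distinct.
    static int trackByManagerAndId(String[] managers, long[] ids) {
        Map<TransferKey, String> transfers = new HashMap<>();
        for (int i = 0; i < ids.length; i++) {
            transfers.put(new TransferKey(managers[i], ids[i]), managers[i]);
        }
        return transfers.size();
    }

    public static void main(String[] args) {
        // Two managers hand out the same timestamp-based id in the same millisecond.
        String[] managers = {"rtm-1", "rtm-2"};
        long[] ids = {1713527348000L, 1713527348000L};
        System.out.println("tracked, keyed by id only: " + trackById(managers, ids));
        System.out.println("tracked, keyed by manager+id: " + trackByManagerAndId(managers, ids));
    }
}
```

With the ID-only key, the second registration replaces the first, so only one of the two transfers remains tracked. With the composite key, both remain tracked.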

Fixes: #7548
Acked-by: Marina Sahakyan
Acked-by: Svenja Meyer
Acked-by: Dmitry Lirvintsev
Target: master, 10.0, 9.2
Require-book: no
Require-notes: yes
(cherry picked from commit 412bfe2a33a3c58584b5447c47940b0b396511dd)

svemeyer commented 5 months ago

retest this please