Open roland-sipos opened 9 months ago
Indeed, the problem comes from the BitTorrent transfer interface implementation and is related to how a transfer finishes. To be investigated:
Problem(s) found in logfile /tmp/rsipos/pytest-of-rsipos/pytest-0/run0/log_snbclient_4338.txt:
2023-Oct-06 09:59:25,304 WARNING [void dunedaq::snbmodules::TransferInterfaceBittorrent::do_work(std::atomic<bool>&) at /nfs/sw/rsipos/DUNE/Sept/snb-NFD23-10-03/sourcecode/snbmodules/src/common/transfer_interface_bittorrent.cpp:214] BittorrentPeerDisconnectedError: Peer disconnected output_localhosteth0_1.out peer [ 10.73.136.79:52191 client: libtorrent 2.0.9 ] disconnecting (TCP) [sock_read] [asio.misc]: End of file (reason: 0)
2023-Oct-06 09:59:26,316 WARNING [void dunedaq::snbmodules::TransferInterfaceBittorrent::do_work(std::atomic<bool>&) at /nfs/sw/rsipos/DUNE/Sept/snb-NFD23-10-03/sourcecode/snbmodules/src/common/transfer_interface_bittorrent.cpp:214] BittorrentPeerDisconnectedError: Peer disconnected output_localhosteth0_2.out peer [ 10.73.136.79:53395 client: libtorrent 2.0.9 ] disconnecting (TCP) [sock_read] [asio.misc]: End of file (reason: 0)
This may be linked to the m_done flag in the do_work method. The flag stops the BitTorrent client, but it is not handled properly. It looks like the first file finishes before the others are added to the client, which makes the client stop early.
Possible fix: pass the number of files to transfer up front, instead of counting files as they are added, and use that number to trigger the stop of the client. See fork branch with possible fix here
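To make the suspected race concrete, here is a minimal sketch of the two stop conditions. It is not the snbmodules code: `CompletionTracker` and all of its members are hypothetical names used only for illustration, and the "count as added" check is my assumption about what the current m_done logic effectively does.

```cpp
#include <cstddef>

// Hypothetical stand-in for the completion bookkeeping in do_work.
// None of these names exist in snbmodules; this is illustration only.
class CompletionTracker {
public:
    // expected_files: total number of files in the transfer, known up front.
    explicit CompletionTracker(std::size_t expected_files)
        : m_expected(expected_files) {}

    void on_file_added() { ++m_added; }       // a torrent was added to the client
    void on_file_finished() { ++m_finished; } // a torrent finished downloading

    // Assumed current behaviour: done when every file added SO FAR has finished.
    // This fires too early if file 1 finishes before file 2 is added.
    bool done_by_added_count() const {
        return m_added > 0 && m_finished == m_added;
    }

    // Proposed behaviour: done only when the expected total has finished.
    bool done_by_expected_count() const {
        return m_finished == m_expected;
    }

private:
    std::size_t m_expected;
    std::size_t m_added = 0;
    std::size_t m_finished = 0;
};
```

With two expected files, adding and finishing the first one already satisfies `done_by_added_count()`, while `done_by_expected_count()` stays false until the second file has also finished. That matches the symptom above, assuming the stop condition really is evaluated against the running count.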
During the integration testing, the bookkeeper reported unfinished transfers for multiple-file transfers.
On the other hand, the seed client correctly reports that every file was successfully uploaded once. The data files are also present on the destination client, but without the resume torrent file.
This seems to indicate a problem with how the downloader/leech client finishes and reports transfers when a transfer consists of multiple files.