painter1 closed this issue 3 years ago.
The changes are in a branch getfilescaching (off of master). I'll issue a pull request after doing further testing, in a context where there aren't any other changes in the same code.
Noted, thanks Jeff!
Testing finished; sorry about the multiple delays.
No problem @painter1, I had my head down refactoring and porting synda to py3 anyway. Thanks for the PR, I'll review and merge ASAP!
In issue #122, over six months ago, I proposed adding an index to improve data transfer performance. This was helpful, but as the database grew from 4 GB then to 22 GB now, performance became an even greater problem. For two weeks the longer-term (a day or more) transfer rate never got above 70 MiB/s and was often around 45 MiB/s. This is with, typically, 60 or more parallel transfers running at all times, 3 to 8 per data node. At all times "synda watch" showed that most files were not actually being transferred: the "Current size" was either "0 Bytes" or the same as "Total size". Clearly, almost all transfers spent most of their time waiting for something to happen.
A typical file can be downloaded in several seconds; some less than a second and a few in tens of seconds. The problem can easily be seen in the time for a cycle through the event loop: almost always 20 to 80 seconds, usually a little under a minute. This is what the transfers are waiting for.
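To make the diagnosis concrete, here is a minimal sketch (hypothetical, not synda's actual code) of measuring per-pass latency in an event loop; do_cycle stands in for one pass of the real transfer loop:

```python
import time

def timed_cycles(do_cycle, n=3):
    """Run do_cycle (a stand-in for one pass of the transfer event loop)
    n times and return each pass's wall-clock duration in seconds."""
    durations = []
    for _ in range(n):
        start = time.monotonic()
        do_cycle()  # in synda this would include the slow database queries
        durations.append(time.monotonic() - start)
    return durations
```

With cycle times of 20 to 80 seconds, a file that transfers in a few seconds spends most of its wall-clock life waiting for the loop to come back around.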
What is slowing things down is three database queries. Here I describe them, together with how I solved the problem by caching. There is an alternative solution that demands much more coding; I will describe it in a separate issue. Ideally both would be implemented.
1. In sdfiledao: identify the file with the highest priority among those with 'waiting' status and a specified data_node. The standard code sorts all of the files by priority and checksum (I don't know the purpose of including the checksum; perhaps reproducibility for debugging). This can be sped up by finding the highest priority and then searching for a match. For "searching for a match", see item 3 below. For "finding the highest priority", performance is much better if the highest priority is cached. There are four cases where the cached priority has to be recomputed; we need code to handle the first three of them.
2. In sdfilequery: find all the data_nodes with 'waiting' files. This is in the function transfer_running_count_by_datanode, which I added some time ago to limit the number of parallel transfers per data_node, necessary for performance reasons. This list can be deduced from the same max(priority) cache used for item 1 above.
3. In sdfiledao: get a file with 'waiting' status and the specified data_node and priority. Although this query is significantly expensive, there is hardly any additional cost to getting 100 files (if available). So get 100 files and cache the list in memory. It doesn't matter much if the cache is lost in a crash or becomes obsolete due to a change in max(priority). (This could be done without caching max(priority), but then we would still need another cache for item 2.)
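The three caches above fit together naturally. The following is a rough sketch of the idea, not the actual patch; the table and column names (file, status, priority, data_node, checksum) only loosely follow the synda schema and may differ from the real code:

```python
import sqlite3
from collections import defaultdict

class HighestPriorityCache:
    """Caches max(priority) of 'waiting' files per data_node (items 1 and 2)
    plus a batch of up to 100 matching files per data_node (item 3)."""

    BATCH_SIZE = 100

    def __init__(self, conn):
        self.conn = conn
        self.max_priority = {}            # data_node -> highest 'waiting' priority
        self.batches = defaultdict(list)  # data_node -> cached list of file rows
        self.refresh_priorities()

    def refresh_priorities(self):
        # One grouped query replaces a full sort of every waiting file.
        cur = self.conn.execute(
            "SELECT data_node, MAX(priority) FROM file "
            "WHERE status='waiting' GROUP BY data_node")
        self.max_priority = dict(cur.fetchall())

    def waiting_data_nodes(self):
        # Item 2: the data_nodes with waiting files fall out of the same cache.
        return list(self.max_priority.keys())

    def next_file(self, data_node, _retried=False):
        # Items 1 and 3: serve from the cached batch, refilling when empty.
        if not self.batches[data_node]:
            prio = self.max_priority.get(data_node)
            if prio is not None:
                cur = self.conn.execute(
                    "SELECT file_id, checksum FROM file "
                    "WHERE status='waiting' AND data_node=? AND priority=? "
                    "ORDER BY checksum LIMIT ?",
                    (data_node, prio, self.BATCH_SIZE))
                self.batches[data_node] = cur.fetchall()
            if not self.batches[data_node]:
                # Cached priority is stale or exhausted: recompute once and retry.
                if _retried:
                    return None
                self.refresh_priorities()
                return self.next_file(data_node, _retried=True)
        return self.batches[data_node].pop(0)
```

The point of the batch is that a stale or lost cache is harmless: the worst case is one extra grouped query, after which the caller again gets files at the (new) highest priority.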
When I made the above changes, the overall download rate went up to the 210-280 MiB/s range. That is, our data transfer rate increased by a factor of 3 to 7. Those who have smaller databases, smaller backlogs of 'waiting' files, or run fewer parallel downloads, will see less of an improvement, maybe much less. But I think that performance will be better in most cases, and never worse.
I am currently cleaning up the code changes and will issue a pull request when satisfied with them.