dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0
46 stars 107 forks

Profile RucioConMon memory #12089

Open amaltaro opened 2 months ago

amaltaro commented 2 months ago

Fixes #

Status

<In development | not-tested | on hold | ready>

Description

#### Is it backward compatible (if not, which system it affects?)

#### Related PRs

#### External dependencies / deployment changes
cmsdmwmbot commented 2 months ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15179/artifact/artifacts/PullRequestReport.html

cmsdmwmbot commented 2 months ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15180/artifact/artifacts/PullRequestReport.html

cmsdmwmbot commented 2 months ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15197/artifact/artifacts/PullRequestReport.html

cmsdmwmbot commented 2 months ago

Jenkins results:

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/15198/artifact/artifacts/PullRequestReport.html

amaltaro commented 2 months ago

I finally managed to do memory measurements with the code currently provided in this PR. It compares the current RucioConMon implementation in the WMCore stack in two scenarios:

1. fetching compressed data from RucioConMon (`format=raw`, yielding line by line)
2. fetching uncompressed data from RucioConMon (`format=json`, loading the whole payload in memory)

I ran these tests on vocms0259, so that I could measure memory usage on the node with the Grafana host monitor. See the screenshot below (memory_rucio_conmon):

Some observations are:

As I was running memory_profiler as well, here is a breakdown of `format=raw` vs `format=json`, where we can see that the application memory barely changes in the raw/compressed implementation, but has a multi-GB memory footprint in the json one (no generator).

format=raw (also faster!)

(WMAgent-2.3.4.3) [xxx@xxx:install]$ python testRucioConMonMem.py 
2024-09-06 19:49:28,773:INFO:testRucioConMonMem: Fetching unmerged dump for RSE: T1_US_FNAL_Disk with compressed data: True
2024-09-06 19:49:28,788:INFO:testRucioConMonMem: Fetching data from Rucio ConMon for RSE: T1_US_FNAL_Disk.
2024-09-06 19:49:28,802:INFO:RucioConMon: Size of rseUnmerged object: 11888

2024-09-06 20:48:44,553:INFO:testRucioConMonMem: Total files received: 10877227, unique dirs: 12885
Filename: testRucioConMonMem.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    27     36.0 MiB     36.0 MiB           1   @profile
    28                                         def getUnmergedFiles(rucioConMon, logger, compressed=False):
    29     36.0 MiB      0.0 MiB           1       dirs = set()
    30     36.0 MiB      0.0 MiB           1       counter = 0
    31     36.0 MiB      0.0 MiB           1       logger.info("Fetching data from Rucio ConMon for RSE: %s.", RSE_NAME)
    32     68.6 MiB     30.6 MiB    10877228       for lfn in rucioConMon.getRSEUnmerged(RSE_NAME, zipped=compressed):
    33     68.6 MiB      2.0 MiB    10877227           dirPath = _cutPath(lfn)
    34     68.6 MiB      0.0 MiB    10877227           dirs.add(dirPath)
    35                                                 #logger.info(f"Size of dirs object: {asizeof.asizeof(dirs)}")
    36     68.6 MiB      0.0 MiB    10877227           counter += 1
    37     68.6 MiB      0.0 MiB           1       logger.info(f"Total files received: {counter}, unique dirs: {len(dirs)}")
    38     68.6 MiB      0.0 MiB           1       return dirs

2024-09-06 20:48:44,555:INFO:testRucioConMonMem: Done!

format=json:

(WMAgent-2.3.4.3) [xxx@xxx:install]$ python testRucioConMonMem.py 
2024-09-06 21:12:16,810:INFO:testRucioConMonMem: Fetching unmerged dump for RSE: T1_US_FNAL_Disk with compressed data: False
2024-09-06 21:12:16,825:INFO:testRucioConMonMem: Fetching data from Rucio ConMon for RSE: T1_US_FNAL_Disk.
2024-09-06 21:20:38,011:INFO:RucioConMon: Size of rseUnmerged object: 2812956952
2024-09-06 22:18:50,841:INFO:testRucioConMonMem: Total files received: 10877227, unique dirs: 12885
Filename: testRucioConMonMem.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    27     35.8 MiB     35.8 MiB           1   @profile
    28                                         def getUnmergedFiles(rucioConMon, logger, compressed=False):
    29     35.8 MiB      0.0 MiB           1       dirs = set()
    30     35.8 MiB      0.0 MiB           1       counter = 0
    31     35.8 MiB      0.0 MiB           1       logger.info("Fetching data from Rucio ConMon for RSE: %s.", RSE_NAME)
    32   2946.0 MiB     24.6 MiB    10877228       for lfn in rucioConMon.getRSEUnmerged(RSE_NAME, zipped=compressed):
    33   2946.0 MiB      3.0 MiB    10877227           dirPath = _cutPath(lfn)
    34   2946.0 MiB      0.2 MiB    10877227           dirs.add(dirPath)
    35                                                 #logger.info(f"Size of dirs object: {asizeof.asizeof(dirs)}")
    36   2946.0 MiB      0.0 MiB    10877227           counter += 1
    37     63.6 MiB  -2882.4 MiB           1       logger.info(f"Total files received: {counter}, unique dirs: {len(dirs)}")
    38     63.6 MiB      0.0 MiB           1       return dirs

2024-09-06 22:18:50,842:INFO:testRucioConMonMem: Done!
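The flat memory profile of the raw path comes from consuming a generator instead of a materialized list. This toy harness (the LFN layout and the `_cutPath`-style directory cut are made up for illustration, not the WMCore code) shows the two access patterns side by side:

```python
def lfns_as_list(n):
    # format=json analogue: materialize every LFN at once; memory grows with n
    return ["/store/unmerged/run%06d/file%06d.root" % (i % 100, i) for i in range(n)]

def lfns_as_generator(n):
    # format=raw analogue: yield one LFN at a time; resident memory stays flat
    for i in range(n):
        yield "/store/unmerged/run%06d/file%06d.root" % (i % 100, i)

def unique_dirs(lfns):
    # Mimics the _cutPath + set accumulation done in the profiled function
    dirs = set()
    for lfn in lfns:
        dirs.add(lfn.rsplit("/", 1)[0])
    return dirs

print(len(unique_dirs(lfns_as_generator(1000))))  # 100 unique run directories
```

Either source yields the same set of directories; only the peak memory differs.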

I will make sure these changes are reflected in https://github.com/dmwm/WMCore/pull/12059 and proceed with this development over there.

vkuznet commented 2 months ago

Alan, it seems to me that the actual issue is the accumulation of results in this for loop: `for lfn in rucioConMon.getRSEUnmerged(RSE_NAME, zipped=compressed):`. How about converting the code to a generator and letting the client process it? You may measure the size of the returned dirs; it is likely the constant size you observe in Grafana, which is unavoidable. JSON adds the overhead of loading the whole JSON payload into RAM.

amaltaro commented 2 months ago

The function rucioConMon.getRSEUnmerged(RSE_NAME, zipped=compressed) already returns a generator to the client, and in this example the client is testRucioConMonMem.py (I am running a test similar to what MSUnmerged does here). With that said, this line:

for lfn in rucioConMon.getRSEUnmerged(RSE_NAME, zipped=compressed):

is in the correct place, as this is where each lfn will be parsed and a decision made on what to do with it (on the client side). Please let me know if I misunderstood your comment, though.

vkuznet commented 2 months ago

Alan, the issue is in RucioConMon and Service modules. Here is my insight into its behavior:

My proposal is to stream data from makeRequest, which should be converted to a generator; to preserve backward compatibility you would need a new API for that. This way the code does not read the whole payload from the remote server, but rather reads it line by line and yields each line back to the client. Also, to avoid converting the data from a list to a set (for a unique set of LFNs), you have two options: either enforce a uniqueness constraint on the backend and avoid it on the client, or, if you use a cache file, let an external process handle uniqueness in the file, e.g. `sort lfns.txt | uniq > newfile.txt`, and switch your cache to the new file.
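The streaming idea can be sketched as a thin generator over any line-oriented binary source (an open HTTP response behaves like one); the function name here is hypothetical and not the actual Service/pycurl_manager API:

```python
import io

def streamLines(source):
    """Yield one decoded line at a time from a binary file-like object
    (e.g. an open HTTP response), never holding the full body in memory."""
    for rawLine in source:
        yield rawLine.decode("utf-8").rstrip("\n")

# Works the same over an HTTP response object or a local cache file;
# a BytesIO stands in for the remote server here.
fake_response = io.BytesIO(b"/store/unmerged/a.root\n/store/unmerged/b.root\n")
print(list(streamLines(fake_response)))
```

Uniqueness would then be delegated to the backend or to a shell pipeline over the cache file, as described above.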

From my observation, the current implementation of fetching data has an unavoidable memory footprint because the data is loaded into Python after the HTTP call: the larger the HTTP response, the larger the memory footprint the code has to deal with.

amaltaro commented 2 months ago

Valentin, you seem to have captured the flow of an HTTP call through the Service parent class well.

To add to what you described above, I think the actual data is loaded into memory in the base-most class (pycurl_manager), at these lines: https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Services/pycurl_manager.py#L342-L343

which can then be automatically decompressed as well, if the content is in gzip format. In that scenario, the cache file on the filesystem will no longer be binary either; it will be in the content type of the response object (json, text, etc).

To really minimize the memory footprint, we would have to stream data from server to client, fetching one row at a time. I don't know the exact details, but I guess the server would have to support newline-delimited (or similar) data streaming, and the connection between client and server would have to remain open until the client exhausts all the data in the response object.

This data streaming somewhat conflicts with the custom data caching we have implemented in the Services python package, so it needs to be carefully thought through and implemented.
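One way to reconcile streaming with the Services-level caching is to fill the cache file as a side effect of the first pass over the stream, instead of downloading everything before caching. A rough sketch (function name and cache layout are hypothetical, not the Service API):

```python
import os
import tempfile

def streamWithCache(source, cachePath):
    """Yield lines from a line-oriented source while also writing them to a
    local cache file, so a later pass can replay from the cache without
    re-contacting the server."""
    with open(cachePath, "w") as cache:
        for line in source:
            cache.write(line + "\n")
            yield line

# Usage: iterate once over the "remote" data; the cache fills as a side effect.
cacheFile = os.path.join(tempfile.gettempdir(), "rse_unmerged_cache.txt")
remoteLines = ["/store/unmerged/a.root", "/store/unmerged/b.root"]
seen = list(streamWithCache(remoteLines, cacheFile))
print(seen)
```

The client still sees one line at a time, and a repeated call could iterate over `cacheFile` with the same generator interface.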

vkuznet commented 2 months ago

What you are looking for is the NDJSON data format, which the server must implement; it is basically a list of JSON records separated by '\n'. This way the client can request such data (it can be in zipped format as well) and read one record at a time, similar to how the CSV data format is processed. The total amount of memory required for the entire set of records is then reduced to the size of a single record.
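Reading NDJSON one record at a time, including from a gzipped stream, can look like this (a generic sketch, not tied to any WMCore API; the record shape is invented):

```python
import gzip
import io
import json

def iterNdjson(fileObj):
    """Parse NDJSON (one JSON record per line) lazily; peak memory is
    bounded by the largest single record, not the whole payload."""
    for line in fileObj:
        line = line.strip()
        if line:
            yield json.loads(line)

# NDJSON works equally well compressed: gzip.open decompresses as a stream.
payload = b'{"lfn": "/store/unmerged/a.root"}\n{"lfn": "/store/unmerged/b.root"}\n'
gzBuffer = io.BytesIO(gzip.compress(payload))
with gzip.open(gzBuffer, "rt") as fin:
    records = list(iterNdjson(fin))
print(records)
```

In a real streaming client, `records` would not be materialized into a list; each record would be processed and discarded inside the loop.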

cmsdmwmbot commented 1 month ago

Can one of the admins verify this patch?