cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

MaxMemoryPreload monitoring shows noticeable increase when large number of packages are rebuilt #46359

Open makortel opened 2 weeks ago

makortel commented 2 weeks ago

Observed in https://github.com/cms-sw/cmssw/pull/46200#issuecomment-2394356850.

makortel commented 2 weeks ago

assign core

cmsbuild commented 2 weeks ago

New categories assigned: core

@Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild commented 2 weeks ago

cms-bot internal usage

cmsbuild commented 2 weeks ago

A new Issue was created by @makortel.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

makortel commented 2 weeks ago

Here is an example from https://github.com/cms-sw/cmssw/pull/46204#issuecomment-2392400600

image

makortel commented 2 weeks ago

With @Dr15Jones we hypothesized that FileInPath might be causing this increase.

Dr15Jones commented 1 week ago

Another reason could be that the PluginManager also reads the plugin cache from the workspace's lib directory, which would increase its working memory.

makortel commented 1 week ago

Another example from https://github.com/cms-sw/cmssw/pull/46388#issuecomment-2417639671

image

makortel commented 1 week ago

Interestingly, https://github.com/cms-sw/cmssw/pull/46241#issuecomment-2420668031, which was a test with full_cmssw = true (I checked both the compilation log and the unit test report to confirm that many packages were indeed rebuilt), does not show a noticeable increase.

image

makortel commented 3 days ago

I was able to reproduce locally a "rebuild of most of the universe in a developer area" leading to ~30 MB higher max memory used than "empty developer area".

makortel commented 3 days ago

I ran IgProf for both cases; below are the MEM_LIVE results after 10 events.

The cause is in edmplugin::CacheParser::read() (see baseline vs. rebuild). I believe it is this overload: https://github.com/cms-sw/cmssw/blob/b96fd024cd7818d2b3bd5ba01d3a3075943f3233/FWCore/PluginManager/src/CacheParser.cc#L120-L122

makortel commented 3 days ago

So the cost is that the developer area's .edmplugincache becomes large, and we read that into memory in addition to the .edmplugincache from the release area?

In my "rebuild" developer area the .edmplugincache is 1.3 MB, whereas the $CMSSW_RELEASE_BASE/lib/slc7_amd64_gcc12/.edmplugincache is 1.5 MB.
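For anyone wanting to repeat the size comparison in their own area, here is a minimal sketch. It assumes the cache sits at `<base>/lib/<arch>/.edmplugincache`, as in the release path quoted above, and that `CMSSW_BASE`, `CMSSW_RELEASE_BASE` and `SCRAM_ARCH` are set in the environment. The helper name `plugin_cache_sizes` is made up for illustration and is not part of CMSSW.

```python
import os
from pathlib import Path


def plugin_cache_sizes(base_dirs, arch, cache_name=".edmplugincache"):
    """Return {base_dir: size_in_bytes} for every <base>/lib/<arch>/<cache_name> found.

    Hypothetical helper: the directory layout is an assumption taken from
    the release-area path quoted in this thread.
    """
    sizes = {}
    for base in base_dirs:
        cache = Path(base) / "lib" / arch / cache_name
        if cache.is_file():
            sizes[str(base)] = cache.stat().st_size
    return sizes


if __name__ == "__main__":
    # Developer area first, then release area; skip variables that are not set.
    arch = os.environ.get("SCRAM_ARCH", "")
    bases = [os.environ[v] for v in ("CMSSW_BASE", "CMSSW_RELEASE_BASE") if v in os.environ]
    for base, size in plugin_cache_sizes(bases, arch).items():
        print(f"{base}: {size / 1e6:.1f} MB")
```

Note that file size on disk is only a rough proxy: the in-memory cost of the parsed cache may well be larger than the file itself.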

makortel commented 3 days ago

I wonder if there is an easy way to exclude the contribution from CacheParser::read() (other than starting the monitoring later in the job).

makortel commented 3 days ago

Instrumenting some of the PluginManager code, I see (in my rebuild area) that the job reads

So with nearly all of CMSSW built, most of the plugins have 3 entries in the CategoryToInfos: local, local poisoned, and release.
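To illustrate why the table grows, here is a toy model of a category-to-infos map that keeps one entry per cache in which a plugin appears. This is not the actual PluginManager code or its data structures: the function names, the origin labels and the plugin and category names are all hypothetical, and the only thing it is meant to show is the threefold growth in entries when every plugin is present in all three caches.

```python
from collections import defaultdict


def build_category_to_infos(caches):
    """Toy stand-in for CategoryToInfos (not the real CMSSW structure).

    caches: list of (origin, [(category, plugin_name), ...]) in search order.
    Returns {category: [(plugin_name, origin), ...]}, keeping every entry.
    """
    table = defaultdict(list)
    for origin, entries in caches:
        for category, plugin in entries:
            table[category].append((plugin, origin))
    return dict(table)


def entries_per_plugin(table):
    """Count how many entries each (category, plugin) pair has in the table."""
    counts = defaultdict(int)
    for category, infos in table.items():
        for plugin, _origin in infos:
            counts[(category, plugin)] += 1
    return dict(counts)


# Hypothetical plugins that were rebuilt locally and also exist in the release.
plugins = [("CategoryA", "ProducerX"), ("CategoryA", "AnalyzerY"), ("CategoryB", "SourceZ")]
caches = [("local", plugins), ("local poisoned", plugins), ("release", plugins)]

table = build_category_to_infos(caches)
counts = entries_per_plugin(table)
total_entries = sum(len(infos) for infos in table.values())
```

In this toy setup every plugin ends up with three entries, so the table holds three times as many entries as it would with the release cache alone.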