Clean cache feature request

sasobadovinac commented 7 years ago

I was thinking about an optimal solution for my case, running CI builds on AppVeyor, and have come to the conclusion that probably the best way would be to clean, at the end of a build, all those entries from the cache that were not used in the last build. I understand this is not useful for everyone and that some need to keep older entries; for cases like mine, however, I believe keeping just the last build in the cache would be best.

This should give me the smallest cache, which is useful for quickly saving and restoring it on AppVeyor. Also, all the issues that I had in the past happened when the cache got full. With the latest fixes and improvements clcache does seem to be running very well in general, but with a full cache things can still get a bit funny. Since this runs as CI on AppVeyor, it is very undesirable to push commits just to fix / reset clcache, so cleaning the cache continuously makes sense in this case.

So my idea is to set the cache size large enough, let's say twice the size of a normal build or a bit more, so that it can grow during the build, but then gets cleaned to a minimum at the end of each build and saved for the next build.

Would it be possible to have this option?

frerich commented 7 years ago

Isn't this already possible? After a build, you could use clcache -M to set the cache size to 50% of its original size and then clcache -c to trim the cache such that the least recently used objects are discarded.
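As a rough illustration of that sequence, the sketch below builds the two commands in Python. `-M` (maximum cache size in bytes) and `-c` (clean) are the clcache options mentioned above; the helper names `trim_commands` and `trim_cache` and the 50% default are assumptions made for illustration only.

```python
import subprocess


def trim_commands(original_max_bytes, fraction=0.5):
    """Build the clcache calls that shrink the limit and then trim the cache."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    new_max = int(original_max_bytes * fraction)
    return [
        ["clcache", "-M", str(new_max)],  # lower the maximum cache size
        ["clcache", "-c"],                # trim least recently used entries
    ]


def trim_cache(original_max_bytes, fraction=0.5):
    """Run the commands; requires clcache on PATH."""
    for command in trim_commands(original_max_bytes, fraction):
        subprocess.run(command, check=True)
```

At the start of the next build the limit would have to be raised again with another `clcache -M` call so that the cache has room to grow.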

sasobadovinac commented 7 years ago

Oh, so the original idea above was to somehow monitor the files and clean exactly those that were not used during the build, but I was thinking that if this was somehow too hard to do, then just having the possibility to set a percentage in the -c option (for example -c50 to clean 50% or -c30 to clean 30%) could possibly also work... But I did not think of the possibility of using the existing -c in combination with -M like you described, setting -M high at the beginning of the build and setting it lower at the end before the clean. It does sound as if it could work, I will do some tests... Thank you! :)
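For reference, the original idea (drop exactly the entries that the last build did not touch) could be sketched roughly like this. This is not how clcache works: the flat directory layout, the `used_keys` set and the function name `prune_unused` are assumptions for illustration only.

```python
import os
import shutil


def prune_unused(cache_dir, used_keys):
    """Delete every top-level entry in cache_dir whose name is not in used_keys.

    Returns the sorted names of the removed entries.
    """
    removed = []
    for name in sorted(os.listdir(cache_dir)):
        if name in used_keys:
            continue
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
        removed.append(name)
    return removed
```

The hard part is not the deletion but collecting `used_keys` reliably during a parallel build, which is what the "monitor the files" step would have to provide.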

sasobadovinac commented 7 years ago

Unfortunately it is not working as expected, and it is a similar (same?) issue as with the full cache, where I said above that it starts to behave a bit funny. Since I am not that good and fast at reading the code (to understand how exactly clcache.py works in this case), I am giving some assumptions here about where I believe the problem is, but I might be wrong...

I guess simply cleaning the oldest entries is not the best strategy, since they might very well still be useful. This would make sense if during the build, the time stamp of the used existing entries would get updated, so then the oldest by time stamp would indeed be the not used ones. But from my tests this does not seem to be the case and it looks like the wrong entries (those that are still useful) get cleaned.

Here is an example from the last builds, where the above strategy was added (cleaning 50% at the end of the build). We can see that the next build took longer, but it should not. And we can see that even after rebuilding from the same commit the build time does not drop:
https://ci.appveyor.com/project/sasobadovinac/freecad/build/1.0.464
https://ci.appveyor.com/project/sasobadovinac/freecad/build/1.0.465
https://ci.appveyor.com/project/sasobadovinac/freecad/build/1.0.466

And here is an example where the cache is full (but large enough to hold the full build); still, rebuilding from the same commit does not bring the build time down to the expected ~15 min:
https://ci.appveyor.com/project/sasobadovinac/freecad/build/1.0.404
https://ci.appveyor.com/project/sasobadovinac/freecad/build/1.0.405
https://ci.appveyor.com/project/sasobadovinac/freecad/build/1.0.406

Such things would be expected to happen in cases where the configured cache size is smaller than the size of the full build cache, but not in the above examples.

frerich commented 7 years ago

> I guess simply cleaning the oldest entries is not the best strategy, since they might very well still be useful. This would make sense if during the build, the time stamp of the used existing entries would get updated, so then the oldest by time stamp would indeed be the not used ones. But from my tests this does not seem to be the case and it looks like the wrong entries (those that are still useful) get cleaned.

When cleaning the cache, clcache uses the atime (i.e. the date of last access) to determine the 'oldest' entries. So in principle, it should work just like you describe. I wonder whether maybe you're cleaning too much, or the atime is not actually set correctly on the AppVeyor VMs.
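To make the atime-based ordering concrete, here is a minimal sketch of such a trim. It is not clcache's actual implementation; `trim_by_atime` and its parameters are made up for illustration, and it simply treats every file below the directory as one cache entry.

```python
import os


def trim_by_atime(cache_dir, target_bytes):
    """Remove files, least recently accessed first, until the total size
    is at most target_bytes. Returns the paths that were removed."""
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            info = os.stat(path)
            entries.append((info.st_atime, path, info.st_size))

    total = sum(size for _atime, _path, size in entries)
    removed = []
    for _atime, path, size in sorted(entries):  # oldest atime first
        if total <= target_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

If the atime is never refreshed on a cache hit, this ordering degenerates to roughly "oldest created first", so entries that are still in use can be removed.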

sasobadovinac commented 7 years ago

I have all this set up only on AppVeyor, so I cannot quickly test whether this is only the case for the AppVeyor VM. Did you test this, are you sure it works as it should on a normal system? The funny thing is, for example, that if I look at the created, modified and accessed times of general files on my Windows system, the created and accessed times are always the same :\ only modified is changed for some files, I guess those that have been modified :)

The build cache is about 750 MB for the x64 build and about 500 MB for x86; all tests above should have had the cache set to at least 1 GB, so that should not be the problem...

frerich commented 7 years ago

Did you enable setting the last access time? It's disabled by default on Windows Vista and later, see https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance/
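A quick way to check whether a machine updates access times at all is to probe it. The sketch below is only a hint, not a definitive test: `atime_updates_on_read` is a hypothetical helper, it probes the volume that holds the temporary directory (which may not be the volume holding the cache), and some file systems delay or coarsen atime updates.

```python
import os
import tempfile
import time


def atime_updates_on_read():
    """Return True if reading a file moved its atime forward on this volume."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"probe")
        # Push both timestamps two days into the past so that policies
        # such as Linux "relatime" would still refresh atime on a read.
        past = time.time() - 2 * 24 * 3600
        os.utime(path, (past, past))
        before = os.stat(path).st_atime
        with open(path, "rb") as handle:
            handle.read()
        return os.stat(path).st_atime > before
    finally:
        os.remove(path)
```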

sasobadovinac commented 7 years ago

Reading a bit about this on the internet, it seems to be common (the default) for this to be turned off, to increase disk performance... I doubt that enabling this will work on AppVeyor.

sasobadovinac commented 7 years ago

I will test this a bit, but how about updating the ctime or atime or mtime from Python? That would be more robust than relying on a system setting that is off by default.

frerich commented 7 years ago

I'm not even sure it's possible to trigger the atime from a program if the operating system functionality is disabled.

It would surely be possible to keep track of the last access time for each cache entry, but of course that's an additional runtime cost which would slow down cache hits. Some profiling would be required to gauge the impact.
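One way to record accesses from the program itself would be to set the timestamp explicitly on every cache hit. The sketch below uses `os.utime` for that; whether this behaves as intended on a volume where automatic last-access updates are disabled is an assumption that would need testing, and `mark_used` is a hypothetical helper, not part of clcache.

```python
import os
import time


def mark_used(path, now=None):
    """Record a cache hit by setting atime explicitly, leaving mtime untouched."""
    if now is None:
        now = time.time()
    mtime = os.stat(path).st_mtime
    os.utime(path, (now, mtime))
```

The extra `os.stat` and `os.utime` calls per hit are exactly the kind of runtime cost that would need profiling.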

sasobadovinac commented 7 years ago

So, from the tests that I did, it seems like enabling the system atime on AppVeyor will not work, as it seems that one has to reboot the system after running fsutil behavior set disablelastaccess 0...

On the other hand, if it is somehow possible to update these times (from Python, for example), I don't really see why it would be a problem to change from using atime to ctime or mtime, if updating the atime does not work. Another way would indeed be to track this in some other way.

I was also wondering whether this could be causing some of the other issues as well, since they have mostly shown up when the cache got full, and since reliably cleaning the cache seems quite important for this application.

frerich commented 7 years ago

Improvements to how clcache tracks cache accesses are surely possible, but I think this is getting a little out of scope for this particular issue ticket. I wonder: is (re-)storing the cache (assuming it has a plausible size, i.e. not too large) really a significant slowdown for the build, or are there maybe better things to optimise? I'm not sure this feature is generally useful, e.g. one popular use case of clcache is when you're switching your source tree between two or more branches, e.g. a 'stable' and a 'development' branch. In such cases, it's common that there are very few objects shared between builds n and n+1, but when switching back (i.e. in build n+2) there is a lot of reuse.

So a feature which does what you suggest seems a little specific for your particular use case.

siu commented 7 years ago

It is quite surprising to learn that access times are not updated by default on newer versions of Windows. In that case the cleaning feature is not working as expected. I think it would make sense to document this in the README or in the Caveats section, thoughts?

frerich commented 7 years ago

Yes, a PR which at least updates the README and/or the Wiki (the Caveats section) would be good until an alternative solution is available.

slodki commented 5 years ago

> I'm not sure this feature is generally useful, e.g. one popular use case of clcache is when you're switching your source tree between two or more branches, e.g. a 'stable' and a 'development' branch. In such cases, it's common that there are very few objects shared between builds n and n+1, but when switching back (i.e. in build n+2) there is a lot of reuse.
>
> So a feature which does what you suggest seems a little specific for your particular use case.

I set CLCACHE_DIR to something like c:\cachedir_%branch%_%compiler% when using clcache with containers on a CI platform, to cache only the files related to a specific test case. Separate caches exist and are restored only for a particular container configuration: cache entries are not mixed and not overwritten, and the cache size can be controlled for each build scenario.
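A small sketch of how such a directory name could be derived in a build script. `CLCACHE_DIR` is the environment variable mentioned above; the helper names, the base path default and the character replacement rule are assumptions for illustration.

```python
import os
import re


def cache_dir_for(branch, compiler, base=r"c:\cachedir"):
    """Build a cache directory name that is unique per branch and compiler."""
    def safe(text):
        # Replace characters that are awkward in directory names, e.g. "/".
        return re.sub(r"[^A-Za-z0-9.-]+", "-", text)
    return "{}_{}_{}".format(base, safe(branch), safe(compiler))


def use_cache_dir(branch, compiler):
    """Point clcache at the per-configuration directory via CLCACHE_DIR."""
    os.environ["CLCACHE_DIR"] = cache_dir_for(branch, compiler)
    return os.environ["CLCACHE_DIR"]
```

Calling `use_cache_dir("feature/x", "msvc14")` before starting the build would make any clcache process started from that script pick up the per-configuration directory.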

One big cache is better for a local developer workstation but multiple separate caches per container/branch/whatever are better for CI platforms.

abdullahtahiriyo commented 5 years ago

Hi there,

I got this idea:

  - ps: fsutil behavior set disablelastaccess 0 # Enable Access time feature on Windows (for clcache)
  - msbuild FreeCAD_Trunk.sln /p:TrackFileAccess=false /p:CLToolExe=clcache.exe /p:CLToolPath=c:\Python37\Scripts\ /m
  - ps: fsutil behavior set disablelastaccess 1

from the bitcoin guys: https://github.com/bitcoin/bitcoin/blob/master/.appveyor.yml

and appears to be working:

clcache -s
clcache statistics:
  current cache dir         : Disk cache at C:\Users\appveyor\clcache
  cache size                : 791,913,646 bytes
  maximum cache size        : 1,073,741,824 bytes
  cache entries             : 1143
  cache hits                : 1136
  cache misses
    total                      : 1
    evicted                    : 0
    header changed             : 1
    source changed             : 0
  passed to real compiler
    called w/ invalid argument : 0
    called for preprocessing   : 0
    called for linking         : 0
    called for external debug  : 0
    called w/o source          : 0
    called w/ multiple sources : 0
    called w/ PCH              : 18
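To track this across CI builds, the hit rate can be computed from the statistics text. The sketch below parses output shaped like the listing above; `hit_rate` is a hypothetical helper, and the regular expressions assume exactly the field labels shown there.

```python
import re


def hit_rate(stats_text):
    """Return hits / (hits + misses) parsed from `clcache -s` style output."""
    def field(pattern):
        match = re.search(pattern, stats_text, re.MULTILINE)
        if match is None:
            raise ValueError("field not found: " + pattern)
        return int(match.group(1).replace(",", ""))

    hits = field(r"^\s*cache hits\s*:\s*([\d,]+)")
    misses = field(r"^\s*cache misses\s*\n\s*total\s*:\s*([\d,]+)")
    total = hits + misses
    return hits / total if total else 0.0
```

For the numbers above (1136 hits, 1 miss) this gives 1136 / 1137.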

frerich / clcache

Clean cache feature request #253