Open davelab6 opened 7 years ago
Well, at the moment we have clearly defined memory leaks, they are called caches ;-)
We should be freeing up memory as we go
see also:
There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton
(transferred from https://github.com/googlefonts/fontbakery/issues/1609#issuecomment-335632139)
It seems to me that we should be freeing up memory as we go.
Yeah, that's clearly due to the caching of the `@condition`s. I guess we'll have to be smart here, because not caching the ttx-objects will free RAM but cause more I/O and fonttools parsing.
pytest has `@fixture`, which is very similar in concept to our `@condition`; they can specify a `scope` in which a fixture is kept alive.
But maybe something simpler will do, like a general `dont-cache` flag on the CLI. This should be another issue.
... I can notice that the tests run slower and slower
Is it because of swapping?
May be. I am not sure, but that theory makes sense.
LOL very good :)
We should add a note that this is a known limitation / tradeoff from the caching we made, to the README, then :)
But it is not a matter of just documenting the caching behaviour. We should also try to reduce excessive caching that makes the program die or become sluggish.
We should also try to reduce excessive caching that makes the program die or become sluggish.
Sure, we need to do something, but it's not easy. E.g. as the caching behavior is now, it's totally fine to test one font family, because the RAM of all our computers is easily big enough. It's the fastest thing we can do: we have many reads on the same cached ttfont objects. But in a case where we run one test for a large number of fonts, i.e. the whole collection, the cache could basically be turned off without any speed issues, as a ttfont is needed only once in the lifetime of the full test run, and the memory problem would vanish.
So our real-world example shows us that caching depends on the actual use case. We may need to introduce different caching modes that can be chosen per use case.
Eg, the runner runner can pass an argument to moderate or disable the runner's caching?
Eg, the runner runner can pass an argument to moderate or disable the runner's caching?
Yeah, I'd go with a CLI argument first. We could try to infer a strategy from the test execution order, which we calculate before running the tests, but that's maybe overkill just now. But in short: we can know which caches are going to be used in the future of a test run and which are no longer interesting. Thus we could prune caches as soon as they are obsolete. This would be pretty smart, but still no solution for a case where all ttFonts are used at once. See the following.
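The pruning idea could look roughly like the sketch below. It rests on assumptions: `execution_order`, `load_font` and `run_check` are hypothetical stand-ins (not fontbakery APIs), and the execution order is assumed to be a list of (check, font) pairs that is known before the run starts.

```python
def last_use_index(execution_order):
    """Map each font to the index of the last step that needs it."""
    last = {}
    for index, (_check, font) in enumerate(execution_order):
        last[font] = index
    return last


def run_with_pruning(execution_order, load_font, run_check):
    """Run all steps, dropping a cached font right after its last use.

    Returns (peak number of cached fonts, number left in the cache).
    """
    last = last_use_index(execution_order)
    cache = {}
    peak = 0
    for index, (check, font) in enumerate(execution_order):
        if font not in cache:
            cache[font] = load_font(font)
        run_check(check, cache[font])
        peak = max(peak, len(cache))
        if last[font] == index:
            # No later step needs this font, so its cache entry is obsolete.
            del cache[font]
    return peak, len(cache)
```

When the order is grouped by font, at most one font is cached at a time. As noted, this does not help a check that needs all fonts at once.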
I toyed a bit with this. It turns out `com.google.fonts/test/040` is a very interesting case, with many issues when it comes to testing the whole collection.
- `vmetrics` looks at all of the fonts at once. If we turn off caching for `ttFont` and `vmetrics`, the execution time really gets worse, because we still read all fonts once per test (and `get_bounding_box` in `vmetrics` seems to be expensive too; though since fonttools is reading lazily, this is hard to measure).
- Computing `vmetrics` for each test is worse than caching its result once; in the whole collection case, caching `vmetrics` and not caching `ttFont` seems best. `ttFont` is heavy on memory, but `vmetrics` is heavy on complexity (O(n) vs. O(n²) if not cached). Turning off caching completely will make this very slow.
- Turning caching off for `ttFont` and keeping it on for `vmetrics` hits the sweet spot for this test (since 8ce7663): memory stays low and the execution time stays the same.
Thus, maybe the CLI argument can turn off caching for specific conditions, and that way make difficult test runs at least possible.
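To make the O(n) vs. O(n²) point concrete, here is a small sketch that counts how often each condition is computed when caching is switched off per condition. Everything in it is hypothetical: the `condition` decorator, the `uncached` set and the stand-in font objects only imitate the idea and are not fontbakery code.

```python
import functools


def condition(name, uncached):
    """Memoize a condition unless its name is listed in `uncached`."""
    def decorate(func):
        if name in uncached:
            return func
        return functools.lru_cache(maxsize=None)(func)
    return decorate


def run_check_on_all(paths, uncached):
    """Run one collection-wide check per font and count the computations."""
    calls = {"ttFont": 0, "vmetrics": 0}

    @condition("ttFont", uncached)
    def tt_font(path):
        calls["ttFont"] += 1
        return {"path": path, "ymax": len(path)}  # stand-in for a parsed font

    @condition("vmetrics", uncached)
    def vmetrics(all_paths):
        calls["vmetrics"] += 1
        return max(tt_font(p)["ymax"] for p in all_paths)

    paths = tuple(paths)  # hashable, so it can be used as a cache key
    for _font in paths:   # one check execution per font ...
        vmetrics(paths)   # ... each needing the collection-wide value
    return calls
```

With four fonts, `uncached={"ttFont"}` reads every font once and computes `vmetrics` once, while `uncached={"ttFont", "vmetrics"}` computes `vmetrics` four times and reads fonts sixteen times.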
Now the good part:
Investigating this, I figured out that we had many cache misses for `vmetrics` (it had a new cache key for each font, but was not using the current font), which made this test very slow for large sets of fonts and used a bit more memory. I addressed this in f8bef08 of #1612. This should speed up the overall normal test runs a bit as well.
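The kind of cache miss described here can be reproduced with a toy cache. This is an illustration only, not the code from f8bef08: the key functions and the computation are made up, and the point is just that a key which varies per font defeats the cache when the value does not depend on that font.

```python
def count_computations(fonts, key_for):
    """Look up a collection-wide value once per font; count real computations."""
    cache = {}
    computations = 0
    for font in fonts:
        key = key_for(font)
        if key not in cache:
            computations += 1
            # The value depends on all fonts, never on the current one.
            cache[key] = max(len(f) for f in fonts)
    return computations


def per_font_key(font):
    # Key changes with every font: each lookup is a miss.
    return ("vmetrics", font)


def shared_key(font):
    # Key ignores the current font: one miss, then only hits.
    return ("vmetrics",)
```

For three fonts, `per_font_key` triggers three computations and stores three copies of the same value; `shared_key` triggers one.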
In 8ce7663 I made "derived iterables" into generators, which is a precondition for effective CLI control.
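Why generators matter for this can be shown in a few lines. The sketch below is not the code from 8ce7663; `iter_fonts` and `load_font` are hypothetical names. It only shows that a generator loads an item when it is asked for, so nothing forces the whole list of fonts into memory up front.

```python
def iter_fonts(paths, load_font):
    """Yield one loaded font at a time instead of building a full list."""
    for path in paths:
        yield load_font(path)


def max_of_fonts(paths, load_font, measure):
    """Reduce over all fonts while holding only one loaded font at a time."""
    return max(measure(font) for font in iter_fonts(paths, load_font))
```

A generator can be consumed only once, so each pass over the fonts has to create a fresh one. Without a cache that means re-reading the files, which is the I/O versus memory tradeoff discussed above.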
Hmm. It says
for ttFont in ttFonts:
So, when the root input to the runner is the collection-wide set of ttFonts, then this is not intended; the vmetrics for loop's ttFonts list should be scoped to the superfamily, since it's the superfamily that we want to have the same v metrics, such that swapping between families (e.g. Alegreya, Alegreya SC, Alegreya Sans, and Alegreya Sans SC), and between weights within any of those, doesn't change any line heights visually.
ttFonts in the vmetrics conditions is meant to be "all font files in a given family", so that the overall vertical metrics of the whole family is computed.
yeah, super-family
The collection-wide/super-family discussion is #1609; please talk about caching here and about the other thing there ;-)
(updated: issue reference above)
So, when the root input to the runner is the collection-wide set of ttFonts, then this is not intended;
Right, but "collection-wide set of ttFonts" was not intended in the first place, because fontbakery was about testing a family at a time :-) This is so #1609, but I can't let this slip uncommented. We are re-defining the scope of fontbakery, I hope you guys understand this.
We are re-defining the scope of fontbakery, I hope you guys understand this.
This scope was defined a long time ago:
https://github.com/googlefonts/fontbakery/blob/master/BRIEF.md#31-check-the-entire-collection
OK, but it was never implemented. I'm talking about the implementation and actual changes to it.
Mmm. Is everything from https://github.com/googlefonts/fontbakery/blob/master/BRIEF.md#2-onboarding-new-and-updated-families now working?
If not, we should postpone working on this, and Felipe should buy some RAM to stop it swapping ;)
2.5 seems to be missing; the rest is working, if 2.3 is ignored (which we chose to do, I think).
We do have a few fontbakery checks actually performing regression testing (2.5), but not by comparing TTXs. Instead, the test fetches production font files from GFonts and compares them to the local ones.
Should the caching of that stuff be improved?
(2.3 hotfixing tools have been moved out of the checker scripts to their own standalone scripts, and 2.5 comparing to avoid regressions, is done by marc's tool, and/or using the commands listed in the document.)
I suppose to clear the cache between families you'd need to 1. replace the iterargs etc. of a spec after each family run 2. reset the cache. Maybe doable from the CheckRunner? Family splitting can be done with something like https://github.com/googlefonts/fontbakery/issues/1904#issuecomment-395419260 adapted to OT. Or maybe you need to rip iterargs etc. from Spec and generate iterargs etc. in CheckRunner and pass them into any function that needs it.
Maybe we should make CheckRunner into a Family-CheckRunner (just a name change really, or even better don't change the name, just use the concept!), and add a CollectionCheckRunner, that handles creation and destruction of CheckRunners per (sub-)family, aggregation of results and also management of super-family/collection spanning conditions/values. That way, we get cache-invalidation and change of iterargs (when discarding a finished CheckRunner and instantiating a new one) and don't need to fiddle much with the working stuff.
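A rough sketch of that split follows. It is not fontbakery code: `FamilyRunner` and `CollectionRunner` are hypothetical stand-ins for CheckRunner and the proposed CollectionCheckRunner, grouping families by directory is an assumption, and "parsing" is faked with `str.upper()`. The point is only that the cache lives and dies with each per-family runner.

```python
import os
from collections import defaultdict


class FamilyRunner:
    """Hypothetical per-family runner that owns its own cache."""

    def __init__(self, fonts):
        self.fonts = fonts
        self.cache = {}

    def run(self, checks):
        results = []
        for check in checks:
            for font in self.fonts:
                if font not in self.cache:
                    self.cache[font] = font.upper()  # stand-in for parsing
                results.append((check.__name__, font, check(self.cache[font])))
        return results


def group_by_family(paths):
    """Assumption: all files of one family live in the same directory."""
    families = defaultdict(list)
    for path in paths:
        families[os.path.dirname(path)].append(path)
    return families


class CollectionRunner:
    """Creates one FamilyRunner per family and aggregates the results."""

    def __init__(self, paths):
        self.families = group_by_family(paths)

    def run(self, checks):
        results = []
        for _family, fonts in sorted(self.families.items()):
            runner = FamilyRunner(fonts)
            results.extend(runner.run(checks))
            # Dropping the last reference lets the runner's cache be collected.
            del runner
        return results
```

Conditions that span a super-family or the whole collection would need a home in the outer runner; the sketch leaves that part out.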
- replace the iterargs etc. of a spec after each family run
AND
Or maybe you need to rip iterargs etc. from Spec and generate iterargs etc. in CheckRunner and pass them into any function that needs it.
A Spec is not changed during a CheckRunner execution; only CheckRunner changes its state (cache and its reporters) and is instantiated with specific values (prepared/described as `ExpectedValues` in the Spec) for the run.
generate iterargs etc. in CheckRunner and pass them into any function that needs it.
Isn't it done in CheckRunner? All the information ever stored in a Spec for an iterarg really is its mapping, like `'font' => 'fonts'`. There are some methods, though, that use information provided by CheckRunner for iterargs-related stuff (generating the execution order), but that's not stored within the Spec.
Note: https://github.com/bloomberg/memray may help with identifying memory hogs. Sometimes, it's things we don't suspect.
Observed behaviour
Memory is allocated and not freed: per https://github.com/googlefonts/fontbakery/issues/1609#issuecomment-335630171, as the fontbakery checker progresses through the batch of 2400 TTF files in GF, the checks visibly run slower and slower.
Expected behaviour
We should be freeing up memory as we go
We should be running a code profiler before every release to make sure that we don't increase the memory requirements unexpectedly.
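One low-effort way to watch for this is the standard library's `tracemalloc` module. The sketch below is only an assumption about how such a pre-release check could look: `peak_memory_bytes` and the toy workload are made-up names, and a real check would call the actual check run instead.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return (result, peak traced allocation in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def run_with_growing_cache(n):
    """Toy workload: a cache that keeps one 10 kB buffer per item."""
    cache = {}
    for i in range(n):
        cache[i] = bytearray(10_000)
    return len(cache)
```

Comparing the measured peak against a fixed budget in a test would flag an unexpected increase. `tracemalloc` only sees allocations made through Python's allocators, so memory held by C extensions may not show up.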
Resources and exact process needed to replicate
https://zapier.com/engineering/profiling-python-boss/ https://github.com/rkern/line_profiler etc