Profile codebase and reduce memory hogging

davelab6 commented 7 years ago

Observed behaviour

Memory is allocated and not freed, since per https://github.com/googlefonts/fontbakery/issues/1609#issuecomment-335630171 when fontbakery checker progresses in the batch of 2400 TTF files in GF, human perception can see checks running slower and slower

Expected behaviour

We should be freeing up memory as we go

We should be running a code profiler before every release to make sure that we don't increase the memory requirements unexpectedly.

Resources and exact process needed to replicate

https://zapier.com/engineering/profiling-python-boss/ https://github.com/rkern/line_profiler etc

graphicore commented 7 years ago

Well, at the moment we have clearly defined memory leaks, they are called caches ;-)

We should be freeing up memory as we go

see also:

There are only two hard things in Computer Science: cache invalidation and naming things.

-- Phil Karlton

(via)

graphicore commented 7 years ago

(transfered from https://github.com/googlefonts/fontbakery/issues/1609#issuecomment-335632139)

It seems to me that we should be freeing up memory as we go.

Yeah, that's clearly due to the caching of the @conditions I guess we'll have to be smart here, because not caching the ttx-objects will free ram but cause more i/o and fonttools parsing.

pytest has @fixture which is very similar in concept as our @condition, they can specify a scope in which a fixture is kept alive.

But maybe something simpler will do, like a general dont-cache flag on the CLI. This should be another issue.

... I can notice that the tests run slower and slower

Is it because of swapping?

felipesanches commented 7 years ago

Is it because of swapping?

May be. I am not sure, but that theory makes sense.

davelab6 commented 7 years ago

LOL very good :)

We should add a note that this is a known limitation / tradeoff from the caching we made, to the README, then :)

felipesanches commented 7 years ago

But it is not a matter of just documenting the caching behaviour. We should also try to reduce excessive caching that makes the program die or become sluggish.

graphicore commented 7 years ago

We should also try to reduce excessive caching that makes the program die or become sluggish.

Sure we need to do something; but it's not easy. E.g. as the caching behavior is now it's totally fine to test one font-family because the RAM of all our computers is easily big enough. It's the fastest thing we can do, we have many reads on the same, cached ttfont objects. But In a case where we run one test for a big amount of fonts, i.e. the whole collection, the cache could basically be turned off, without any speed issues, as a ttfont is needed only once in the lifetime of the full test run and the memory problem would vanish.

So, our real world example shows us that caching is dependent on the actual use case. We may need to introduce different modes for caching that can be chosen from on a use case base.

davelab6 commented 7 years ago

Eg, the runner runner can pass an argument to moderate or disable the runner's caching?

On Oct 11, 2017 8:28 AM, "Lasse Fister" notifications@github.com wrote:

We should also try to reduce excessive caching that makes the program die or become sluggish.

Sure we need to do something; but it's not easy. E.g. as the caching behavior is now it's totally fine to test one font-family because the RAM of all our computers is easily big enough. It's the fastest thing we can do, we have many reads on the same, cached ttfont objects. But In a case where we run one test for a big amount of fonts, i.e. the whole collection, the cache could basically be turned off, without any speed issues, as a ttfont is needed only once in the lifetime of the full test run and the memory problem would vanish.

So, our real world example shows us that caching is dependent on the actual use case. We may need to introduce different modes for caching that can be chosen from on a use case base.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/googlefonts/fontbakery/issues/1610#issuecomment-335793293, or mute the thread https://github.com/notifications/unsubscribe-auth/AAP9y4tw9hJx05Rl05Sxi3vAOB-sLkfKks5srLR-gaJpZM4P0sN2 .

graphicore commented 7 years ago

Eg, the runner runner can pass an argument to moderate or disable the runner's caching?

Yeah, I'd go with an CLI argument first. We could try to infer a strategy from the test execution order, which we calculate before running the tests, but that's maybe overkill just now. But in short: we can know which caches are going to be used in the future of a test run and which are no longer interesting. Thus we could prune caches as soon as they are obsolete. This would be pretty smart, but still no solution for a case where all ttFonts are used at once. see the following.

I toyed a bit with this, turns out: com.google.fonts/test/040 is a very interesting case with many issues when it comes to testing the whole collection.

https://github.com/googlefonts/fontbakery/blob/f96d45e1b009583d23bfa1aeb4e7c54bdada85ee/Lib/fontbakery/specifications/googlefonts.py#L1395-L1426

Bad thing is, it makes no sense to run it for the whole collection ("Checking OS/2 usWinAscent & usWinDescent"), so this is purely to gather a better understanding and to find bugs.
It reliably fills up the memory if run for the whole collection. vmetrics looks at all of the fonts at once. If we turn off caching for ttFont and vmetrics the execution time really gets worse because we still read all fonts once per test (and get_bounding_boxin vmetrics seems to be expensive too; though since fonttools is reading lazily, this is hard to measure).
The point is, that re-running vmetrics for each test is worse than caching its result once in the whole collection case, caching vmetrics and not caching ttFont seems best for the whole collection scenario. ttFont is heavy on the memory, but vmetrics is heavy on the complexity (O(n) vs. O(n²) if not cached). Turning off caching completely will make this very slow.
Turning off caching for ttFont and keeping it on for vmetrics hits the sweet spot here for this test (since 8ce7663): memory stays low, exec time the same

Thus, maybe the CLI argument can turn off caching for specific conditions, and that way make difficult test runs at least possible.

Now the good part:

Investigating this, I figured out that we had many cache misses for vmetrics (had a new cache-key for each font, but was not using the current font) which made this test very slow for large sets of fonts and used a bit more memory. I addressed this in f8bef08 of #1612 This should speed up the overall normal test runs a bit as well.

In 8ce7663 I made "derived iterables" into generators, which is a precondition to make an effective CLI control

davelab6 commented 7 years ago

Hmm. It says

for ttFont in ttFonts:

So, when the root input to the runner is the collection-wide set of ttFonts, then this not intended; the vmetrics for loop's ttFonts list should be scoped to the superfamily, since its the superfamily that we want to have the same v metrics - such that swapping between families (eg Alegreya, Alegreya SC, Alegreya Sans, and Alegreya Sans SC), and between weights within any of those, doesn't change any line heights visually.

felipesanches commented 7 years ago

ttFonts in the vmetrics conditions is meant to be "all font files in a given family", so that the overall vertical metrics of the whole family is computed.

felipesanches commented 7 years ago

yeah, super-family

graphicore commented 7 years ago

The collection-wide/super-family discussion is #1609 please talk about caching here and about the other thing there ;-)

(updated: issue reference above)

graphicore commented 7 years ago

So, when the root input to the runner is the collection-wide set of ttFonts, then this not intended;

Right, but "collection-wide set of ttFonts" was not intended in the first place, because fontbakery was about testing a family at a time :-) this so #1609 but I can't let this slip uncommented. We are re-defining the scope of fontbakery, I hope you guys understand this.

davelab6 commented 7 years ago

We are re-defining the scope of fontbakery, I hope you guys understand this.

This scope was defined a long time ago:

https://github.com/googlefonts/fontbakery/blob/master/BRIEF.md#31-check-the-entire-collection

graphicore commented 7 years ago

OK, but it was never implemented. I'm talking about the implementation and actual changes to it.

davelab6 commented 7 years ago

Mmm. Is everything from https://github.com/googlefonts/fontbakery/blob/master/BRIEF.md#2-onboarding-new-and-updated-families now working?

If not, we should postpone working on this, and Felipe should buy some RAM to stop it swapping ;)

graphicore commented 7 years ago

2.5 seems missing, the rest is, if 2.3 is ignored (which we chose to I think)

felipesanches commented 7 years ago

We do have a few fontbakery checks actually performing regression testing (2.5), but not by comparing TTXs. Instead the test fetches production font files from GFonts and compare them to the local ones.

davelab6 commented 7 years ago

Should the caching of that stuff be improved?

(2.3 hotfixing tools have been moved out of the checker scripts to their own standalone scripts, and 2.5 comparing to avoid regressions, is done by marc's tool, and/or using the commands listed in the document.)

madig commented 6 years ago

I suppose to clear the cache between families you'd need to 1. replace the iterargs etc. of a spec after each family run 2. reset the cache. Maybe doable from the CheckRunner? Family splitting can be done with something like https://github.com/googlefonts/fontbakery/issues/1904#issuecomment-395419260 adapted to OT. Or maybe you need to rip iterargs etc. from Spec and generate iterargs etc. in CheckRunner and pass them into any function that needs it.

graphicore commented 6 years ago

Maybe we should make CheckRunner into a Family-CheckRunner (just a name change really, or even better don't change the name, just use the concept!), and add a CollectionCheckRunner, that handles creation and destruction of CheckRunners per (sub-)family, aggregation of results and also management of super-family/collection spanning conditions/values. That way, we get cache-invalidation and change of iterargs (when discarding a finished CheckRunner and instantiating a new one) and don't need to fiddle much with the working stuff.

replace the iterargs etc. of a spec after each family run

AND

Or maybe you need to rip iterargs etc. from Spec and generate iterargs etc. in CheckRunner and pass them into any function that needs it.

A Spec is not changed during a CheckRunner execution, only CheckRunner changes its state (cache and it's reporters) and is instantiated with specific values (prepared/described as ExpectedValues in the Spec) for the run .

generate iterargs etc. in CheckRunner and pass them into any function that needs it.

Isn't it done in CheckRunner? All the information stored in a Spec ever for an iterag really is it's mapping, like 'font' => 'fonts'. There are some methods though, that use information provided by CheckRunner for iterargs related stuff (generating the execution order), but that's not stored within the Spec.

madig commented 2 years ago

Note: https://github.com/bloomberg/memray may help with identifying memory hogs. Sometimes, it's things we don't suspect.

fonttools / fontbakery