Surface area check should 'remove overlap' before checking

davelab6 commented 6 years ago

Currently diffenator has a simple heuristic for surface area checks, that has this result:

**glyphs 1 modified**
glyph               diff
Ecircumflex         131787.083333

This should not result in a difference, by "simply" running a (time intensive) remove overlap operation over glyphs before comparing them.

However, since the addition/removal of overlaps can be an oversight, both the current and future comparisons should be run, and if their results are different, then the 'diff' result should be annotated like that. Eg

**glyphs 1 modified**
glyph               diff            diff.removeOverlap
Ecircumflex         131787.083333   0

rsheeter commented 6 years ago

Why not render the damn thing and compare the rendering? - surface area can be identical but modified in different ways (pull two different points, translate the whole block in different directions, etc) to yield the wrong result.

anthrotype commented 6 years ago

Why not render the damn thing and compare the rendering?

that's what notodiff does https://github.com/googlei18n/nototools/blob/1498e738409a8fde3a157d00ad320ffb3500d498/nototools/shape_diff.py#L106

davelab6 commented 6 years ago

Why not render the damn thing and compare the rendering?

Because we want to know what the differences are.

A render diff is a 3rd kind in addition to an actual contour comparison (number of contours, area of contours) and a removeOverlap'd contour comparison.

that's what notodiff does

@m4rc1e Is notodiff is used in your overall 'dispatch' process?

m4rc1e commented 6 years ago

@rsheeter @anthrotype It's pretty slow and it relies on Harfbuzz.

A requirement was to make this as dependency free as possible. I do see your point and am happy to have such an approach enabled by default.

chrissimpkins commented 6 years ago

I spoke with Behdad a few months ago about using the StatisticsPen in fontTools to compare glyph center of mass + area to achieve a binary "does the shape differ?" across builds answer (and not where the shape differs). The glyph center of mass should be pretty sensitive to changes in contours/position but it sounded like there were limitations when overlaps are present in the current implementation of the pen.

Here is Behdad's comment in the module:

if the glyph shape is self-intersecting, the values are not correct (but well-defined). As such, area will be negative if contour directions are clockwise. Moreover, variance might be negative if the shapes are self-intersecting in certain ways.

Another possible approach to this problem, but maybe not one that gets around the removal of overlaps issue? Perhaps the value of that calculation is not important for diff purposes if it changes reliably in the presence of overlaps when the contour/position changes? That is math beyond my comprehension :)

That pen supports area calculations with similar overlap constraints for the value reported, though again it may be the delta rather than calculated value that is important for diff purposes.

davelab6 commented 6 years ago

Why not just run a removeOverlap() before running the StatisticPen over the glyph?

chrissimpkins commented 6 years ago

Why not just run a removeOverlap() before running the StatisticPen over the glyph?

My understanding is that this would address the problem. I was trying to address your concerns about overlap removal being a time intensive process across all glyphs in a set. Perhaps @behdad could weigh in? Unless you need the true calculated values without the overlaps (which you may not for a diff), you may be able to skip that step and simply calculate a delta of area & COM across build 1 v/ build 2. If there is a difference, then there is a change in shape (probably better stated as 'design' because these could be changes hidden under overlaps that are not visible in renders) or position. If there is not, then there is not. This gets at the question: did something change at the source through build process to final binary file?

When change is detected, this then warrants the next step of a rendered glyph diff for differences that reach the stage of visually perceptible change. At that stage, you begin to make platform/renderer and optics assumptions which makes for a less than perfect test but probably the best that we can do. Sometimes a human is necessary :)

davelab6 commented 6 years ago

I'm not sure diff'ing needs to care about run time. Its worth the wait?

chrissimpkins commented 6 years ago

For one off checks absolutely. If you are batching the entire GF collection in the setting of modifications in build tooling or for some other broad scale need then it probably matters a great deal.

chrissimpkins commented 6 years ago

But I am getting way ahead of my understanding here. I don't want to mislead you about the COM measure. Would be appropriate for Behdad or someone else versed in this to weigh in. Just an idea.

m4rc1e commented 6 years ago

I care about speed. This tool/lib will be used in the web app http://www.gf-regression.com. However, I’m open to having an option to compare rendered diffs.

chrissimpkins commented 6 years ago

However, I’m open to having an option to compare rendered diffs

@m4rc1e Another idea. If you are using an approach that renders to images, you might have a look at some of the image similarity software out there to achieve this. They are designed for information loss testing and a change in the rendered glyph should register as information change in these calculations (maybe? the RBGA considerations vs. greyscale data in glyph renders may require a different approach c/w those that I am aware of).

The SSIM measure is one that I know about and there is free, open source software available to compare pre/post PNG images:

https://pngquant.org/dssim.html (also see https://kornel.ski/dssim)

Compiled and fast executable + available as a library. Project is maintained by the pngquant author Kornel Lesinski. It would be very interesting to experiment and see if there are ways to define visually relevant changes at reading sizes in bodies of rendered text rather than in isolated glyph renders. Maybe way off the path for what your plans / needs are here, but an interesting, relevant question and possibly amenable to testing.

m4rc1e commented 6 years ago

@chrissimpkins thank you for reply. We already have a way to do this.

We control the web app using diffbrowsers. This tool allows us to take screenshots using browserstack.

Results can be seen in prs sent to google/fonts

https://github.com/google/fonts/pull/1475 https://github.com/google/fonts/pull/1452

chrissimpkins commented 6 years ago

Wow, that looks great Marc. Thanks for sharing it!

googlefonts / fontdiffenator

Surface area check should 'remove overlap' before checking #4