arkenfox / TZP

TorZillaPrint: Firefox & Tor Browser fingerprint testing
https://arkenfox.github.io/TZP
MIT License

followup: font perf #34

Closed Thorin-Oakenpants closed 1 year ago

Thorin-Oakenpants commented 3 years ago

Ideas/ToDo


original post

the creepy fonts test eats a lot of time: it depends on the size of the font list (set by detected OS). Font list sizes: windows 468, linux 449, mac 755, droid 140

for example: if I load TZP and only run outputFonts() as windows (and I am windows, so the fonts exist), here are the logPerf results for that section: font list = 486

unicode glyphs [fonts]:  122 ms
              creepJS [fonts]:  427 ms
                 woff [fonts]:  687 ms

a few more restarts + tests: creepJS fonts results are 422, 418, 418; some font section re-runs: 341, 343, 354, 342


now halve the font list (slice the array to 243 items)


diffs

So basically, outside of a little perf spent setting up elements etc., the perf is directly related to the size of the font list

Perf is also related to whether or not the font exists (it takes time to load a font)

on the subject of font loading: if the font exists it takes some time to load, which affects perf. If I block document fonts, the perf is instantly 50% faster
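
(For reference, the "block document fonts" switch here is presumably the standard Firefox setting, i.e. in user.js terms something like the line below, which makes pages fall back to your default fonts rather than the ones they ask for:)

    // assumption: "blocking document fonts" = the standard Firefox pref; 0 = never use document fonts
    user_pref("browser.display.use_document_fonts", 0);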


anyway, this is a ticket for me to look at removing fonts from lists that we always expect the OS to have

Thorin-Oakenpants commented 3 years ago

goddamn it .. 243 is not half of 468 .. it's 234, and no, I'm not dyslexic .. this is just what happens when you're tired (and overworked)

Thorin-Oakenpants commented 3 years ago

Hah ...

fonts

creepy font test uses 485 ... TZP uses 486: I suspect you're not including the old emoji font EmojiOne Mozilla: All my code is tested for FF60+ .. and wouldn't you know it, TwemojiMozilla.ttf replaced EmojiOneMozilla.ttf in FF61+ ... sucks they didn't sneak it in in time for ESR (or backport it to ESR dot releases)

Probably a moot point, since I might prune the font list. But if I do, I'll probably add a checkbox so reruns can use the full OS list

@abrahamjuliot : hope you are ok with me linking to the creepy stand-alone font test

Thorin-Oakenpants commented 3 years ago

Got a sweet local version here that only tests the base fonts if isRFP && !isTB && isVer > 79. I can do this for windows and mac (android is not covered, and linux is gated to a few distros).
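
In rough pseudo-TZP terms the gate is just this (a sketch; the per-OS list names are hypothetical, isRFP/isTB/isVer as above):

    // sketch: use the pruned base list only when RFP is on, it's not Tor Browser, and FF80+
    const usingBase = isRFP && !isTB && isVer > 79
    const fntList = usingBase ? fntBaseByOS[osName] : fntFullByOS[osName]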

So far I've done the windows one: it reduces the font list from 485 (I had a typo'd entry in there by mistake) to 212, so less than half, and saves lots .. in my case about 250ms

note: "base fonts" is the base list allowed by font vis (93 fonts in windows), plus system fonts (Times, MS Serif etc), plus FF bundled emoji fonts, plus any font family styles, plus the non-western char ones (which get ignored after 89 or 90) = hence 212

if you change the RFP pref and rerun the section (or do a global rerun), or run the font fallback test, it picks up the change in RFP. I had to fix the isRFP test (for now) to ignore most of the extra checks: i.e. changing RFP does not update the css4 pseudo elements: but I have a plan to work around this

fonts listed in the section header are the full font lists, and if the base font list is used, it notes that in the result (clickable). The 485 count indicates it's the windows list for your OS. Below I loaded the test with RFP off, then I flipped RFP on and ran the font fallback test

basefonts


Mac base list is 183, full list is 755: will be interesting to see what I can prune that down to

I might expand this for isTB base lists for windows and mac

Thorin-Oakenpants commented 3 years ago

note: using base fonts would not catch out those who bypass the mechanism by bundling extra fonts: however, that would have been the case with non-OS fonts anyway

Thorin-Oakenpants commented 3 years ago

https://arkenfox.github.io/TZP/tests/fontlists.html

currently the base + RFP is limited to windows, and the styles test only works for windows: Mac/Linux will be a false positive [0/0] since the style lists are empty until I build them - I really wish I had access to a Mac for testing

windows 7 win7-base

win7-styles

abrahamjuliot commented 3 years ago

Here's Mac...


Mac FF 89 (RFP + Base)

all: 06ca3165804ca63857d20d683ea79e1dd4a2ed91 [189/755] base: 63ccc987c5d5b94c53c39173ac4da7c6a081c096 [162/175] ✖

FOUND FONTS NOT IN BASE: [27]

Avenir Black, Avenir Black Oblique, Avenir Book, Avenir Heavy, Avenir Light, Avenir Medium, Avenir Next Demi Bold, Avenir Next Heavy, Avenir Next Medium, Avenir Next Ultra Light, Charter Black, Hiragino Maru Gothic ProN W4, Hiragino Mincho ProN W3, Hiragino Mincho ProN W6, Hiragino Sans GB W3, Hiragino Sans GB W6, Hiragino Sans W0, Hiragino Sans W1, Hiragino Sans W2, Hiragino Sans W3, Hiragino Sans W4, Hiragino Sans W5, Hiragino Sans W6, Hiragino Sans W7, Hiragino Sans W8, Hiragino Sans W9, SignPainter-HouseScript

BASE FONTS NOT FOUND: [13]

Apple Braille, Arial Hebrew, Arial Hebrew Scholar, EmojiOne Mozilla, Farah, Muna, Nadeem Regular, New Peninim MT, Raanana, Symbol, Twemoji Mozilla, Webdings, Wingdings 2


Mac FF 89 RFP disabled (Styles)

all: fe4c2b69cf90cf5b076feac6647e404bf7e34850 [219/755] styles: 85dac4f7a0cd3fed353356d637c7ddbab9733377 [5/297] ✖

FOUND FONTS IN STYLE: [5]

Avenir Black Oblique, Hiragino Kaku Gothic ProN W3, Hiragino Kaku Gothic ProN W6, Hiragino Mincho ProN W3, Hiragino Mincho ProN W6

abrahamjuliot commented 3 years ago

On a side note, I have a new promise-based function that detects a distinct set of fonts not detected by element measurements and it's blazing fast. Thus far, I'm testing on Windows, Ubuntu, Fedora, Mac, Chrome OS, Android, and iPhone.

For example, on Android, this detects Noto fonts and a set of Google fonts that vary:

Carrois Gothic SC
Coming Soon
Cutive Mono (Linux/Chrome OS Android 9)
Dancing Script
Droid Sans Mono
Noto Color Emoji (Linux/Chrome OS Android 9)
Noto Serif
Roboto
Roboto Condensed

In TB, this seems to shine the most in Linux and Android builds. Mac and Windows yield nothing in TB.

// font family names to probe (elided)
const fontList = [...]

// map font list to a list of FontFace objects (local() sources only, so nothing is downloaded)
const fontFaceList = fontList.map(font => new FontFace(font, `local("${font}")`))

// load each font, settle each promise to 'rejected' or 'fulfilled', then reduce list to 'fulfilled' promises
Promise.allSettled(fontFaceList.map(font => font.load())).then(res => {
    const supportedFonts = res.reduce((acc, font) => {
        // supported fonts have a 'fulfilled' status
        if (font.status == 'fulfilled') {
            return [...acc, font.value.family]
        }
        return acc
    }, [])
    return console.log(supportedFonts)
}).catch(error => console.error(error))

based on FontFace(), FontFace.load() and Promise.allSettled()

Thorin-Oakenpants commented 3 years ago

Thanks for the Mac info - I'll decipher it and see what that means in terms of what constitutes a "font family"

Edit: Unfortunately I can't run a Mac VM with an AMD mobo. Otherwise I would, and I would install all optional language packs so I could then use full results to weed out non-detectable "faces" and then RFP to create base. This looks similar to windows - I can see some general rules of thumb, but with exceptions

Thorin-Oakenpants commented 3 years ago

I made the font lists more nuanced: fntAlways, fntBundled, fntTB - in both the test and TZP

Might be something weird going on in Win10, but I need to test more. e.g. in my VM, it picks up Franklin Gothic Medium and HoloLens MDL2 Assets (both of which are in the base list) when RFP is off, but not when it is on (but they're in the base list!!) - however, that was with FF81 (that's how long it's been since I did anything on my Windows10 VM), and maybe that's been fixed - I seriously need a break

can't wait to get mac sorted out


On a side note,

Yeah, I saw all those commits and font face fuckery .. wondered what you were doing. I plan to add some more manual font tests to TZP - such as domrect/textmetrics measuring methods. If font face can be used in FF as another method, cool.

Your test though, what is it meant to achieve? I can already tell the OS (although I do plan to use a tiny font test to determine say win7 vs win10, not sure about win8.1, in order to create my own UA with zero reliance on navigator: oscpu is about the only one I can't fully determine lies for). TZP font tests are FF only. So I'm not sure how your test fits in here? Can you elaborate? .. have some 🥧

abrahamjuliot commented 3 years ago

I use the font face test to detect OS, but font measurements can achieve this too. It's non-essential. But, using font face, I only need 7 fonts in most cases. It works for webkit, blink, and gecko, but not TB on windows/mac. So, I'm measuring a max of 30 fonts (which works for TB) and then I load the 7 fonts in font face for Android and Linux. I'm still testing different Linux builds, but so far the 37 fonts together are 20-70 ms.
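
In other words, roughly this flow (a sketch; the helper and list names here are illustrative, not creepjs code):

    // sketch: a small measured list that works everywhere (including TB),
    // plus the async FontFace probe only on platforms where it yields results
    const measured = detectViaMeasurement(coreFonts)            // up to ~30 fonts
    const probe = (isAndroid || isLinux)
        ? detectViaFontFace(platformHintFonts)                  // ~7 fonts, promise-based
        : Promise.resolve([])
    probe.then(faceFonts => console.log([...new Set([...measured, ...faceFonts])]))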

Thorin-Oakenpants commented 3 years ago

Ahh OK. TZP doesn't do font tests for non-FF. But at some stage whacking in one that covers a small range for "some entropy" would be nice. And when Brave gets theirs done, I want to add something for that - i.e. render it as a stable value rather than randomized: could be as simple as ignoring a randomlist, who knows - wait and see what Peter comes up with

So is this to fully replace your current font detection list?

I just posted some shit at https://github.com/arkenfox/user.js/issues/1211 with some rough timing stats .. ps, got any friends with macs? also, rerun yours when you get time :)

abrahamjuliot commented 3 years ago

...is this to fully replace your current font detection list?

Yes, on the main page. The font test page will focus on large lists.

...friends with macs

Negative 😭. I'll send you my stats.

Thorin-Oakenpants commented 3 years ago

Yes, on the main page

Hmmm. Can extensions fubar it? The seven measuring methods in TZP mean we can pick up BS. Domrect measuring increases the methods. Textmetrics needs some work I think (different counts) - but on TZP I decided seven was enough (for now)

perf wise, I don't think much is gained per method drop, as the bulk of the time is taken by checking each font

That said, if font face is untamperable and faster for this metric then 👍

stats

You can post new test results at https://github.com/arkenfox/user.js/issues/1211 - I added a kBaseFonts integrity check - for example on all windows versions, Franklin Gothic Medium is a base font, but with font vis or RFP on, it doesn't get picked up. Would love to see if Mac shows anything

abrahamjuliot commented 3 years ago

Can extensions fubar it?

Yes, but it's a delicate task and requires extra work (similar to new Date). I can see it getting blocked or deleted over an attempt to rewrite the API.
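
(Roughly the kind of rewrite an extension would need, and why the wrapper itself becomes detectable - a sketch, not any particular extension's code:)

    // sketch: faking font face results means wrapping the native constructor (and load()),
    // and the wrapper's toString/prototype/timing gives it away unless carefully masked
    const NativeFontFace = window.FontFace
    window.FontFace = function (family, source, descriptors) {
        // ...decide here which local("...") sources should appear to load or fail...
        return new NativeFontFace(family, source, descriptors)
    }
    window.FontFace.prototype = NativeFontFace.prototype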

Thorin-Oakenpants commented 3 years ago

Sounds good :)

abrahamjuliot commented 2 years ago

fyi - version 1.0 font in Windows 11

Segoe Fluent Icons

https://docs.microsoft.com/en-us/typography/fonts/windows_11_font_list

Thorin-Oakenpants commented 2 years ago

Yes. I also need to update Mac fonts

Thorin-Oakenpants commented 2 years ago

@abrahamjuliot

So I think I've hit on a way to speed up the current font test

see this

            const baseFonts = ['monospace','sans-serif','serif']

when I run it with only one item in the array, here's what I got

    // each baseFont takes about 55/65ms, and combined about 150/160
    // so excluding some overhead, each baseFont is 1/3rd

    // 061036d2becc235167e3038392bed69bbbe2bb08 150fnts all-3
    // 66e9690baae5fbef2478fae8fb62e0935160c0b9 148fnts monospace
        // misses: Consolas, Twemoji Mozilla
    // 66e33b0be5d2de69cc14f99e6bc237b38e6a99f3 147fnts sans-serif
        // misses: Arial, Helvetica, Small Fonts
    // 57fb601120016eaba69079b3c5067b5daff996b5 145fnts serif
        // misses: MS Serif, Mongolian Baiti, Roman, Times, Times New Roman

That's out of 251 windows fonts

But since we're adding to a Set, we only need to detect each font once

^^ edited my math, my bad

so starting here https://github.com/arkenfox/TZP/blob/cd4d401790c3c2f5e638bc80544c793a4992b911/js/fonts.js#L364

family is a combined font + baseFont, e.g. Arial monospace, and we need to loop each baseFont

and this line https://github.com/arkenfox/TZP/blob/cd4d401790c3c2f5e638bc80544c793a4992b911/js/fonts.js#L368

is required if any of the eight detectedVia methods hasn't yet been found

So this only works if we loop each baseFont per font and all eight need to be found (which works if no-one is screwing with fonts). I've looked at how I would code it, but I think you could come up with something more elegant, plus it's your original code


ps: I was looking at collecting the sizes with each font, per baseFont, but almost no font has multiple sizes (two on my windows 7). This way I could actually collect the data like the example below (because we only get the first baseFont, this is for info purposes). The dimensions add entropy to the metric (probably equivalency), and I like this idea. And IDK, but if we use clientRect we might get more precision and maybe subpixel entropy as well (dimensions would change with zoom etc), and it's not hard to return the font names as a separate hash

detectedViaPixelSize.add(font +" : "+ basefont +" : "+ dimensions.width +" x "+ dimensions.height)

Anyway, I am dead keen to get this done. I don't think we need a test PoC with two run buttons, old vs new - but if you want one, we can do that

Thorin-Oakenpants commented 2 years ago

actually, we don't need to check if all 8 methods detected a font, we just need to check that at least one did (the others should, if supported), so if they don't then they're being blocked or whatever (rechecking them with subsequent baseFonts won't do anything)

Thorin-Oakenpants commented 2 years ago

@abrahamjuliot ^ done - https://github.com/arkenfox/TZP/commit/b95b12737e8a70e6342806268e831853066ed2d5 - how does that look to you?

Thorin-Oakenpants commented 2 years ago

I toggled RFP off in tor browser (it still uses a reduced list cuz .. smart code) but now you can get actual perf: nice! (glyphs is another area for perf attention, just not in this ticket)

tbwin

Thorin-Oakenpants commented 2 years ago

https://abrahamjuliot.github.io/creepjs/tests/fonts.html

I know this is a test PoC and not the main creepy test, but a) perf question (see below) and b) what fonts are you testing, can you add that to the console

So ... I do all 4 pixel and all 3 length tests on TZP, with no font protections, getting 150 detected on TZP same as the creepy PoC page - it used to take about 150ms for all seven tests, now down to about 100ms

creepfont

abrahamjuliot commented 2 years ago

actual perf diffs will depend on if you have the font or not

We could determine what OS fonts to check based on the user's reported platform. If the user spoofs the platform, then we give them that OS font list (the wrong list), measure the performance and see how many fonts match up.

We could even make special routes knowing full well the platform is a lie and then add additional font measurements or more OS font lists for that user based on the severity of the lie.

Might be a bad idea for devs. It would make testing some extensions a slower experience (maybe not that bad).

Thorin-Oakenpants commented 2 years ago

actual perf diffs will depend on if you have the font or not

I'm actually going to do a test for this. I have 86 fonts that I can block in FF via the font vis pref (the same one that RFP uses). So I will test those 86 fonts only, with and without them allowed in web content. And I will also test them both with three baseFonts checked vs 1 baseFont checked (when allowed). Will be interesting to see what the cost/benefit of not loading a font vs having to do three checks is

I already set a font list based on OS, and I edited the original post with more steps I can do, so it's listed. The fewer fonts tested, the faster. One thing I've noticed so far, at least on windows, is that the third basefont doesn't seem to be needed - so that would save a lot - e.g. if 187+ fonts are not found (font vis pref with no RFP using the default win 251 list - 64 found) .. that's a large chunk

obviously the third (and second) check depends on how many fonts aren't detected, so with something like the TB whitelist we're down to like 10-14 fonts missing, so fuck all to gain

edit: if we knew ALL fonts that had to use a non-monospace style, we could actually get away with one test per font, even if it's not detected

abrahamjuliot commented 2 years ago

b95b127

This looks very nice. Just a few comments.

Should there be a return after each case the font is detected and then skip the remaining dimension checks

isDetected = true
return

Small suggestion here. We could return early to reduce nesting. Either way works.

if (isDetected) {
  return
}
const family = "'"+ font +"', "+ basefont
span.style.setProperty('--font', family)

https://github.com/arkenfox/TZP/blob/b95b12737e8a70e6342806268e831853066ed2d5/js/fonts.js#L361-L365

Here's FF100 on Chrome OS Android 9.

image

Thorin-Oakenpants commented 2 years ago

Should there be a return after each case the font is detected and then skip the remaining dimension checks

is that a question? Do you mean like this


        if (dimensions.sizeWidth != base[basefont].sizeWidth ||
            dimensions.sizeHeight != base[basefont].sizeHeight) {
            // record in SIZE set
            isDetected = true
            return
        }
        if (dimensions.scrollWidth != base[basefont].scrollWidth ||
            dimensions.scrollHeight != base[basefont].scrollHeight) {
            // record in SCROLL set
            isDetected = true
            return
        }

        // stats
        baseFontTests[basefont]++
        if (isDetected) {
            baseFontDetected[basefont]++
            if (basefont !== basefontFirst) {
                oTempBaseFonts[basefont].push(font)
            }
        }
        return

Am I missing something? If we return on the first dimension check, then the others will never populate, and I'll never record the stats, etc

edit: and I wouldn't pick up on differences between each detectedVia method if it's being affected by an extension (blocked etc) - e.g. with cydec my code returns two methods as null (from memory). And it also wouldn't let me pick up on code changes in Moz that might cause a diff, or pick up my faulty code? Am I missing something here?

Thorin-Oakenpants commented 2 years ago

Here's FF100 on Chrome OS Android 9.

interesting... on that 3rd baseFont - I haven't had a chance to test FF on android yet. I think any decision on the 3rd baseFont being skipped is going to have to be heavily tested (and then even if I miss a font or two due to lack of testing, at least the PoC is the same for all - might even add entropy/equivalency)

PS: if you click the little mini hash, it will debug log those 5 fonts to console (1 for sans-serif, 4 for serif)

abrahamjuliot commented 2 years ago

are you running and timing Pixels and Lengths as separate functions

Yes, for comparison, I'm running each separately with the fpjs base + extended list (485 total). FontFaceSet and FontFace are async, so I load the full collection with them here, which includes TZP platform fonts and Google fonts (2927 total). I will add these to the console.

abrahamjuliot commented 2 years ago

if we return on the first dimension check, then the others will never populate

Ah, I see. It's good then.

Thorin-Oakenpants commented 2 years ago

Ah, I see. It's good then.

just edited my comment with more reasons. thought I had missed something simple in my logic .. I need MOAR coffee

Thorin-Oakenpants commented 2 years ago

Yes, for comparison, I'm running each separate

right, so you can time them as two separate test types. I don't see the benefit of that. I mean you could mix or match any of those seven, it doesn't mean anything really, does it?

You could run it as a single function and then split the results, just return a common perf, i.e. on the Pixel console line say perf xxx [with Length] and on the Length console line say perf xxx [with Pixel]

and add in my perf win for some MASSIVE savings

Thorin-Oakenpants commented 2 years ago

I unified the tor browser detection and multiple debugging fields into one, and added alerts to it. And it just keeps updating (it gets reset when you do a global rerun), so e.g. you can zoom, rerun the screen section, and the dpi calculations are appended etc. Alerts are still kept in a global array (snapshot in time) and a red alerts link shows up at the top. And all alerts now also console.error - alerts are sanity checks for code or to pick up on unexpected results

I've been using that perf and debugging table at the bottom for outputting android shit when troubleshooting (I'm unable to debug via about:debugging#/setup and I can't be arsed sorting it out). Anyway, now I can add as many alerts, and on-screen debugs as I want .. to infinity and beyond

obligatory pic unfied

The point being that I debugged the fonts stats data there, so now you can see what those fonts are on android. I got 2 as well on my device, in the 3rd baseFont

abrahamjuliot commented 2 years ago

I wouldn't pick up on differences between each detectedVia method...

This makes sense. I was thinking to skip checking the further via methods since we got the goods, but diff checking provides more entropy.

Thorin-Oakenpants commented 2 years ago

random thoughts time

One of the perf items to consider in the OP is to reduce the list of fonts, e.g. fonts that are always expected (e.g. Arial on windows). Fewer fonts to load and test = less work. Of course that doesn't help with the additional styles being set + dimensions being measured in subsequent baseFonts (because they would have been detected on the first baseFont). Just saying this is one of the options

But now I am collecting the sizes, so I need to be careful: but I'm guessing that any tiny diffs caused by e.g. clearType, devicePixelRatio and other factors would probably already create max entropy in the fonts we do test. It's a little unknown and a classic perf vs payoff scenario. Obviously we don't want to cull fonts too far

Which brings me to the subsequent baseFont tests... if the total number of fonts found on baseFont[1] and higher is really tiny, maybe those fonts and/or the subsequent baseFont(s) can be dropped altogether

So I can see another 30-40% perf gains here: we could effectively trim a handful of fonts and eliminate all but one baseFont

Thorin-Oakenpants commented 2 years ago

@fxbrit can you do a mac test please. Don't give me washy stuff, just load it, run it, reload it, run it a dozen times until you get a stable result (am still thinking about your wonky mac fonts, which is how I came across this perf win - thinking too fucking hard)

edit: I just want to know what fonts you get in the debug section in the footer

Thorin-Oakenpants commented 2 years ago

I was thinking to skip checking the further via methods since we got the goods

but we've already gathered the dimensions for all methods, right? So I don't think you'll gain much by skipping a few if x == y's and Set.add's - IDK, maybe it saves 5ms. Not sneezing at it, papercuts are important

^^ edit: if that was substantial, it might be possible to check after the 2nd font that results are the same in each method, and if so set a flag that all are equal, and continue to collect just one (and later report them as all the same) - sounds dodgy TBH ;)

Thorin-Oakenpants commented 2 years ago

or .. we split fonts by baseType per platform, which would allow us to not discard fonts just because of their baseFont. This would be the ultimate setup IMO: once it is known which baseFont every font belongs to, then only untested fonts (like non-system ones or system fonts we haven't ever been able to test) would do extra checks (default) if undetected
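
e.g. something along these lines (a sketch; the per-platform map would have to be built from testing):

    // sketch only: a per-platform map of the baseFont each font is reliably detected against
    // (per the windows stats above, Arial is missed under sans-serif but caught under monospace);
    // unknown/untested fonts still fall back to trying every baseFont
    const fontToBase = { 'Arial': 'monospace', 'Consolas': 'sans-serif' /* , ... */ }
    fntList.forEach(font => {
        const basefontsToTry = fontToBase[font] ? [fontToBase[font]] : baseFonts
        // ...run the usual per-baseFont detection loop over basefontsToTry...
    })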

If you add per-baseFont metrics collection on creepy, we could analyze it

fxbrit commented 2 years ago
fonts.png
Thorin-Oakenpants commented 2 years ago

lols, all those font section reruns .. well done. Now that's looking good for even more evidence that fuck all fonts are detected outside monospace after the first one (but I want to compile data with monospace first in all cases)

Thorin-Oakenpants commented 2 years ago

hmmm

Hiding in the Crowd: an Analysis of the Effectiveness of Browser Fingerprinting at Large Scale Alejandro Gómez-Boix, Pierre Laperdrix, Benoit Baudry 2018

Before deploying our script in production, we identified a limitation in how JavaScript font probing operates. We found out that some fonts can have the exact same dimensions as the ones from the fallback font. Figure 1 illustrates this problem. In the example, the two tested fonts are metrically comparable and have the exact same width and height. However, they are not identical as it can be seen in the shapes of some of the letters (especially “e”, “a” and “w”). This means that font probing here will report incorrect results if one were to ask Times New Roman on a system with the Tinos font installed (or vice versa). To fix this problem, we measured the dimensions of a div against three font style variants. There are different typefaces that can be used by a web browser with the most popular ones being serif, sans-serif, monospace, cursive and fantasy. We chose the first three and we tested each font against the three of them, resulting in 66 ∗ 3 = 198 different tests. This way, we avoid reporting false negatives as the three fallback fonts have different dimensions.

Phew ... we already reduce the chances of getting a false negative, because we check each style if !isDetected

Edit:

abrahamjuliot commented 2 years ago

What if we remove the use of Math.round? I think I added that to ignore tampering noise, but such noise can be detected and put to good use in the diff analysis.

// instead of pixelsToInt...
const pixelsToNumber = pixels => +pixels.replace('px','')
const originPixelsToNumber = pixels => 2*pixels.replace('px', '')

I wonder if adding something like transform: scale(1.0001) to the CSS will give us better precision and affect the results?

Thorin-Oakenpants commented 2 years ago

IDK about the math bit, but I'd already thought about (and had a quick attempt two days ago at) using a transform to force decimals in fonts

I'm using this and going to expand it into other element tests https://github.com/arkenfox/TZP/blob/2e19aef91c2d2f7385cea9b1213ecd74c8462186/css/index.css#L309-L311

Thorin-Oakenpants commented 2 years ago

PS: I'm also grabbing the base data, since it's already run, but expanding it

            const baseFontsFull = [
                'none','monospace','sans-serif','serif','cursive','fantasy','fangsong',
                'system-ui','ui-monospace','ui-rounded','ui-serif','math','emoji'
            ]

            // base: all your base are belong to us
                // should we trap type mismatches for each baseFont

            const base = baseFontsFull.reduce((acc, font) => { // <--- changed to baseFontsFull
                span.style.setProperty('--font', font)
                const dimensions = getDimensions(span, style)
                detectLies.compute(dimensions)
                acc[font] = dimensions
                return acc
            }, {})

obligatory pic yup_thats-right_you-heard-me

Thorin-Oakenpants commented 2 years ago

What if we remove the use of ...

I did a heap of testing on perf (will post the rest of it and finish it another day), see below. So looking at what I did, I should be able to do a single method test, remove the Math.round and other stuff, and if it still hits around the 90s for me then that's not going to make much difference


ORIGINAL TESTING NOTES (but I haven't finished)

OK, I think I know exactly where almost all the time is being spent

method

e.g. when just testing detectedViaTransform

```js
const dimensions = {
    //width: pixelsToInt(style.width),
    //height: pixelsToInt(style.height),
    transformWidth: originPixelsToInt(transform[0]),
    transformHeight: originPixelsToInt(transform[1]),
    //perspectiveWidth: originPixelsToInt(perspective[0]),
    //perspectiveHeight: originPixelsToInt(perspective[1]),
    //sizeWidth: pixelsToInt(style.inlineSize),
    //sizeHeight: pixelsToInt(style.blockSize),
    //scrollWidth: span.scrollWidth,
    //scrollHeight: span.scrollHeight,
    //offsetWidth: span.offsetWidth,
    //offsetHeight: span.offsetHeight,
    //clientWidth: span.clientWidth,
    //clientHeight: span.clientHeight
}
return dimensions

// and

let t0font = performance.now()
fntList.forEach(font => {
    let isDetected = false // reset each font
    baseFonts.forEach(basefont => {
        if (isDetected) { return }
        const family = "'"+ font +"', "+ basefont
        span.style.setProperty('--font', family)
        const style = getComputedStyle(span)
        const dimensions = getDimensions(span, style)
        //detectLies.compute(dimensions)
        /*
        if (dimensions.width != base[basefont].width ||
            dimensions.height != base[basefont].height) {
            detectedViaPixel.add(font +":"+ dimensions.width +" x "+ dimensions.height)
            isDetected = true
        }
        if (dimensions.sizeWidth != base[basefont].sizeWidth ||
            dimensions.sizeHeight != base[basefont].sizeHeight) {
            detectedViaPixelSize.add(font +":"+ dimensions.sizeWidth +" x "+ dimensions.sizeHeight)
            isDetected = true
        }
        if (dimensions.scrollWidth != base[basefont].scrollWidth ||
            dimensions.scrollHeight != base[basefont].scrollHeight) {
            detectedViaScroll.add(font +":"+ dimensions.scrollWidth +" x "+ dimensions.scrollHeight)
            isDetected = true
        }
        if (dimensions.offsetWidth != base[basefont].offsetWidth ||
            dimensions.offsetHeight != base[basefont].offsetHeight) {
            detectedViaOffset.add(font +":"+ dimensions.offsetWidth +" x "+ dimensions.offsetHeight)
            isDetected = true
        }
        if (dimensions.clientWidth != base[basefont].clientWidth ||
            dimensions.clientHeight != base[basefont].clientHeight) {
            detectedViaClient.add(font +":"+ dimensions.clientWidth +" x "+ dimensions.clientHeight)
            isDetected = true
        }
        */
        if (dimensions.transformWidth != base[basefont].transformWidth ||
            dimensions.transformHeight != base[basefont].transformHeight) {
            detectedViaTransform.add(font +":"+ dimensions.transformWidth +" x "+ dimensions.transformHeight)
            isDetected = true
        }
        /*
        if (dimensions.perspectiveWidth != base[basefont].perspectiveWidth ||
            dimensions.perspectiveHeight != base[basefont].perspectiveHeight) {
            detectedViaPerspective.add(font +":"+ dimensions.perspectiveWidth +" x "+ dimensions.perspectiveHeight)
            isDetected = true
        }
        */
        /* skip stats
        // stats
        baseFontTests[basefont]++
        if (isDetected) {
            baseFontDetected[basefont]++
            if (basefont !== basefontFirst) {
                oTempBaseFonts[basefont].push(font)
            }
        }
        */
        return
    })
})
let t1font = performance.now()
log_debug("", t1font-t0font)
```

Then I would load TZP, and after the initial page load (we can ignore this time, as it is almost always slightly longer), I would then run the font section a dozen times (I used nightly, also with the console closed, which IMO anecdotally can cause perf issues: it has a lot going on), and here are the results

                all seven: 105 100 101 102 100 101  99  99  99  97  98  98
         detectedViaPixel:  97  94  93  94  92  94  91  91  94  92  92  92
     detectedViaPixelSize:  97  94  93  92  93  92  93  91  92  92  92  93
        detectedViaScroll:  92  92  91  90  90  90  90  89  89  90  89  91
        detectedViaOffset:  93  92  92  92  90  91  90  90  90  89  90  89
        detectedViaClient:  94  91  90  91  89  88  89  90  90  92  89  91
     detectedViaTransform:  95  94  93  92  93  92  89  89  90  91  92  91
   detectedViaPerspective:  92  92  92  91  91  91  92  90  90  90  89  93

I won't bother averaging, as it's pretty clear that additional methods here aren't really adding anything


Next, with all seven methods used, I did these

... I did another five things so far to narrow down where all the time is spent


abrahamjuliot commented 2 years ago

https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/40919

This reminds me - have you noticed that NoScript has been affecting dom rect results for a few months now? It's somewhat jittery and then stable after a few reloads. I'm not sure, but it seems to be a result of appending elements, but only the dom rect is affected.

Thorin-Oakenpants commented 2 years ago

MORE TESTING NOTES (but I haven't finished)

I repeated the test for just detectedViaOffset and made it the same as before with the commented-out stuff, but this time I also commented out these, since they aren't being used

in other words the only diff in these tests is those two lines

    //const pixelsToInt = pixels => Math.round(+pixels.replace('px',''))
    //const originPixelsToInt = pixels => Math.round(2*pixels.replace('px', ''))

here's the old test vs the new test, which will give an idea of the cost

     [ToInt] detectedViaOffset:  93  92  92  92  90  91  90  90  90  89  90  89
  [ToNumber] detectedViaOffset:  93  91  91  90  91  92  89  90  89  90  90  88
                          diff:   -   1   1   2   1  -1   1   -   1  -1   -   1 : total 6

That's 6ms faster over 12 runs = .5ms per test on average. And the replacement lines will still eat time. I don't think it's worth the change for perf's sake

What if we remove the use of Math.round? I think I added that to ignore tampering noise, but such noise can be detected and put to good use in the diff analysis.

So the question is do we want to do it for more precision(?) ... what do you mean tampering noise? I want as much stable entropy as I can get

Thorin-Oakenpants commented 2 years ago

what do you mean tampering noise

Ahh, so I tried it, and I see what you mean: I didn't check the differences in measurements. So it splits the results (for me) into three - those that use pixelsToNumber, those that use originPixelsToNumber and those that don't use either

Since I'm expecting them to all be the same, and I need 4/7 to be the same to determine a result, I end up flummoxed and return a red lie. This is my lie detector - comparing sets. I think yours would have the same issue

noise


But ... I could still use this. Add more methods (the perf cost is very cheap it seems) and Sets, use the ToInt ones for lies, but display and record the ToNumber ones
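
i.e. something along these lines (a sketch; the set names are placeholders):

    // sketch: round for the cross-method lie check (which expects methods to agree),
    // but record the raw value separately for extra precision/entropy
    const raw = +style.width.replace('px', '')
    const rounded = Math.round(raw)
    lieCheckSet.add(font + ':' + rounded)   // used to compare the detection methods
    entropySet.add(font + ':' + raw)        // displayed/recorded, more precision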

And get transform:scale going

I'm going to open a new issue on these things and keep this one for perf

Thorin-Oakenpants commented 2 years ago

MORE TESTING NOTES (but I haven't finished)

which is faster: detected or missing?

  86 detected fonts: 21  20  19  19  20  20  19  19  19  19  18  20  19
   86 missing fonts: 15  17  15  15  15  14  15  15  14  14  14  14  15  

           detected: 86/86 | 0/0 | 0/0 | total: 86/86
            missing:  0/86 | 0/0 | 0/0 | total: 0/86

So my non-expert conclusion is that the cost of loading a supported font vs not loading it is quite small. Say at best 1ms per 10 fonts. Now that comes with some caveats: since we never set any detected fonts, IDK if any font fallback needed to occur, whereas if the previous style was a detected font, then I suspect nothing was gained - my oh-so-IANAE-about-fonts gut instinct tells me the slight perf seen above is because of that.

And the other caveat is that of course a missing font triggers additional baseFont tests, wiping out any possible gain

Edit: I could add 40 of each into a set of 80 and test in two orders: detected-then-missing vs alternate, but I think I'm so over this as far as info goes on which is faster :)

...

Thorin-Oakenpants commented 2 years ago

MORE TESTING NOTES (almost finished)

Next, with all seven methods used, I did these: note I left the stats enabled (maybe that adds 1ms perf)

// ignore: this is just us resetting the style between each font? it made no diff to perf or results
    span.style.setProperty('--font', family)
    //const style = getComputedStyle(span)
    const dimensions = getDimensions(span, style)

// don't get dimensions
    span.style.setProperty('--font', family)
    const style = getComputedStyle(span)
    const dimensions = {} // <-- NEVER COMPUTE ANY DIMENSIONS

// don't setProperty
    //span.style.setProperty('--font', family) // <-- NYAH NYAH
    const style = getComputedStyle(span)
    const dimensions = getDimensions(span, style)

results (all seven = original baseline)

                all seven: 105 100 101 102 100 101  99  99  99  97  98  98 : 
          dimensions = {}:   4   2   3   3   2   3   2   2   2   3   2   3 : 251/251 | 0/0 | 0/0
      no styleSetProperty:   6   6   5   4   5   5   5   4   4   4   3   4 : 251/251 | 0/0 | 0/0

So that tells me in my IANAE mode, that the time is all spent in a combo of font-family changing and then the measuring having to wait for the change to happen: if there is no pending change, the dimensions are cached so super fast. And if we don't measure, then setting the font-family isn't held up and can go full tit

Which leads me to an idea. We may never have to change any baseFonts (just reset them), and only set each font once. This would in theory limit setProperty to a maximum of the fntList.length. We still measure and the measurement may change, so IDK if this adds any perf improvements. I will explain a little later

Thorin-Oakenpants commented 2 years ago

OK, so here are our testing elements (it happens to be "MS P明朝 monospace" in the pic because that is the last font tested in my windows list, and it was found in monospace)

div

Here is my idea: instead of a single span (font-fingerprint-detector), we have one for each baseFont (for now just hardcode the three). And we set the font on the parent, and only reset the baseFont spans between fonts

loop fontList

If I am correct, this would effectively wipe out baseFont[1+] perf costs, but still allow us to test them (and we can ignore trying to pigeonhole fonts into per-baseFont buckets as a possible solution)

Or I could be totally wrong. I want to test it

@abrahamjuliot I'm not sure how to structure this, and getDimensions would need to know which span to target, and IDK about the inheritance and whether we use a span or divs

<div id="font-fingerprint"> // do we even need this div
      <style> blah blah</style> // as per pic : reset styles between fonts

  <div id="font-fingerprint-detector" style="--font: Arial"> // use a div
      <style> content: "mmmWWWWWllliiiiiiiiimmmWWwhatever"</style>

      <span id="font-fingerprint-monospace" style="monospace"> // reset baseFont between fonts
          ::after
      </span>

      <span id="font-fingerprint-sans-serif" style="sans-serif"> // reset baseFont between fonts
          ::after
      </span>

      <span id="font-fingerprint-serif" style="serif"> // reset baseFont between fonts
          ::after
      </span>

   </div>
</div>
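
For what it's worth, the loop over that structure might look roughly like this (a sketch of the idea only, untested; it assumes each span's CSS is font-family: var(--font), <its own baseFont>, and only one detectedVia method is shown):

    // sketch: one --font change per font (set on the parent), then measure the three
    // pre-built spans, each of which falls back to a different baseFont
    const detector = document.getElementById('font-fingerprint-detector')
    const spans = {
        'monospace':  document.getElementById('font-fingerprint-monospace'),
        'sans-serif': document.getElementById('font-fingerprint-sans-serif'),
        'serif':      document.getElementById('font-fingerprint-serif')
    }
    fntList.forEach(font => {
        detector.style.setProperty('--font', "'"+ font +"'") // setProperty calls capped at fntList.length
        for (const basefont of baseFonts) {
            const span = spans[basefont]
            const dimensions = getDimensions(span, getComputedStyle(span))
            if (dimensions.offsetWidth != base[basefont].offsetWidth ||
                dimensions.offsetHeight != base[basefont].offsetHeight) {
                detectedViaOffset.add(font)
                break // detected: no need to check the remaining baseFont spans
            }
        }
    })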