zeroby0 opened this issue 3 years ago
I don't think anyone actually profiled it, so we don't know if the slow bit is loading/parsing the assets, or if it's the tracing of the font-related CSS properties for every HTML element. If it's the latter, we should look into optimization ideas, and memoization/caching could certainly play a part. If the HTML and all the CSS hasn't changed, we know that the tracing will yield the same result as last time.
I see!
Profiling seems worth doing. I'm new to the code base, but I'll try to do it. I have an idea :D
Great news! @papandreou
So I have done some profiling, and one font file takes almost the same time to process as one HTML file. Given that most projects use 3-4 fonts at most but have dozens of HTML files, memoization should help greatly.
| Number of files | HTML | Woff2 |
|---|---|---|
| 1 | 3.3 | 3.3 |
| 2 | 3.6 | 3.4 |
| 4 | 3.8 | 3.8 |
| 8 | 4.0 | 3.8 |
| 16 | 4.5 | 4.4 |
To measure HTML file processing time, I generated 11 folders with 2^n (n = 0..10) HTML files and one font each; all the HTML files use the same font. For font processing time, I generated 11 more folders with 2^n (n = 0..10) font files and one HTML file each; the HTML file uses all 2^n fonts.
Then, in each of these folders, I ran `npx subfont *.html -ris --dry` and timed it. I repeated this timing 5 times per folder; the variance in the times is what you see as error bars in the plot.
All the processing was done on a ramdisk. The font used is Inter 400 Regular (woff2). Of course, CSS files may have slightly different processing times, but the big picture is the same.
Here is a zip of the workspace I profiled with. `rfile.txt` and `rfont.txt` contain the times for the HTML-file and font runs, and `file.py` and `font.py` generate the folders used for profiling.
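For anyone who wants to reproduce this without the zip, here's a minimal sketch of what a generator like `file.py` could look like. The folder naming, the HTML boilerplate, and the function name are my own, not necessarily what the actual script does:

```python
# Hypothetical sketch of a workspace generator: create a folder with
# 2**n identical HTML files that all reference the same woff2 font.
import os

HTML = """<!DOCTYPE html>
<html><head><style>
@font-face { font-family: Inter; src: url(Inter.woff2) format("woff2"); }
body { font-family: Inter, sans-serif; }
</style></head><body>hello</body></html>
"""

def make_workspace(root, n):
    """Create root/html-<2**n>/ containing 2**n identical HTML pages."""
    folder = os.path.join(root, f"html-{2 ** n}")
    os.makedirs(folder, exist_ok=True)
    for i in range(2 ** n):
        with open(os.path.join(folder, f"page-{i}.html"), "w") as f:
            f.write(HTML)
    return folder
```

You'd then drop the font file into each folder and run `time npx subfont *.html -ris --dry` inside it.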
Actually we probably shouldn't / can't use git because we might be processing build artifacts of another stage in the pipeline, and they aren't tracked in git.
So we should maintain a hash table keyed by file hash, storing a shallow asset graph that includes inlined assets but not other files. Then we can construct the whole tree from the memoized bits plus the freshly calculated bits.
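The core idea could be sketched like this (language-agnostic; the function names and cache layout here are hypothetical, not subfont's actual API):

```python
# Sketch: memoize the per-file trace result, keyed by a content hash.
# If the bytes of the HTML (plus its inlined CSS) haven't changed,
# the trace will yield the same shallow asset graph as last time.
import hashlib

_trace_cache = {}  # content hash -> shallow asset graph

def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def traced(source: bytes, trace):
    """Return trace(source), reusing a cached result when the hash matches."""
    key = content_hash(source)
    if key not in _trace_cache:
        _trace_cache[key] = trace(source)
    return _trace_cache[key]
```

Keying on content rather than mtimes or git status sidesteps the build-artifact problem: untracked files hash the same either way.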
Hmm, it would also be interesting to `node --prof` it to see what the "per HTML" execution time is spent on: the HTML/CSS parsing, working out the CSS cascade, or the tracing of the font-related CSS properties per HTML element. I suspect it's the latter, and if that's the case, then you're right -- we could load the assets and compute that hash, then memoize the result of the trace with that as the key.
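For the V8 sampling profiler, the usual two-step workflow would be something like this (the path to subfont's bin script may differ depending on your setup):

```shell
# 1. Run subfont under the V8 sampling profiler; writes isolate-*-v8.log
node --prof node_modules/.bin/subfont *.html -ris --dry

# 2. Turn the tick log into a human-readable summary
node --prof-process isolate-*-v8.log > profile.txt
```

The `[Bottom up (heavy) profile]` section of `profile.txt` should show whether the ticks land in parsing, cascade computation, or tracing.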
I've read in the issues that spidering through the files is what takes the longest time.
Can we save the results of spidering in the Netlify cache, and use git to figure out what changed, so the next build only spiders the changed files?