gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0
75.94k stars 7.54k forks source link

Add pprof labels for better profiling insights #12820

Closed mschfh closed 2 months ago

mschfh commented 2 months ago

Add pprof labels for better CPU profiling insights

Background

When using pprof for CPU/memory profiling in Hugo (--profile-cpu, --profile-mem), the flamegraph output doesn't allow for a detailed breakdown to show which sections are actually consuming the most resources.

example flamegraph: image

The log with --printMemoryUsage also does not help to narrow down the issue, since it just prints overall memory usage:

Alloc = 2.0 GB
TotalAlloc = 101.8 GB
Sys = 3.3 GB
NumGC = 123

Feature Request

We propose adding pprof labels to Hugo's codebase. This addition would allow users to attribute CPU cycles and memory usage to specific formats/sections/kinds/templates, which is currently not possible.

Previous Work

A pull request (PR #8939) was previously submitted to implement this feature. Unfortunately, it was marked as stale and automatically closed before being reviewed.

For a detailed explanation of the issue and the proposed solution, please also refer to the author's blog post: https://www.markhansen.co.nz/pprof-tagroot/

Please consider reopening and reviewing PR #8939, or provide feedback on how to proceed with implementing this feature.

Thank you for considering this feature request, this enhancement would significantly improve Hugo's profiling capabilities, helping users optimize their build process.

cc @mhansen

bep commented 2 months ago

We propose adding pprof labels to Hugo's codebase.

Who is "we"?

mschfh commented 2 months ago

For such proposal to be considered it needs to come from one or more practical needs and from a Hugo developer. I don't see any of that in the above proposal from a GitHub user with zero commits in the Hugo repo.

While pprof profiles might not be the most user-friendly option, it would provide value to non-developers as well.

The practical need is debugging slow templates and high memory usage, which usually requires attributing the work done, in this case to certain templates/sections, --templateMetrics provides some insight into which template is slow, but it doesn't explain why.

An example would be this flamegraph: image

The issue was caused by an image partial that excessively used resources.GetMatch with globs, which caused repeated inefficient fs traversals.

With labels, this work would be attributed to the template: image (image source)

Who is "we"?

I'm assuming the initial PR author is also still proposing this.

mhansen commented 2 months ago

You're right to ask for practical use cases first. I don't think I made the case for use cases well enough before. Let me try now.

If Hugo developers are ever asked why someone's build is slow (I understand fast build times is a big selling point of Hugo), it would be helpful to be able to point users at some way to identify which template/page/section is responsible.

In https://www.markhansen.co.nz/attributing-go-cpu-profiles-with-pprof-tags/, I had a practical use case of figuring out why my build was slow, why iteration times were slow. I was able to identify why my build was slow: syntax highlighting. I decided it was worth it in the end and didn't remove the syntax highlighting, but it was nice to have the information needed to make that decision.

On Wed, 4 Sept 2024 at 05:11, mschfh @.***> wrote:

For such proposal to be considered it needs to come from one or more practical needs and from a Hugo developer. I don't see any of that in the above proposal from a GitHub user with zero commits in the Hugo repo.

While pprof profiles might not be the most user-friendly option, it would provide value to non-developers as well.

The practical need is debugging slow templates and high memory usage, which usually requires attributing the work done, in this case to certain templates/sections, --templateMetrics provides some insight into which template is slow, but it doesn't explain why.

An example would be this flamegraph: image.png (view on web) https://github.com/user-attachments/assets/60dec0ab-4e13-419a-bf76-75f230efa863

The issue was caused by an image partial that excessively used resources.GetMatch with globs, which caused repeated inefficient fs traversals.

With labels, this work would be attributed to the template: image.png (view on web) https://github.com/user-attachments/assets/17ab8aab-b263-462e-8a23-81b6c508f7c9 (image source https://www.markhansen.co.nz/pprof-tagroot/)

Who is "we"?

I'm assuming the initial PR author is also still proposing this.

— Reply to this email directly, view it on GitHub https://github.com/gohugoio/hugo/issues/12820#issuecomment-2327234949, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZYOOGKV3U5V6LBTQCW7LZUYCXRAVCNFSM6AAAAABNQ55PESVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMRXGIZTIOJUHE . You are receiving this because you were mentioned.Message ID: @.***>

mschfh commented 2 months ago

@bep Would you consider a new/rebased PR for this feature?

I believe this could also help with issues like https://github.com/gohugoio/hugo/issues/12654

bep commented 2 months ago

See my comment in https://github.com/gohugoio/hugo/issues/12654#issuecomment-2345427562

I'm closing this for now. Theres no evidence in the above showing that detailed pprof output is of any help debugging "slow builds". To the contrary, it just points to a very generic "render" func, which is where most work is being done in Hugo.

github-actions[bot] commented 1 month ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.