Requirements

[X] Is this a bug report? For questions or discussions use https://lemmy.ml/c/lemmy_support
[X] Did you check to see if this issue already exists?
[X] Is this only a single bug? Do not put multiple bugs in one issue.
[X] Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.

Summary

Prometheus metrics currently have performance issues due to label explosion of the endpoint label: every distinct request path becomes its own label value, and therefore its own set of time series. This is already handled for some endpoints (for example endpoint="/comment/{comment_id}" is reported instead of the actual ID) but not for others (for example endpoint="/feeds/u/user@instance.xml" and endpoint="/pictrs/image/image_uuid.webp").
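
To make the idea concrete, here is a minimal sketch of collapsing the unbounded path segments into a fixed placeholder before the path is used as a label value. The function name normalize_endpoint and the placeholder names {filename} and {name} are hypothetical and chosen for illustration only; this is not Lemmy's actual metrics code, and the real fix may belong in the route-matching layer instead.

```rust
/// Hypothetical helper (not part of Lemmy): maps a raw request path to a
/// low-cardinality label value by replacing the variable last segment of
/// known high-cardinality routes with a fixed placeholder.
fn normalize_endpoint(path: &str) -> String {
    // Split "/a/b/c" into ["a", "b", "c"].
    let segments: Vec<&str> = path.trim_start_matches('/').split('/').collect();

    match segments.as_slice() {
        // Image requests: the file name (UUID plus extension) is unbounded.
        ["pictrs", "image", _file] => "/pictrs/image/{filename}".to_string(),
        // User and community feeds: the name segment is unbounded.
        ["feeds", "u", _name] => "/feeds/u/{name}".to_string(),
        ["feeds", "c", _name] => "/feeds/c/{name}".to_string(),
        // Everything else is assumed to be templated already.
        _ => path.to_string(),
    }
}
```

With a mapping like this, normalize_endpoint("/pictrs/image/image_uuid.webp") yields "/pictrs/image/{filename}", so all image requests share one set of histogram series instead of creating a new set per image.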

Steps to Reproduce

Run pretty much any query involving lemmy_api_http_requests_duration_seconds_bucket, for instance: histogram_quantile(0.99, sum by(le) (rate(lemmy_api_http_requests_duration_seconds_bucket[$__rate_interval])))
Observe the very slow performance. For me, the query times out when the time range is set to the last 24 hours.

Technical Details

Logs will not help here: this is a performance bug, not a correctness issue. The system is working as designed, just with poor performance.

LemmyNet / lemmy

[Bug]: Bad Prometheus metric performance due to `endpoint` label explosion for images and feeds #4431
