parca-dev / parca

Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.
https://parca.dev/
Apache License 2.0

Building a continuous profiling platform based on Parca #1994

Open zdyj3170101136 opened 1 year ago

zdyj3170101136 commented 1 year ago

I am an engineer. Recently we built a continuous profiling platform based on Parca. I am excited to share it with you!

Use ClickHouse to store stacktraces

Parca's stacktraces are stored in memory, in FrostDB.

Problem

We found this has some problems:

High write amplification

With 80 KB/s of inbound network traffic, the machine's memory usage increases by 0.2 GB/s.

Slow profileRange and profileType

These two APIs take about 10 seconds. profileType has to query all data from FrostDB, and profileRange has to read every sample's value to compute the total value of each profile.

Solution

So we tried using ClickHouse to store profiles and stacktraces. We have two tables: an index table and a stacktrace table.

Index table:

| profilingID | totalValue | labels.key | labels.value |
|-------------|------------|------------|--------------|
| D6HugNjT6dV | 20000000 | ['instance','job'] | ['xxxxxx:12580','xxxxxx'] |

Stacktrace table:

| profilingID | value | stacktrace | timestamp |
|-------------|-------|------------|-----------|
| D6HugNjT6dV | 524328 | stacktrace | 2022-09-28 10:20:03 |

Because one profile can contain thousands of stacktraces, the stacktrace table is much larger than the index table.
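A minimal sketch of the two tables as ClickHouse DDL (the column types, the extra timestamp column on the index table, and the ORDER BY keys are assumptions for illustration, not our exact schema):

```sql
-- Hypothetical index table: one small row per profile,
-- carrying the precomputed total value and the labels.
CREATE TABLE parca_index_local
(
    profilingID    String,
    totalValue     UInt64,
    `labels.key`   Array(String),
    `labels.value` Array(String),
    timestamp      DateTime  -- assumed: needed for time-range queries
)
ENGINE = MergeTree
ORDER BY (timestamp, profilingID);

-- Hypothetical stacktrace table: thousands of rows per profile,
-- one per stacktrace, keyed by the owning profilingID.
CREATE TABLE parca_stacktrace_local
(
    profilingID String,
    value       UInt64,
    stacktrace  String,
    timestamp   DateTime
)
ENGINE = MergeTree
ORDER BY (profilingID, timestamp);
```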

Queries are very fast

Now the profileRange and profileType APIs only need to query the index table. It takes less than 100 ms to compute profileRange over thousands of profile series, and less than 100 ms to fetch a single profile from ClickHouse.
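For illustration, a profileRange-style query against the index table sketched above only scans the small index rows (the time window and selected columns here are assumptions):

```sql
-- profileRange never touches the large stacktrace table:
-- the per-profile total value is already precomputed in the index.
SELECT profilingID, totalValue, timestamp
FROM parca_index_local
WHERE timestamp BETWEEN toDateTime('2022-09-28 10:00:00')
                    AND toDateTime('2022-09-28 11:00:00')
ORDER BY timestamp;
```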

High compression ratio

The storage compression ratio is more than 7: about 4 TB of raw data compresses down to roughly 500 GB.

┌─table──────────────────┬────marks─┬────────rows─┬─compressed─┬─uncompressed─┬─compression_ratio─┬─bytes_per_row─┬─pk_in_memory─┐
│ parca_index_local      │    48076 │    49040727 │ 4.43 GiB   │ 6.60 GiB     │              1.48 │ 96.99 B       │ 359.28 KiB   │
│ parca_stacktrace_local │ 16047052 │ 16432042504 │ 531.15 GiB │ 3.91 TiB     │              7.54 │ 34.71 B       │ 1.45 GiB     │
└────────────────────────┴──────────┴─────────────┴────────────┴──────────────┴───────────────────┴───────────────┴──────────────┘
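Per-table statistics like the above can be reproduced with a standard aggregation over ClickHouse's system.parts (a sketch, not necessarily the exact query we ran):

```sql
SELECT
    table,
    sum(marks)                                       AS marks,
    sum(rows)                                        AS rows,
    formatReadableSize(sum(data_compressed_bytes))   AS compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS compression_ratio,
    round(sum(data_compressed_bytes) / sum(rows), 2) AS bytes_per_row,
    formatReadableSize(sum(primary_key_bytes_in_memory)) AS pk_in_memory
FROM system.parts
WHERE active AND table LIKE 'parca_%_local'
GROUP BY table;
```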

Visualizing source and numLabel

We use github.com/google/pprof to visualize profiles. It lets users view source code and numLabels.

Use ClickHouse to store metadata

The metadata was stored in BadgerDB, which did not allow us to deploy multiple Parca servers.

It also sometimes caused more than 700% iowait CPU during compaction (on a 100 GB SSD with 30 GB used).

So we use ClickHouse to store metadata as well (Parca should rely on only one kind of store), with a local cache to deduplicate keys.
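A minimal sketch of such a metadata table, assuming a ReplacingMergeTree so that duplicate keys that slip past the local cache are collapsed in the background (the engine choice and the key/value columns are assumptions):

```sql
-- Hypothetical metadata table: ReplacingMergeTree deduplicates rows
-- with the same ORDER BY key during background merges, so re-inserting
-- an already-seen key is harmless.
CREATE TABLE parca_metadata_local
(
    key   String,
    value String
)
ENGINE = ReplacingMergeTree
ORDER BY key;
```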

High compression ratio

The compression ratio is close to 7.

┌─table────────────────┬─marks─┬─────rows─┬─compressed─┬─uncompressed─┬─compression_ratio─┬─bytes_per_row─┬─pk_in_memory─┐
│ parca_metadata_local │ 12100 │ 12371734 │ 1.72 GiB   │ 11.82 GiB    │              6.87 │ 149.36 B      │ 1.67 MiB     │
└──────────────────────┴───────┴──────────┴────────────┴──────────────┴───────────────────┴───────────────┴──────────────┘

Great search speed

With about 5000 stacktraces, resolving them all takes less than 1 s.
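For illustration, resolving a batch of stacktraces then amounts to a primary-key lookup on the metadata table sketched above (the key values here are hypothetical):

```sql
-- Batch lookup over the ORDER BY key; ClickHouse only reads the
-- granules containing the requested keys.
SELECT key, value
FROM parca_metadata_local
WHERE key IN ('loc-1', 'loc-2', 'loc-3');
```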

Manual scrape

We added a manual-scrape HTML page: the user enters an IP, port, and endpoint, clicks 'allocs', and is automatically redirected to Parca.

MUlan2004 commented 1 year ago

Do you have a PR/fork available that incorporates ClickHouse as the storage backend?