parca-dev / parca

Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.
https://parca.dev/
Apache License 2.0

Data disappeared after pod restart with Google Storage Bucket #3048

Open edmeister opened 1 year ago

edmeister commented 1 year ago

I'm setting up Parca for a POC, using the official Helm chart. I have configured a Google Storage Bucket for persistence, and I can tell the bucket is actually being used. So far, so good.

However, when the pod is restarted because of a configuration change, the data from before the restart no longer shows up in the UI when searching, even though it still appears to be in the bucket.

The relevant part of my Helm values file looks like this (the variables are Terraform templating):

server:
  config:
    object_storage:
      bucket:
        type: GCS
        config:
          bucket: ${bucket}
          service_account: ${gcpServiceAccount}
        prefix: "/parca"

This is rendered in /var/parca/parca.yaml as:

object_storage:
  bucket:
    config:
      bucket: my-redacted-bucket
      directory: ./tmp
      service_account: |
        {
          "type": "service_account",
          ...
        }
    prefix: /parca
    type: GCS

Is something wrong or missing in my config or is this a known issue? Thanks for helping out!

asubiotto commented 1 year ago

Could you also share the flags that you started the parca server with?

edmeister commented 1 year ago

      /parca
      --config-path=/var/parca/parca.yaml
      --log-level=info
      --cors-allowed-origins=*
      --storage-active-memory=536870912

asubiotto commented 1 year ago

Try adding --enable-persistence
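
For reference, a sketch of what the invocation shared above would look like with that flag added (nothing else changed):

      /parca
      --config-path=/var/parca/parca.yaml
      --log-level=info
      --cors-allowed-origins=*
      --storage-active-memory=536870912
      --enable-persistence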

edmeister commented 1 year ago

Ok, that seems to expect a writable volume inside the container. Mounting a PV is not yet supported by the Helm chart, and is something I would like to avoid (hence the GCS config).

level=error name=parca ts=2023-05-02T07:54:49.967128149Z caller=parca.go:236 msg="failed to open badger database for metastore" err="Error Creating Dir: \"data/metastore\" error: mkdir data: permission denied"
level=error name=parca ts=2023-05-02T07:54:49.967150909Z caller=main.go:66 msg="Program exited with error" err="Error Creating Dir: \"data/metastore\" error: mkdir data: permission denied"

I've set --storage-path to /tmp, so it starts up correctly, and it now seems to be building a timeline starting from the moment I enabled the persistence flag, but the icicle graph returns an error.
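
In other words, the persistence-related flags added on top of the original invocation at this point are (a sketch of the current state):

      --enable-persistence
      --storage-path=/tmp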

[screenshot: icicle graph error in the UI]

The gRPC call returns RpcError: read stacktrace metadata: read stacktraces: Key not found.

asubiotto commented 1 year ago

Seems like something I've run into in the past: https://github.com/polarsignals/frostdb/issues/378. Does the error disappear after it has been running for a while?

brancz commented 1 year ago

Hmm, yeah, this is a difficult one. We should probably store stacktraces in the columnstore directly, and not fail on reading from the metastore; then, like you said Alfonso, it should recover over time.
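
A rough sketch of that "don't fail on a missing metastore key" idea, using hypothetical types and names rather than Parca's actual internals:

// Hypothetical sketch only; the Metastore interface, Stacktrace type,
// and ErrKeyNotFound are illustrative stand-ins, not Parca's real API.
package metastore

import (
    "context"
    "errors"
)

var ErrKeyNotFound = errors.New("key not found")

type Stacktrace struct {
    ID          string
    LocationIDs []string
}

type Metastore interface {
    Stacktrace(ctx context.Context, id string) (Stacktrace, error)
}

// resolveStacktraces tolerates missing metadata: samples whose stacktrace
// key is gone (e.g. after a restart without persisted metastore data) are
// skipped so the query still renders, and new writes repopulate them over time.
func resolveStacktraces(ctx context.Context, ms Metastore, ids []string) ([]Stacktrace, error) {
    out := make([]Stacktrace, 0, len(ids))
    for _, id := range ids {
        st, err := ms.Stacktrace(ctx, id)
        if errors.Is(err, ErrKeyNotFound) {
            continue // degrade gracefully instead of failing the whole read
        }
        if err != nil {
            return nil, err
        }
        out = append(out, st)
    }
    return out, nil
}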

edmeister commented 1 year ago

Ok, I had reverted to a setup with ephemeral storage, since our devs got hooked on the tool in no time, but I'll spin up a second instance with persistence enabled and let you know if it recovers by itself.

Awesome product! We're very interested in beta access to Polar Signals.