Unable to run hashicorp vault with "file" storage backend

robertocarvajal commented 7 months ago

Is this a regression?

No

Description

I was able to set up a local HashiCorp Vault with the "file" storage backend. It created the default wallet correctly but failed when adding the first DID peer.

Please provide the exception or error you saw

m3-issuer-prism-agent-1   | Response body: {"errors":["1 error occurred:\n\t* open /vault/data/logical/b6134311-005b-2744-477b-259c225c78fe/a6f1a24f-4693-d250-d371-8ddab72208fb/metadata/8GdHsvboNsrDeGnP8MxCiyIhR8WajBIMitq7GgVfQhS6J0OCSdPmTw1cwCget3uDGsY5CzhlAPKxey42dZ1on8zfCB7wE6zMt28pNKoGuzwKSyOziWD9Cdi9L/5kKXPnw9EiktVFBnVeYCl6q6IPgZ3KR1okxcdXQKXz5xlatNFW7VOP1ED2VLY1/5kKQkEa8jsLZQu1PkOjMbe5Nnk7wzlBaXU0YIlNiUKc5S4qSzJLsBem7VdTbGJ/7ujkCh6BErqSYoCKo7OS4vi3EtMjj6Sn49mTgJtWKCBA9fuXyzXT6HqOF7SN8QvxuS9ANLoSpKXurRdcMSImLyTlziMTL7PeDKOklmrzUiRhHHyNprd91W87yUZ3WVP8zxjqP6CtZIfN4JEA03jZsHyMxutYmfuke0KjXcSW2D4HMHtc1s7CIPb369sk0pmYvRO0jaWwhucaS704LcYaqaWyRxaAgIe372aSjUSM91SUgQm9xRw4Ye2XpHnDOrKGDihCKu7OESA9XX588pQOWAR5esvSYkotnK9WqjQ5hPUFi6RX6iuvCeRIyoe8baSVvUI1LR0pGdRECaP1T2UfTbVQYzybJ7eNDVUWeFr8EV8vJc9EhSumuk5WOvhUqDSgV8WXvgWvBANJ0bAMVU9Kq87NQGyeN7pdhbmEFS/5kL62X9U1wD59d3GAHB9DTCINqOWwh9MhsjogAyugwFUUYE62URQJtG6F0e0hL/_1TfEf5dV1nuPdkU5G8g4Nnnz4Alw5xVa4bJBQaqh2Gg5pSC6rYy0ynu45kXQmT43z1Jc1Bo2V: file name too long\n\n"]}

Please provide the environment you discovered this bug in

Open Enterprise Agent v1.25.0

Anything else?

This works when using "dev" mode, meaning the vault storage backend is "inmem" (in memory).

The problem seems to be that the secret path of the DID peer is too long. I have not tested this with the production "raft" storage backend; I am only trying to run HashiCorp Vault locally with a simple storage engine.

mkbreuningIOHK commented 7 months ago

@yshyn-iohk to triage

yshyn-iohk commented 7 months ago

@robertocarvajal, could you provide your docker-compose file or any other details on how you run the Agent and the Vault?

robertocarvajal commented 7 months ago

Sure, the docker-compose vault-server entry looks like this:

  vault-server:
    image: hashicorp/vault:latest
    ports:
      - "127.0.0.1:8200:8200"
    environment:
      VAULT_ADDR: ${VAULT_ADDR}
      VAULT_DEV_ROOT_TOKEN_ID: ${VAULT_DEV_ROOT_TOKEN_ID}
    volumes:
      - ./config.hcl:/vault/config/config.hcl:ro
    command: server
    #command: server -dev -dev-root-token-id=${VAULT_DEV_ROOT_TOKEN_ID}
    cap_add:
      - IPC_LOCK
    healthcheck:
      test: ["CMD", "vault", "status"]
      interval: 20s
      timeout: 5s
      retries: 10

Basically, I'm passing a custom config to set up my persistent-storage vault. The config.hcl looks like this:

ui = true

listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  # Enable unauthenticated metrics access (necessary for Prometheus Operator)
  #telemetry {
  #  unauthenticated_metrics_access = "true"
  #}
}

#storage "inmem" {}

storage "file" {
  path = "/vault/data"
}

# Example configuration for using auto-unseal, using Google Cloud KMS. The
# GKMS keys must already exist, and the cluster must have a service account
# that is authorized to access GCP KMS.
#seal "gcpckms" {
#   project     = "vault-helm-dev"
#   region      = "global"
#   key_ring    = "vault-helm-unseal-kr"
#   crypto_key  = "vault-helm-unseal-key"
#}

# Example configuration for enabling Prometheus metrics in your config.
#telemetry {
#  prometheus_retention_time = "30s"
#  disable_hostname = true
#}

secrets {
  enable = true
}

path "secret/*" {
  backend = "kv"
  version = 2
}

When I create the agent from scratch, I first connect to the vault container, init the vault, unseal it, and then create the secrets kv store.

$ docker exec -i -t issuer-vault-server-1 /bin/sh
# vault operator init
... copy/backup the vault token and unseal keys ...
# export VAULT_TOKEN=_vault_token_here_
# vault operator unseal
... unseal is done 3 times until the vault is unsealed ...
# vault secrets enable -version=2 -path=secret kv

Then I stop the agent, set the VAULT_TOKEN on the agent, start the agent again, and unseal the vault again. Once the vault is unsealed and the agent has the VAULT_TOKEN, it creates the default wallet seed. That all works. The problem arises when I connect to the agent and it needs to create a DID peer.
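Since the agent can only reach its secrets once the vault is unsealed, a small wait loop may help when scripting the restart described above. This is only a sketch: `wait_unsealed` and `WAIT_SECONDS` are hypothetical names, and it relies on `vault status` exiting with 0 when unsealed, 2 when sealed, and 1 on error.

```shell
# wait_unsealed [TRIES]  (hypothetical helper, not part of the agent or Vault)
# Poll `vault status` until it reports an unsealed vault.
# Returns 0 once unsealed, 1 on error or when TRIES polls are used up.
wait_unsealed() {
  tries="${1:-10}"
  while [ "$tries" -gt 0 ]; do
    # `vault status` exit codes: 0 = unsealed, 2 = sealed, 1 = error
    if vault status >/dev/null 2>&1; then
      rc=0
    else
      rc=$?
    fi
    case "$rc" in
      0) return 0 ;;   # unsealed, safe to start the agent
      2) ;;            # still sealed, keep waiting
      *) return 1 ;;   # connection or other error
    esac
    tries=$((tries - 1))
    # WAIT_SECONDS is an assumed knob for the polling interval
    sleep "${WAIT_SECONDS:-2}"
  done
  return 1
}
```

Usage would be something like `wait_unsealed 30 && docker compose up -d`, run with VAULT_ADDR pointing at the vault. The same exit codes also mean that the `vault status` healthcheck in the compose file reports unhealthy for as long as the vault is sealed.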

patlo-iog commented 7 months ago

Hi @robertocarvajal, thanks for raising this.

I tested with the provided configuration and it is indeed a problem. Currently, the logical path for the did:peer is defined here. The physical path ends up somewhat longer than the logical one, and the filesystem backend cannot handle the convention we're using. We're considering moving to a fixed-length hash for some path segments (especially the DID string) to alleviate the issue on some backends and operating systems. This has some implications for the policy configuration, but it is not a showstopper.

In the meantime, SECRET_STORAGE_BACKEND=postgres can be used for local development. Or, if you build from source and want to use Vault, you can update this line to this as a workaround while we work on a fix:

s"secret/${walletId.toUUID}/dids/peer/${did.value.take(20)}/keys/$keyId"
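For comparison with the truncation above, the fixed-length hash idea could look roughly like the following. This is an illustrative sketch only: `did_path_segment` and `secret_path` are hypothetical helpers, SHA-256 is an assumed choice of hash, and nothing here is taken from the agent's code. It assumes `sha256sum` from coreutils is available.

```shell
# did_path_segment DID  (hypothetical helper)
# Print the SHA-256 digest of DID as 64 hex characters, so the segment
# has the same length no matter how long the DID string is.
did_path_segment() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# secret_path WALLET_ID DID KEY_ID  (hypothetical helper)
# Build a path with the same layout as the Scala line above, but with the
# DID replaced by its digest instead of its first 20 characters.
secret_path() {
  printf 'secret/%s/dids/peer/%s/keys/%s\n' "$1" "$(did_path_segment "$2")" "$3"
}
```

Unlike `take(20)`, a digest depends on the whole DID, so two DIDs that share a prefix still map to different segments. The trade-off is that the path no longer shows the DID itself.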

Any suggestions are welcome :smile:

robertocarvajal commented 7 months ago

@patlo-iog thank you for looking into this!

Yeah, the physical path is longer because HashiCorp Vault seems to encrypt the path, resulting in an even longer filename that exceeds the usual 255-character limit on common Linux filesystems. I guess I'll first try a specialized filesystem and mount a volume that supports very long filenames. I'm not familiar enough with the agent's internals to suggest a robust solution there, unless it is a mix of DB and Vault: use a UUID for the DID peer path, look it up in the DB by the DID peer string, and then read the keys from Vault. That's my naive solution without really knowing the internals of the agent :)
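To check whether a given physical path would hit that limit, a quick sketch like the one below measures the longest path component. `longest_segment` and `fits_name_max` are hypothetical helpers, and the default limit of 255 is an assumption; where `getconf` is available, `getconf NAME_MAX /vault/data` should report the actual value for the mounted filesystem.

```shell
# longest_segment PATH  (hypothetical helper)
# Print the length, in characters, of the longest '/'-separated
# component of PATH. For ASCII paths this equals the length in bytes.
longest_segment() {
  printf '%s\n' "$1" | tr '/' '\n' |
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# fits_name_max PATH [LIMIT]  (hypothetical helper)
# Succeed if every component of PATH is at most LIMIT characters long.
# LIMIT defaults to 255, an assumed value for common Linux filesystems.
fits_name_max() {
  [ "$(longest_segment "$1")" -le "${2:-255}" ]
}
```

Pasting the path from the error message into `longest_segment` shows which component is over the limit. The total path length is not what matters for this particular error; the limit applies to each file or directory name on its own.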

patlo-iog commented 7 months ago

Hey @robertocarvajal, this change should solve it: https://github.com/hyperledger-labs/open-enterprise-agent/pull/918. Could you try the new version 1.30.1 with the option VAULT_USE_SEMANTIC_PATH=false? It changes the path structure, so you might not be able to read old data, but it does allow backends with a limited path length.

I also tested with raft storage; it allows pretty long secret paths, so I figure it's best to let users pick what is right for them. Docs here.
