sivaramsk closed this issue 3 years ago
@sivaramsk are all the resources, like vault-reviewer and the namespace, created? Most likely the vault-auth needs to be recreated, which is not happening because the items already exist in Vault.
I tested to confirm whether the secrets are getting created. I do see the Vault-related secrets getting created after I run deploy-network.yaml:
[23-Oct-2020 08:06:49 AM ] ❯ kg secrets -n org2-net
NAME                         TYPE                                  DATA   AGE
default-token-4qk6q          kubernetes.io/service-account-token   3      6m6s
regcred                      kubernetes.io/dockerconfigjson        1      3m4s
vault-auth-token-spcmh       kubernetes.io/service-account-token   3      6m6s
vault-reviewer-token-rqwg6   kubernetes.io/service-account-token   3      6m5s
I did another test,
It is hard for me to explain the problem, but I will try. The secrets get created as I described above, and the ca-server has the same issue as before.
siva@MacBook in ~/projects/go/src/andromeda-2 on master via default took 6s
[23-Oct-2020 11:22:17 AM ] ❯ kg secrets -n org1-net
NAME                                                   TYPE                                  DATA   AGE
azure-storage-account-f0f56c83d25474823ae035c-secret   Opaque                                2      4m9s
azure-storage-account-f1eb3c9678bdb40448e4631-secret   Opaque                                2      4m3s
default-token-b5jnd                                    kubernetes.io/service-account-token   3      4m39s
regcred                                                kubernetes.io/dockerconfigjson        1      36s
vault-auth-token-rn4zt                                 kubernetes.io/service-account-token   3      4m39s
vault-reviewer-token-tpnms                             kubernetes.io/service-account-token   3      4m39s
Every pod that was running in the 1st cluster gets started in the 2nd cluster at the same time, at some point during deploy-network, each throwing an error. It is very hard to explain what I see there.
I think we will have to delete the auth-path from Vault before running deploy-network again. The REVIEWER_TOKEN is regenerated along with the secret, but the command that writes it to Vault is not run if the auth-path already exists.
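As an alternative to deleting the auth-path, the stale reviewer token could in principle be refreshed by hand. The following is a minimal sketch, not BAF's own playbook logic: it assumes `kubectl` points at the rebuilt cluster and the Vault CLI is already authenticated. The function names, the API server URL, and the CA file path are hypothetical placeholders. `kubernetes_host` and `kubernetes_ca_cert` are passed along with the token because a rebuilt cluster will usually have a new endpoint and CA.

```shell
#!/usr/bin/env bash
# Sketch only: re-point an existing Vault Kubernetes auth mount at a rebuilt cluster.
# All argument values are supplied by the caller; nothing here is taken from BAF itself.

# Read the service-account JWT out of a token secret in the given namespace.
reviewer_token() {
  local namespace="$1" secret="$2"
  kubectl get secret "$secret" -n "$namespace" -o jsonpath='{.data.token}' | base64 --decode
}

# Overwrite the config of an existing Kubernetes auth mount with fresh values.
refresh_auth_config() {
  local auth_path="$1" namespace="$2" secret="$3" k8s_host="$4" ca_cert_file="$5"
  local token
  token="$(reviewer_token "$namespace" "$secret")" || return 1
  vault write "auth/${auth_path}/config" \
    token_reviewer_jwt="$token" \
    kubernetes_host="$k8s_host" \
    kubernetes_ca_cert=@"$ca_cert_file"
}

# Usage (hypothetical values; uncomment to run against a real cluster and Vault):
# refresh_auth_config "org2-net-auth" "org2-net" "vault-reviewer-token-rqwg6" \
#   "https://my-cluster.example:6443" "./ca.crt"
```

Whether refreshing in place is enough, or whether the roles under the auth-path also need recreating, depends on how the original mount was set up, so treat this as a starting point for experimentation.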
@sownak - I don't understand the below path
"vault write auth/{{ auth_path }}/config token_reviewer_jwt="$REVIEWER_TOKEN""
Where in Vault is auth/? The "vault secrets list" command gives me the output below, and I don't see an auth entry in that list:
[23-Oct-2020 03:38:43 PM ] ❯ vault secrets list
Path          Type         Accessor              Description
----          ----         --------              -----------
cubbyhole/    cubbyhole    cubbyhole_11a9cedc    per-token private secret storage
identity/     identity     identity_1fcdca0b     identity store
secret/       kv           kv_d1ec59c3           n/a
sys/          system       system_cb16fbf3       system endpoints used for control, policy and debugging
Can you clarify how to delete this token?
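For anyone landing here with the same question: auth methods are mounted separately from secrets engines, so `auth/` paths show up under `vault auth list` rather than `vault secrets list`, and a mount is removed with `vault auth disable`. A minimal sketch follows, assuming the Vault CLI is already logged in with sufficient privileges. The helper names and the example path `org2-net-auth` are hypothetical; use whatever `auth_path` resolves to in your own configuration.

```shell
#!/usr/bin/env bash
# Sketch only: remove a stale auth mount so deploy-network can recreate it.
# The path passed in is whatever {{ auth_path }} resolves to; the example below is made up.

# Succeeds if an auth method is mounted at the given path.
auth_path_exists() {
  local path="$1"
  # `vault auth list` prints one mount per line, with the path (trailing slash) in column 1.
  vault auth list | awk '{print $1}' | grep -qxF "${path}/"
}

# Disable the auth mount if it is present; disabling also drops its config and roles.
remove_auth_path() {
  local path="$1"
  if auth_path_exists "$path"; then
    vault auth disable "$path"
  else
    echo "auth path '${path}' not found, nothing to do"
  fi
}

# Usage (hypothetical path; uncomment to run against a real Vault):
# remove_auth_path "org2-net-auth"
```

Checking for the mount first keeps the helper safe to re-run between deploy attempts.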
@sownak - I can confirm once I deleted auth-path in the vault, the ca pods came up and the orderers and peers also came up.
A few observations:
TASK [create/crypto/peer : Copy msp cacerts from auto-generated path to given path] ******************************************************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Could not find or access './build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem'\nSearched in:\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/files/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/tasks/files/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/hyperledger-fabric/configuration/roles/create/crypto/peer/tasks/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/shared/configuration/../../hyperledger-fabric/configuration/files/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem\n\t/home/blockchain-automation-framework/platforms/shared/configuration/../../hyperledger-fabric/configuration/./build/crypto-config/peerOrganizations/org1-net/peers/peer0.org1-net/msp/cacerts/ca-org1-net-7054.pem on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}
PLAY RECAP ***
localhost : ok=309 changed=99 unreachable=0 failed=1 skipped=435 rescued=0 ignored=0
I am not sure whether I am testing the right thing here. What I am trying to confirm is this: if I lose a Kubernetes cluster that runs one or two organizations, or everything, how do I recover a BAF cluster?
With the method we currently use to deploy Fabric in Kubernetes, I was able to successfully recover the network using a velero backup with a bit of manual wrangling.
Closing this ticket as the CA had actually come up. I am going to open a specific ticket to discuss BAF DR.
Describe the bug: As part of DR testing, I tried to simulate a lost organization and then deploy the organization again, but the CA did not come up.
To Reproduce: Steps to reproduce the behavior:
Expected behavior: The CA and the peer node of org2 are expected to come up and join the network.
Screenshots: The CA pod has issues coming up, with the below error
Environment (please complete the following information):
Additional context: This test is part of my conversation in the rocket chat - https://chat.hyperledger.org/channel/blockchain-automation-framework?msg=epvnAJNwXR7YNEqgv