gantree-io / gantree-lib-nodejs

Javascript lib for gantree-cli
Apache License 2.0

TASK [validator-key-insert : grab keys] fails with localhost port 9933: Connection refused #28

Closed: brenzi closed this issue 4 years ago

brenzi commented 4 years ago

I'm trying to deploy https://github.com/encointer/encointer-node with this config:

{
    "project": "encointer testnet",
    "repository": {
        "url": "https://github.com/encointer/encointer-node",
        "binaryName": "encointer-node"
    },
    "validators": {
        "loggingFilter": "runtime=debug,txpool=debug",
        "telemetry": true,
        "nodes": [
            {
                "provider": "do",
                "machineType": "s-2vcpu-4gb",
                "count": 3,
                "zone": "fra1",
                "sshUser": "root"
            }
        ]
    }
}

Building works, but the deployment fails at this step:

PLAY [validator] ***************************************************************

TASK [Gathering Facts] *********************************************************

ok: [142.93.99.163]

ok: [157.230.97.230]

ok: [157.230.97.156]

TASK [validator-key-insert : ensure curl] **************************************

ok: [157.230.97.230]

ok: [142.93.99.163]

ok: [157.230.97.156]

TASK [validator-key-insert : grab keys] ****************************************

failed: [142.93.99.163] (item=aura) => {"ansible_loop_var": "item", "changed": true, "cmd": "if [ aura = gran ]; then\n  crypto=\"ed25519\"\nelse\n  crypto=\"sr25519\"\nfi\ninspect_result=$(cat /home/subuser/mnemonic | xargs --null /usr/local/bin/subkey --${crypto} inspect)\npublic_key=$(echo -n \"${inspect_result}\" | grep \"Public key\" | cut -d':' -f2 | tr -d '[:space:]')\nmnemonic=$(cat /home/subuser/mnemonic)\n\nprintf \"${public_key}\"\nprintf \"${mnemonic}\"\n\ncurl http://localhost:9933 -H \"Content-Type:application/json\" -d  \"{ \\\n    \\\"jsonrpc\\\":\\\"2.0\\\", \\\n    \\\"id\\\":1, \\\n    \\\"method\\\":\\\"author_insertKey\\\", \\\n    \\\"params\\\": [ \\\n      \\\"aura\\\", \\\n      \\\"${mnemonic}\\\", \\\n      \\\"${public_key}\\\" \\\n    ] \\\n  }\"\n", "delta": "0:00:00.102492", "end": "2020-03-08 18:16:19.227393", "item": "aura", "msg": "non-zero return code", "rc": 7, "start": "2020-03-08 18:16:19.124901", "stderr": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 9933: Connection refused", "stderr_lines": ["  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current", "                                 Dload  Upload   Total   Spent    Left  Speed", "", "  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 9933: Connection refused"], "stdout": "0x54d6fd80e3482fe6f04cecd78e59018b4560e67eb204fe3d74c763310d320727live attend rude blast camp scan identify provide lift rookie wisdom recycle", "stdout_lines": ["0x54d6fd80e3482fe6f04cecd78e59018b4560e67eb204fe3d74c763310d320727live attend rude blast camp scan identify provide lift rookie wisdom recycle"]}

...

PLAY RECAP *********************************************************************

142.93.99.163              : ok=54   changed=23   unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
157.230.97.156             : ok=26   changed=11   unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
157.230.97.230             : ok=26   changed=11   unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
localhost                  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

[Gantree] Execution failed with code 2: ansible-playbook main.yml -f 30 -i "/root/.config/gantree-cli/build/encointer testnet/ansible/inventory"
[Gantree] Could not sync application: 2
morelazers commented 4 years ago

Hey @brenzi - try removing the space from the project name (make it encointer-testnet). You will first have to clean up your existing infrastructure; otherwise the SSH keys will conflict on DigitalOcean.

This should work for now. I'll patch this so that project names with spaces are rejected in future.
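A minimal sketch of the kind of check described above, assuming a POSIX shell. The `is_valid_project_name` and `sanitize_project_name` helpers are hypothetical and not part of gantree; they only illustrate rejecting or rewriting a project name that contains whitespace before it ends up in build paths and SSH key names.

```shell
# Hypothetical helpers, not part of gantree-cli or gantree-lib.

# Succeeds only if the name is non-empty and contains no whitespace.
is_valid_project_name() {
  case "$1" in
    '' | *[[:space:]]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Replaces each run of whitespace with a single hyphen.
sanitize_project_name() {
  printf '%s' "$1" | sed 's/[[:space:]]\{1,\}/-/g'
}

# Example usage:
#   is_valid_project_name "encointer testnet" || echo "name contains spaces"
#   sanitize_project_name "encointer testnet"
```

Rejecting the name outright is the stricter choice; rewriting it silently would make the directory and key names differ from what the config file says.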

brenzi commented 4 years ago

That didn't help. Still the same issue.

morelazers commented 4 years ago

In that case, I may have already patched something that lets it succeed on my latest development version but not on the release.

I'll ping you when we have a release ready to test. Otherwise, you will have to try the latest dev version with a config file that looks like this (note the slight namespace changes: the repository URL and binary name now live under "binary"):

{
  "project": "encointer-testnet",
  "binary": {
    "repository": "https://github.com/encointer/encointer-node",
    "name": "encointer-node"
  },
  "validators": {
    "loggingFilter": "runtime=debug,txpool=debug",
    "telemetry": true,
    "nodes": [
      {
        "provider": "do",
        "machineType": "s-2vcpu-4gb",
        "count": 3,
        "zone": "fra1",
        "sshUser": "root"
      }
    ]
  }
}
morelazers commented 4 years ago

Another way to get a crash report for me to look into is to SSH into the failing machine and run:

journalctl --unit substrate.service --lines 500 -f

Then see if you can find the lines that explain why the substrate binary is not running.
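A minimal sketch for narrowing down that log output, assuming a POSIX shell on the node. The `crash_lines` helper and its keyword list are hypothetical and not part of gantree; the pattern is only a guess at what a crashing binary might print, so adjust it to the actual output.

```shell
# Hypothetical helper: keep only log lines that look like failures.
# Reads log text on stdin; the keyword list is an assumption.
crash_lines() {
  # "|| true" keeps the exit status 0 when nothing matches.
  grep -iE 'error|panic|fatal|killed' || true
}

# Example usage on the failing machine (without -f, so the command exits):
#   systemctl status substrate.service --no-pager
#   journalctl --unit substrate.service --lines 500 --no-pager | crash_lines
```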

DrTexx commented 4 years ago

@brenzi are you still experiencing this issue with the latest release?

If so, please provide us with output from the node as mentioned by @morelazers above.

This error occurs when Ansible cannot communicate with the deployed binary because the binary has crashed, so nothing is listening on port 9933.
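To tell a node that is still starting apart from one that has crashed, you can poll the RPC port by hand. This is a rough sketch, assuming curl is installed and the node serves JSON-RPC over HTTP on port 9933 as in the failing task. The `wait_for_rpc` helper is hypothetical and not part of gantree; `system_health` is a standard Substrate RPC method, but the sketch assumes this node exposes it.

```shell
# Hypothetical helper: poll a JSON-RPC endpoint until it answers.
# Usage: wait_for_rpc [url] [attempts] [delay_seconds]
wait_for_rpc() {
  rpc_url="${1:-http://localhost:9933}"
  rpc_attempts="${2:-30}"
  rpc_delay="${3:-2}"
  rpc_try=1
  while [ "$rpc_try" -le "$rpc_attempts" ]; do
    # curl exits with code 7 on "connection refused", as in the log above.
    if curl --silent --fail --max-time 5 --output /dev/null \
         -H "Content-Type: application/json" \
         -d '{"jsonrpc":"2.0","id":1,"method":"system_health","params":[]}' \
         "$rpc_url"; then
      return 0
    fi
    rpc_try=$((rpc_try + 1))
    sleep "$rpc_delay"
  done
  return 1
}

# Example usage:
#   wait_for_rpc http://localhost:9933 30 2 || echo "RPC never came up"
```

If the port never opens after a reasonable wait, the binary has most likely exited, and the service logs are the next place to look.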

DrTexx commented 4 years ago

@brenzi do you still have this issue in the latest version? At the time of this comment, the latest gantree-lib version is v0.6.6.

You can check both your CLI and lib versions at once with the CLI:

$ gantree-cli --version
gantree-cli 0.8.4
⮡ gantree-lib 0.6.6
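A small sketch for checking the lib version from a script, assuming the output format shown above stays the same. The `lib_version` and `version_at_least` helpers are hypothetical and not part of gantree; `version_at_least` also assumes a `sort` that supports `-V` (as in GNU coreutils).

```shell
# Hypothetical helpers, not part of gantree-cli.

# Reads `gantree-cli --version` output on stdin and prints the lib version.
lib_version() {
  awk '/gantree-lib/ { print $NF }'
}

# Succeeds if version $1 is greater than or equal to version $2.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example usage:
#   have=$(gantree-cli --version | lib_version)
#   version_at_least "$have" 0.6.6 || echo "gantree-lib is older than 0.6.6"
```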