iotaledger / one-click-tangle

"One Click Tangle" intends to make lives easier to IOTA adopters by providing pre-configured scripts and recipes that allow to deploy IOTA Networks and Nodes "in one click".
MIT License

one-click-tangle broken #85

Open IoTAdri opened 1 year ago

IoTAdri commented 1 year ago

Bug description

When using private-tangle.sh, many error messages occur:

WARN[0000] network tangle: network.external.name is deprecated. Please set network.name with external: true

and

Waiting coordinator bootstrap to stop gracefully...
Error response from daemon: No such container: 2385239fcb05553225d6e5238fb94b540dc23b88ea1adabeac7ef83ef36296f1

The nodes are not deployed.

Docker and docker-compose version

Docker version 23.0.1, build a5ee5b1
Docker Compose version v2.16.0
Portainer 2.17.1

Hardware specification

VPS on Ubuntu 22.04

Steps To reproduce the bug

  1. just run the one-click-install

Expected behaviour

private tangle with 4 containers should be deployed

Actual behaviour

The nodes are not deployed in Docker. When I edit docker-compose.yaml and correct the first error with:

networks:
  tangle:
    name: private-tangle
    external: true

it seems to go wrong in bootstrapCoordinator (), as if the coo.bootstrap.container is not deployed or something similar. It generates an error, and the other nodes (node, spammer, etc.) are not deployed.
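Because the compose snippet above marks the network as external: true, Docker Compose expects that network to exist already and will not create it. A minimal sketch of a pre-check follows; the function name ensure_network is made up for illustration, and the report does not say whether private-tangle.sh already does something equivalent.

```shell
#!/bin/sh
# Hypothetical helper (not part of private-tangle.sh):
# create the external Docker network only if it does not exist yet.
ensure_network() {
  net_name="$1"
  if docker network inspect "$net_name" >/dev/null 2>&1; then
    echo "network $net_name already exists"
  else
    docker network create "$net_name" >/dev/null
    echo "network $net_name created"
  fi
}

# Usage, with the network name from the compose snippet:
#   ensure_network private-tangle
```

Running this before the install step should at least rule out a missing network as the reason the later containers never start.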

Errors

Waiting coordinator bootstrap to stop gracefully...
Error response from daemon: No such container: 2385239fcb05553225d6e5238fb94b540dc23b88ea1adabeac7ef83ef36296f1

IoTAdri commented 1 year ago

addendum:

Since I could not solve the above problem, I used a workaround and bootstrapped the coo outside the bash file to get it running. Now, 3 out of 5 times when I run the stop procedure (./private-tangle.sh stop), the coo does not stop, and when I force it after 5 minutes it deletes the coordinator.state. With no coordinator.state the start procedure does not work and the coo exits.

I tried to use coo-fix-state:

docker-compose run --rm coo tool coo-fix-state --databasePath /app/db --stateFilePath /app/coo-state

(with the paths mentioned in the container) but it crashes with: [screenshot]

this is not a stable situation... HELP!

jmcanterafonseca-iota commented 1 year ago

@IoTAdri please could you execute ./private-tangle.sh install from scratch (in the hornet-private-net folder) and attach here all the output you get in the console?

IoTAdri commented 1 year ago

Hi Jorge, thank you for responding to me. My first errors (related to Docker) are fixed (I used a version from 10/3 which you already corrected), but my second error still exists: the coo does not stop and corrupts the coordinator.state.

Here is the screenshot from doing the whole install as per your request: [screenshot: ptanglestandardokay]

Everything went smoothly, up and running very fast. But after a few hours, and after starting and stopping a few times (as will happen over a number of months during deployment), the coo keeps running when you ask it to stop. It will not exit gracefully (I can send you the logs also). It keeps trying to exit for a few minutes until the timeout occurs, and when the timer reaches zero it is killed, deleting the coordinator.state in the process. [screenshot: privatetangleError]

Here is a shot from the logs: [screenshot 2023-03-27 223550]

When you then try to start the network again, the coo fails. I tried to use coo-fix-state but get the error mentioned above (see addendum).

jmcanterafonseca-iota commented 1 year ago

can you try to edit your config-coo.json (located under folder config) and under the entry described below, change the stateFilePath field to point to an absolute path?

"coordinator": {
        "stateFilePath": "<full_path_to your_folder>/coo-state/coordinator.state",

@IoTAdri

IoTAdri commented 1 year ago

@jmcanterafonseca-iota okay, I will change the path, but first I have to reinstall because the current error is unrecoverable (I cannot use coo-fix-state)

but.. is that not the path "inside" the docker container?

[screenshot] And when I first remove \db (just to be sure) and do an install, I get: [screenshot]

jmcanterafonseca-iota commented 1 year ago

you are right, then it should be /app/coo-state/coordinator.state
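To double-check which value the config currently holds, something like the following could be run against config/config-coo.json. It is only a sketch: the function names are invented here, and the sed pattern assumes the "stateFilePath" key and its value sit on the same line, as in the fragment quoted above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of one-click-tangle).

# Print the value of "stateFilePath" from a coordinator config file.
state_file_path() {
  sed -n 's/.*"stateFilePath"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Report whether the configured path is absolute (starts with "/").
check_state_file_path() {
  path="$(state_file_path "$1")"
  case "$path" in
    /*) echo "absolute: $path" ;;
    "") echo "stateFilePath not found" ;;
    *)  echo "relative: $path" ;;
  esac
}

# Usage:
#   check_state_file_path config/config-coo.json
```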

jmcanterafonseca-iota commented 1 year ago

@IoTAdri was it fixed?

IoTAdri commented 1 year ago

This seems to work a bit more stably, but after stopping and starting the private tangle a number of times it gets into trouble again: [screenshot] and I do not know how to recover from this...

IoTAdri commented 1 year ago

@jmcanterafonseca-iota after a few minutes the "graceful exit" times out and the coo container is closed, but the coordinator.state is gone... [screenshot] We used to be able to correct this with coo-fix-state, but I cannot get this to work (see above in "addendum")

[I tried renaming the coordinator.state_old to coordinator.state and restart the private tangle but... nope, no luck!]

jmcanterafonseca-iota commented 1 year ago

can you try stopping the Coo manually i.e. through

docker kill --signal="SIGTERM" coo

@IoTAdri

IoTAdri commented 1 year ago

@jmcanterafonseca-iota I probably can but..

The problem is not that the coo keeps on running and cannot be stopped but that when it is killed (or timed out by itself in 5 minutes) it deletes the coordinator.state and I cannot reconstruct it because coo-fix-state does not work...
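One way to at least detect the loss, and keep a copy of the file from before the stop, is sketched below. The function name, the .bak suffix and the 600-second default are assumptions for illustration, and the host path of coordinator.state depends on how the coo-state folder is mounted. Note that renaming coordinator.state_old back did not help earlier in this thread, so a restored copy may not be enough on its own.

```shell
#!/bin/sh
# Hypothetical helper (not part of private-tangle.sh).
# $1: container name, $2: host path of coordinator.state,
# $3: seconds to wait for a graceful exit (default 600, an assumed value)
safe_stop_coo() {
  container="$1"; state_file="$2"; timeout="${3:-600}"
  if [ ! -f "$state_file" ]; then
    echo "state file missing before stop: $state_file" >&2
    return 1
  fi
  # Keep a copy of the state as it was before the stop attempt.
  cp -p "$state_file" "$state_file.bak"
  # docker stop sends SIGTERM and only kills after the timeout expires.
  docker stop -t "$timeout" "$container" >/dev/null
  if [ ! -f "$state_file" ]; then
    echo "state file disappeared during stop; copy kept at $state_file.bak" >&2
    return 2
  fi
  echo "stopped with state file intact"
}

# Usage (the host path is an example only):
#   safe_stop_coo coo ./coo-state/coordinator.state 600
```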

jmcanterafonseca-iota commented 1 year ago

maybe @muXxer can comment on the above

Nilesh0711 commented 1 year ago

WARN[0000] network tangle: network.external.name is deprecated in favor of network.name
Waiting for 10 seconds ... ⏳
2023-05-31T11:54:34Z INFO Coordinator milestone issued (1): ba93f3575086e127994707bb0030e7c12213c9aed1414f9614e78bbbed1ea199
Coordinator bootstrapped!
d7203e741b014b46400b3b565780cdc9c2423766920cf59c018efcc557918c44
Waiting coordinator bootstrap to stop gracefully...
Error: No such container: d7203e741b014b46400b3b565780cdc9c2423766920cf59c018efcc557918c44

any better solution?
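One possible reading of the log above is that the bootstrap container has already exited and been removed by the time the script tries to stop it; nothing in this thread confirms that. Under that assumption, a stop step that treats "already gone" as success might look like this sketch (stop_if_present is a made-up name, not a function in private-tangle.sh):

```shell
#!/bin/sh
# Hypothetical helper: stop a container by ID or name,
# treating a container that no longer exists as already stopped.
stop_if_present() {
  cid="$1"
  if docker container inspect "$cid" >/dev/null 2>&1; then
    docker stop "$cid" >/dev/null
    echo "stopped $cid"
  else
    echo "container $cid already removed, nothing to stop"
  fi
}

# Usage, with the ID printed after "Coordinator bootstrapped!":
#   stop_if_present d7203e741b014b46400b3b565780cdc9c2423766920cf59c018efcc557918c44
```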