Lilypad-Tech / lilypad

Run AI workloads easily in a decentralized GPU network. https://www.youtube.com/watch?v=yQnB2Yxia4Y
https://lilypad.tech
Apache License 2.0

NodeID Changes When Container is Recreated (Same EVM) #339

Open Rodebrechtd opened 2 months ago

Rodebrechtd commented 2 months ago

Describe the bug

The NodeID is not stable across container recreation. When the container is removed and recreated with the same parameters and the same EVM address, the node comes up with a different NodeID. Ideally, the NodeID that was initially assigned should be verified via websocket, and the same ID should be reassigned when the container is recreated.
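To make the expected behavior concrete, here is a minimal sketch of the idea, not a description of how Lilypad assigns NodeIDs today. It assumes a hypothetical registry (`NodeRegistry` and `register` are names invented for this illustration) that remembers the NodeID first issued to an EVM address and returns it again on every later registration. The websocket verification step is left out.

```python
import uuid


class NodeRegistry:
    """Hypothetical registry: remembers the NodeID first issued per EVM address."""

    def __init__(self):
        # Maps a lowercased EVM address to the NodeID assigned on first registration.
        self._ids = {}

    def register(self, evm_address: str) -> str:
        # Hex addresses compare case-insensitively, so normalise the key.
        key = evm_address.lower()
        if key not in self._ids:
            # First time this address is seen: issue a fresh ID in the "n-<uuid>" shape
            # that appears in the logs of this report.
            self._ids[key] = "n-" + str(uuid.uuid4())
        # Known address: hand back the ID that was assigned originally.
        return self._ids[key]


registry = NodeRegistry()
first = registry.register("0xAbC0000000000000000000000000000000000001")
# Simulates recreating the container with the same EVM address.
second = registry.register("0xabc0000000000000000000000000000000000001")
other = registry.register("0xabc0000000000000000000000000000000000002")
print(first == second)
```

With a lookup like this, recreating the container would not change the NodeID as long as the EVM address stays the same, which is the behavior requested here.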

Reproduction

  1. Run the container: docker run -d --name lilypad-resource-provider --gpus all -p 1234:1234 -e WEB3_PRIVATE_KEY=CHANGE_ME --restart always ghcr.io/lilypad-tech/resource-provider:latest
  2. Check the NodeID: docker logs lilypad-resource-provider | grep NodeID
  3. Remove the container: docker rm -f lilypad-resource-provider
  4. Recreate the container with the same command and parameters
  5. Check the NodeID again; it differs from the one seen in step 2

Logs

17:54:30.306 | INF pkg/nats/logger.go:47 > Server is ready [Server:n-62244a7a-fd16-445c-9403-9dd3b51ac9a7]
17:54:30.33 | INF pkg/nats/server.go:48 > NATS server NB63HDGKKDN5EOJBDPZVSRARR5VVOELFBYPUWWZ3RPSBOWVGKIA4PHEJ listening on nats://0.0.0.0:4222 [NodeID:n-62244a7a]
17:54:30.342 | INF pkg/node/heartbeat/server.go:78 > Heartbeat server started [NodeID:n-62244a7a]
17:54:30.342 | INF pkg/node/manager/node_manager.go:65 > Node manager started [NodeID:n-62244a7a]
17:54:30.342 | INF pkg/node/requester.go:86 > Nodes joining the cluster will be assigned approval state: APPROVED [NodeID:n-62244a7a]
17:54:30.343 | ERR pkg/publisher/local/publisher.go:103 > failed to resolve network address by type, using 127.0.0.1 [NodeID:n-62244a7a]

Here I recreate the container with the same parameters and the same EVM address:

17:57:18.723 | INF pkg/nats/logger.go:47 > Server is ready [Server:n-e398717b-93b2-4df3-9c15-4427c86ab852]
17:57:18.746 | INF pkg/nats/server.go:48 > NATS server NDX5SPA5YTGVYN33B7Z4TLJO74MCS6DMDKON7UDVE6DQA2E6JEUNQC7N listening on nats://0.0.0.0:4222 [NodeID:n-e398717b]
17:57:18.757 | INF pkg/node/heartbeat/server.go:78 > Heartbeat server started [NodeID:n-e398717b]
17:57:18.757 | INF pkg/node/manager/node_manager.go:65 > Node manager started [NodeID:n-e398717b]
17:57:18.757 | INF pkg/node/requester.go:86 > Nodes joining the cluster will be assigned approval state: APPROVED [NodeID:n-e398717b]
17:57:18.758 | ERR pkg/publisher/local/publisher.go:103 > failed to resolve network address by type, using 127.0.0.1 [NodeID:n-e398717b]
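The change can also be confirmed mechanically instead of by eye. The snippet below is a small sketch that pulls the bracketed `NodeID` field out of log text and compares two runs; the helper name `node_ids` and the regular expression are my own, and they assume the log format shown above. The two sample lines are copied from the logs in this report.

```python
import re

# Matches the trailing "[NodeID:n-xxxxxxxx]" field as it appears in the logs above.
NODE_ID_RE = re.compile(r"\[NodeID:(n-[0-9a-f]+)\]")


def node_ids(log_text: str) -> set:
    """Return the set of distinct NodeIDs found in the given log text."""
    return set(NODE_ID_RE.findall(log_text))


# One line from each run, taken verbatim from the logs in this report.
run1 = "17:54:30.342 | INF pkg/node/heartbeat/server.go:78 > Heartbeat server started [NodeID:n-62244a7a]"
run2 = "17:57:18.757 | INF pkg/node/heartbeat/server.go:78 > Heartbeat server started [NodeID:n-e398717b]"

# True when the two runs did not report the same NodeID.
changed = node_ids(run1) != node_ids(run2)
print(changed)
```

In practice the input would be the saved output of `docker logs lilypad-resource-provider` captured before and after recreating the container.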

Screenshots

No response

System Info

Ubuntu 24

Severity

Annoyance

Rodebrechtd commented 1 month ago

This could be ranked higher in severity.