Bug Description
I am new to Bacalhau. I tried to set up one Requester and one Compute Node by following the Create Network guide, but the Compute Node reports the error shown in the Compute Node logs below.
Expected Behavior
Requester(s) and Compute Node(s) can communicate.
Steps to Reproduce
1. I have two Ubuntu Linux servers on the same network.
2. Install Bacalhau (1.5.1) on both.
3. Set up the token.
4. Run the Requester on one server and the Compute Node on the other.
The Requester runs without problems, but the Compute Node reports the error message.
For additional information: when I run a single node as both Requester and Compute on the same machine, it works and I can submit a job and have it run.
So it may be a communication issue between the two nodes.
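To test the communication hypothesis, a quick TCP reachability check from the Compute Node machine may help. This is only a sketch: `check_port` is a hypothetical helper, not part of Bacalhau. The address 192.168.1.58 is the orchestrator passed to the Compute Node in the logs, and port 4222 is taken from the `orchestrator_address` in the Requester startup log. It assumes the Compute Node connects to that port.

```shell
# Hypothetical helper: returns 0 if a TCP connection to host ($1) on port ($2)
# can be opened within 3 seconds, non-zero otherwise.
check_port() {
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Assumed values, taken from the logs in this report; override via environment.
ORCH_HOST="${ORCH_HOST:-192.168.1.58}"
ORCH_PORT="${ORCH_PORT:-4222}"

if check_port "$ORCH_HOST" "$ORCH_PORT"; then
  echo "orchestrator port reachable"
else
  echo "orchestrator port NOT reachable"
fi
```

If the port is not reachable, a firewall or routing problem between the two servers is a likely candidate. If it is reachable, the cause probably lies elsewhere, for example in the token or node configuration.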
Bacalhau Versions
Agent Version: Run bacalhau agent version to get this.
Bacalhau v1.5.1
BuildDate 2024-10-28 06:10:18 +0000 UTC
GitCommit 2c4963f6c16e68834fe4ec4a8f68f5e8f2417ae1
CLI Client Version: Run bacalhau version for the client info.
CLIENT  SERVER  LATEST  UPDATE MESSAGE
v1.5.1  v1.5.1  1.5.1
Host Environment
Provide details about the environment where the bug occurred:
Operating System:
Ubuntu 18.04
Ubuntu 24.04
CPU Architecture:
Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
Intel(R) Xeon(R) w5-3425
Any other relevant environment details:
Job Specification
(If applicable, provide the job spec used when the issue occurred.)
Logs
Requester Logs:
bacalhau serve --orchestrator
00:45:53.786 | INF cmd/cli/serve/serve.go:103 > Config loaded from: [/home/easystore/.bacalhau/config.yaml], and with data-dir /home/easystore/.bacalhau
00:45:53.787 | INF cmd/cli/serve/serve.go:181 > Starting bacalhau...
00:45:54.835 | INF cmd/cli/serve/serve.go:256 > bacalhau node running [address:0.0.0.0:1234] [compute_enabled:false] [name:n-adc8ea97-fcbc-4efb-9ccc-f23040349a7d] [orchestrator_address:0.0.0.0:4222] [orchestrator_enabled:true] [webui_enabled:false]
To connect to this node from the local client, run the following commands in your shell:
export BACALHAU_API_HOST=127.0.0.1
export BACALHAU_API_PORT=1234
A copy of these variables have been written to: /home/easystore/.bacalhau/bacalhau.run
00:54:02.637 | WRN pkg/orchestrator/planner/logging_planner.go:48 > Job failed [Details:{"IsError":"true","NodesAvailable":"0","NodesRequested":"1","NodesSuitable":"0"}] [EvalID:ba18a0d8-cc76-4d71-8ebe-c885652934b5] [Event:"not enough nodes to run job. requested: 1, available: 0, suitable: 0."] [JobID:j-fdaebcd3-2a71-40e1-b9e9-52b601e56ae0] [NodeID:n-adc8ea97]
Compute Node Logs:
bacalhau serve --compute --config Compute.Orchestrators=192.168.1.58
11:03:55.117 | INF cmd/cli/serve/serve.go:103 > Config loaded from: [/home/easystore/.bacalhau/config.yaml], and with data-dir /home/easystore/.bacalhau
11:03:55.117 | INF cmd/cli/serve/serve.go:181 > Starting bacalhau...
11:03:56.157 | INF cmd/cli/serve/serve.go:256 > bacalhau node running [address:0.0.0.0:1234] [capacity:"{CPU: 16.80, Memory: 94 GB, Disk: 671 GB, GPU: 0}"] [compute_enabled:true] [engines:["docker","wasm"]] [name:n-389eb261-e61b-47cf-91f1-a621e198cd25] [orchestrator_enabled:false] [orchestrators:["192.168.1.58"]] [publishers:["local","noop"]] [storages:["urldownload","inline"]] [webui_enabled:false]
To connect to this node from the local client, run the following commands in your shell:
export BACALHAU_API_HOST=127.0.0.1
export BACALHAU_API_PORT=1234
A copy of these variables have been written to: /home/easystore/.bacalhau/bacalhau.run
11:04:56.146 | ERR pkg/compute/management_client.go:117 > failed to send update info to requester node error="failed to get nodestate during node registration: nodeInfo not found for nodeID: n-389eb261-e61b-47cf-91f1-a621e198cd25" [NodeID:n-389eb261]