Mythra closed this issue 7 years ago
Hi @SecurityInsanity,
You can definitely have any number of coordinators, not just two (we've routinely run Firmament with up to 40).
The pattern in which you start them up is also correct (modulo the command line parameter being called --parent_uri, IIRC). Do your machines have different hostnames? Some places in the system use hostnames, so if two machines have the same default hostname, incorrect information might be displayed. We try to use UUIDs instead of hostnames, but it's possible that something has slipped through the net.
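To make the hostname point concrete, here is a minimal Python sketch of how a display keyed by hostname could end up showing two entries for three coordinators. The hostnames ("parent-host", "worker") and the helper name are made up for illustration, and this does not reflect Firmament's actual internals; it only shows the general effect of duplicate hostnames.

```python
from collections import defaultdict


def find_hostname_collisions(hostnames_by_uri):
    """Return hostnames that are reported by more than one coordinator URI."""
    uris_by_hostname = defaultdict(list)
    for uri, hostname in hostnames_by_uri.items():
        uris_by_hostname[hostname].append(uri)
    # Keep only hostnames shared by two or more coordinators.
    return {
        hostname: sorted(uris)
        for hostname, uris in uris_by_hostname.items()
        if len(uris) > 1
    }


# Hypothetical hostnames: the two child machines share the same default name.
reported = {
    "tcp:10.36.75.73:8000": "parent-host",
    "tcp:10.36.65.78:8000": "worker",
    "tcp:10.36.71.204:8000": "worker",
}

collisions = find_hostname_collisions(reported)
distinct_hostnames = len(set(reported.values()))

print(collisions)
print(distinct_hostnames)
```

Running a check like this over the hostnames your machines actually report is a quick way to see whether any two coordinators would be indistinguishable by name.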
To debug this, it would be helpful to have the log output from each coordinator, and in particular the output from the one on 10.36.75.73 (which should see the child coordinators register in its output). You can turn on more verbose output by passing --v=1 to the coordinator binaries.
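As a rough sketch of the setup being discussed, the Python snippet below assembles the flags for one parent and two child coordinators with verbose logging turned on. It uses only the two flags named in this thread (--parent_uri, spelled as recalled above, and --v=1). The helper name is hypothetical, and the binary path and the way each coordinator's own listen address is set are deliberately left out, since the thread does not show them.

```python
def coordinator_flags(parent_uri=None, verbose=False):
    """Build the flag list for one coordinator (hypothetical helper)."""
    flags = []
    if parent_uri is not None:
        # Child coordinators point at the parent; the parent gets no such flag.
        flags += ["--parent_uri", parent_uri]
    if verbose:
        flags.append("--v=1")
    return flags


ROOT = "tcp:10.36.75.73:8000"
CHILDREN = ["tcp:10.36.65.78:8000", "tcp:10.36.71.204:8000"]

# The parent has no parent URI; both children register with the parent.
plan = {ROOT: coordinator_flags(verbose=True)}
for child in CHILDREN:
    plan[child] = coordinator_flags(parent_uri=ROOT, verbose=True)

for uri, flags in plan.items():
    print(uri, " ".join(flags))
```

With verbose output enabled on all three, the parent's log is the one to check for the child registrations.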
Hey @ms705,
Thanks for following up so quickly!
Hmmm, I do believe these two:
tcp:10.36.65.78:8000 (--parent-uri tcp:10.36.75.73:8000)
tcp:10.36.71.204:8000 (--parent-uri tcp:10.36.75.73:8000)
both have the exact same hostname. I'll go change those, and if it still seems fishy I'll send you some verbose logging.
Yep, setting unique hostnames seemed to work, @ms705. Thanks much, and great work :wave:
I notice in the documentation you've written:
Yet you've also written:
It seems to imply here that we can create multiple coordinators (such as 3), yet also seems to imply we can only have two coordinators running.
I started three schedulers in the pattern:
tcp:10.36.75.73:8000 (no parent uri)
tcp:10.36.65.78:8000 (--parent-uri tcp:10.36.75.73:8000)
tcp:10.36.71.204:8000 (--parent-uri tcp:10.36.75.73:8000)
Yet the topology map shows only two hosts. I'm assuming this is due to the project being "alpha stage" as mentioned, which is totally understandable. Just want to make sure I'm not going crazy.