Open brownleej opened 3 years ago
What's your preferred way to get the signal out? Some text, or an error code from the command line?
I already added a special key range (`\xff\xff/management/in_progress_exclusion/`, `\xff\xff/management/in_progress_exclusion0`) that reports which processes are still in the process of being excluded, meaning their data replication has not finished.
It would be trivial to add a new fdbcli command for this, such as `excludeInProgress`, to print any processes that have not finished yet. Would that be helpful here?
That special key-range will be available in 7.0 (I only see it in the `release-7.0` branch and not in `release-6.3`)? Would it make more sense to read it directly from the database instead of adding a new `fdbcli` command?
Yeah, it's only available in 7.0, and yes, we can read it directly. Adding a command would just make it easier to use (and could print more help text) for anyone who cannot remember the key range.
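As a rough illustration of reading the range directly, here is a minimal Python sketch. It assumes each key under the range is the prefix followed by the process address, which is not confirmed in this thread. The helper names `in_progress_exclusions` and `read_from_cluster` are made up for the example, and the client would need to be 7.0 or later.

```python
# Bounds quoted earlier in this thread. "0" is the byte right after "/",
# so the range covers every key that starts with the prefix.
BEGIN = b"\xff\xff/management/in_progress_exclusion/"
END = b"\xff\xff/management/in_progress_exclusion0"


def in_progress_exclusions(tr):
    """Return the addresses whose exclusion has not finished yet.

    `tr` is anything with a get_range(begin, end) method that yields
    items with a `.key` attribute, such as an fdb transaction.
    Assumption: each key is BEGIN followed by the process address.
    """
    return [kv.key[len(BEGIN):].decode() for kv in tr.get_range(BEGIN, END)]


def read_from_cluster(cluster_file=None):
    """Hypothetical wrapper: run the read against a live cluster."""
    import fdb  # imported lazily so the helper above has no dependencies

    fdb.api_version(700)
    db = fdb.open(cluster_file)
    return in_progress_exclusions(db.create_transaction())
```

An empty list from `in_progress_exclusions` would then mean that no exclusion is still moving data, under the key-format assumption above.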
When running the `exclude no_wait` command with a process that is not reporting to the database, the CLI reports a message of the form: `WARNING: Missing from cluster! Be sure that you excluded the correct processes before removing them from the cluster!`. It reports the same message whether the address is completely unknown to the database or belongs to a process associated with data that has not been fully re-replicated. This means that we cannot use the output of `exclude no_wait` to determine whether the re-replication for that process has completed, which makes it difficult to determine whether it is safe to permanently destroy resources associated with a process that is temporarily unavailable. By comparison, the blocking form of the `exclude` command will block when a process is in this state, until the data is replicated. I think we should change this behavior to give a clearer signal for processes that are missing but have data, and to align the no-wait exclude more closely with the blocking exclude.
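To make the requested distinction concrete, the following Python sketch classifies an address from two sets of addresses: those that have been excluded and those whose exclusion is still in progress. Everything here is hypothetical, including the `ExclusionState` and `exclusion_state` names. It also assumes that a missing process that still has data would show up in the in-progress set, which this issue does not confirm.

```python
from enum import Enum


class ExclusionState(Enum):
    NOT_EXCLUDED = "not excluded"
    IN_PROGRESS = "excluded, data still being re-replicated"
    COMPLETE = "excluded, no data left to move"


def exclusion_state(address, excluded, in_progress):
    """Classify `address` given two sets of address strings.

    `excluded` holds every address that has been excluded;
    `in_progress` holds the subset whose data is not yet re-replicated.
    Both inputs are assumptions about what a caller could collect.
    """
    if address not in excluded:
        return ExclusionState.NOT_EXCLUDED
    if address in in_progress:
        return ExclusionState.IN_PROGRESS
    return ExclusionState.COMPLETE
```

Under these assumptions, only `ExclusionState.COMPLETE` would indicate that the resources behind a process can be destroyed, which is the signal the current warning does not provide.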