Apache Accumulo
https://accumulo.apache.org
Apache License 2.0

Add an integration test that ensures a server process does not start against a newer version of Accumulo. #4268

Open keith-turner opened 8 months ago

keith-turner commented 8 months ago

If someone with an Accumulo 2.1 cluster does the following it would be good if the last step failed.

  1. Upgrades Accumulo from 2.1 to 3.1
  2. Accidentally starts a 2.1 accumulo process after the upgrade.

It seems like this code should cause the server process started in step 2 above to fail. Having an integration test for this would be useful given how disruptive older code running against new data and metadata could be. The following is an outline of a possible integration test.

  1. Create a 2.1 mini accumulo cluster
  2. Stop all accumulo processes
  3. In the test code, modify the data version in hdfs to add one to whatever is there
  4. Attempt to start each kind of Accumulo server process and verify that it exits with an error code.
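
The outline above can be sketched without a real cluster. This is only an illustration, not the actual integration test: a local temporary directory stands in for the HDFS instance directory, the data version is assumed to be stored as the name of a single file under `version/`, and `DataVersionBumpSketch`, `SERVER_KINDS`, `bumpDataVersion`, and `simulateStart` are hypothetical names. The version number 10 is made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical stand-in for the proposed test. Nothing here calls Accumulo;
// a local directory plays the role of the instance directory in HDFS.
class DataVersionBumpSketch {

  // Assumed list of server kinds; the real list depends on the release.
  static final List<String> SERVER_KINDS = List.of("manager", "tserver", "gc");

  // Assumption: the persisted data version is the name of a file in versionDir.
  static int readDataVersion(Path versionDir) throws IOException {
    try (Stream<Path> files = Files.list(versionDir)) {
      return files.map(p -> p.getFileName().toString())
          .mapToInt(Integer::parseInt)
          .max()
          .orElseThrow(() -> new IllegalStateException("no data version found"));
    }
  }

  // Step 3 of the outline: replace the stored version N with N + 1.
  static int bumpDataVersion(Path versionDir) throws IOException {
    int current = readDataVersion(versionDir);
    Files.delete(versionDir.resolve(Integer.toString(current)));
    Files.createFile(versionDir.resolve(Integer.toString(current + 1)));
    return current + 1;
  }

  // Step 4, simulated: returns the exit code a server of the given kind would
  // produce. Zero means it started; non-zero means it refused to start.
  static int simulateStart(String kind, Path versionDir, int codeVersion) throws IOException {
    int persisted = readDataVersion(versionDir);
    if (persisted > codeVersion) {
      System.err.println(kind + ": data version " + persisted
          + " is newer than supported version " + codeVersion);
      return 1;
    }
    return 0;
  }

  public static void main(String[] args) throws IOException {
    int codeVersion = 10; // made-up data version for the "old" code
    Path versionDir = Files.createTempDirectory("instance").resolve("version");
    Files.createDirectories(versionDir);
    Files.createFile(versionDir.resolve(Integer.toString(codeVersion)));

    bumpDataVersion(versionDir);
    for (String kind : SERVER_KINDS) {
      if (simulateStart(kind, versionDir, codeVersion) == 0) {
        throw new AssertionError(kind + " started against newer data");
      }
    }
  }
}
```

A real test would replace `simulateStart` with launching each process from a mini cluster and checking the process exit value.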

EdColeman commented 8 months ago

Would it be sufficient to have the manager check the expected version? If the manager does not start, then wouldn't everything kind of wait for assignments and never do anything?

In particular, the GC? I think it would hold (or error) until it could scan the metadata table, which would remain unavailable until the manager is online?

Elasticity may need additional checks though.

keith-turner commented 8 months ago

Would it be sufficient to have the manager check the expected version? If the manager does not start, then wouldn't everything kind of wait for assignments and never do anything?

I think it's best if all servers do this check, and I think they currently do. There could be a situation where the 3.1 manager is up and running, 3.1 tservers are up and hosting metadata tables, and a 2.1 tserver is started. I do not want that 2.1 tserver to attempt to participate with the running 3.1 servers.
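
As a rough illustration of why a per-server check covers this scenario, the sketch below models a guard that every server process would run at startup, independent of whether a manager is online. The class `StartupVersionGuard`, the `Decision` values, and the version numbers 10 and 11 are all hypothetical and are not Accumulo's actual code or data versions.

```java
import java.util.Set;

// Hypothetical startup guard. Each server process would run this on its own
// before doing any work, so an old tserver cannot join a newer cluster.
class StartupVersionGuard {

  enum Decision { MATCH, OLDER_UPGRADABLE, INCOMPATIBLE }

  // persisted: data version found in storage
  // current: data version this code writes
  // upgradableFrom: older data versions this code knows how to upgrade
  static Decision check(int persisted, int current, Set<Integer> upgradableFrom) {
    if (persisted == current) {
      return Decision.MATCH;
    }
    if (upgradableFrom.contains(persisted)) {
      return Decision.OLDER_UPGRADABLE;
    }
    // Includes the case from this issue: stored data is newer than the code.
    return Decision.INCOMPATIBLE;
  }

  public static void main(String[] args) {
    int oldCode = 10; // made-up data version for the older release
    int newCode = 11; // made-up data version for the newer release

    // Newer server against old data: an upgrade path exists.
    System.out.println(check(oldCode, newCode, Set.of(oldCode)));
    // Older server against already-upgraded data: must refuse to start.
    System.out.println(check(newCode, oldCode, Set.of()));
  }
}
```

In this model the stray older tserver gets `INCOMPATIBLE` on its own, even though a newer manager and newer tservers are already running.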

ArbaazKhan1 commented 6 months ago

I can try taking a look at this