tarantool / tt

Command-line utility to manage Tarantool applications

tests: fix flaky test_running_base_functionality #992

Closed themilchenko closed 3 weeks ago

themilchenko commented 1 month ago

The integration test test_running_cluster::test_running_base_functionality sometimes fails on an assertion that finds the same PID after restart.

This patch adds a wait for the instances to stop, to make sure the cluster is down before it is started again and the PIDs are checked.

Closes #972

themilchenko commented 1 month ago

I think it is possible that some instance has already restarted, and then we would wait for the new PID to stop.

It looks like we need to collect the active PIDs before restart and wait until they have stopped.

Added a check for it.

oleg-jukovec commented 1 month ago

I think it should look like:

  1. Collect the PIDs of all instances.
  2. Call `tt restart`.
  3. Wait for the collected PIDs to exit.
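
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `pid_is_alive` and `wait_pids_exit` are hypothetical helper names, the timeout values are arbitrary, and the existence check relies on POSIX `kill` semantics (signal 0), so it is not suitable for Windows.

```python
import os
import time


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks the target
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_pids_exit(pids, timeout=10.0, poll_interval=0.1):
    """Poll until every PID in `pids` is gone; return False on timeout."""
    deadline = time.monotonic() + timeout
    remaining = set(pids)
    while True:
        remaining = {pid for pid in remaining if pid_is_alive(pid)}
        if not remaining:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Intended order inside the test (hypothetical, shown as comments only):
#   1. old_pids = set(pid_by_instance_name.values())  # collected after start
#   2. run `tt restart`
#   3. assert wait_pids_exit(old_pids)
```

Because the PIDs are captured before `tt restart` is called, the wait cannot accidentally latch onto a freshly started instance. One caveat: a terminated child that has not been reaped yet (a zombie) still counts as existing for `os.kill(pid, 0)`.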

themilchenko commented 1 month ago

Now we read the PID file after restart (if it exists, because it may already have been deleted), check whether the PID is the same as the one collected after the first start, and if so, wait for the instance to stop completely. What am I missing?
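
For illustration, the check described here might look something like the sketch below. The helper names `read_pid_file` and `old_instance_still_running` are hypothetical, and the sketch assumes the PID file contains a single decimal integer; it is not the code from the patch.

```python
from pathlib import Path
from typing import Optional


def read_pid_file(path) -> Optional[int]:
    """Return the PID stored in `path`, or None if it is missing or unreadable."""
    try:
        text = Path(path).read_text().strip()
    except FileNotFoundError:
        return None  # the file may already have been deleted during restart
    try:
        return int(text)
    except ValueError:
        return None  # empty or partially written file


def old_instance_still_running(pid_file, old_pid) -> bool:
    """True if the PID file still points at the PID collected after the first start."""
    return read_pid_file(pid_file) == old_pid
```

Only when `old_instance_still_running` returns True does the test need to wait for the old process to stop; a missing file or a different PID means the old instance has already gone away.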

oleg-jukovec commented 1 month ago

Oh, it's my fault. I missed that we wait for the PIDs from pidByInstanceName, which was fetched after start and before restart. In that case we could just wait for the PIDs from pidByInstanceName without additional checks.

themilchenko commented 1 month ago

Done.