Seagate / halon

High availability solution
Apache License 2.0

st: Wait for m0t1fs to mount only on m0t1fs client(s) #1511

Closed 1468ca0b-2a64-4fb4-8e52-ea5806644b4c closed 5 years ago

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 5 years ago

Created by: vvv

@chumakd Could we be hitting Jenkins' timeout limit for h0 run-st? The recent addition of a new ST must have increased the run-st execution time. The log below shows `hctl mero stop` failing after 5:41 elapsed:

<< t_stop-reverts-sdev-states >>
[...]
+ [24] hctl mero stop
Stopping cluster.
Cluster stop initiated.
Process{0x7200000000000001:0x26}: PSStopping -> PSOffline
Service{0x7300000000000001:0x27}: SSStopping -> SSOffline
Progress: 3.85% -> 11.54%
Process{0x7200000000000001:0x1e}: PSStopping -> PSOffline
Service{0x7300000000000001:0x22}: SSStopping -> SSOffline
Service{0x7300000000000001:0x1f}: SSStopping -> SSOffline
Service{0x7300000000000001:0x25}: SSStopping -> SSOffline
Service{0x7300000000000001:0x23}: SSStopping -> SSOffline
Service{0x7300000000000001:0x21}: SSStopping -> SSOffline
Service{0x7300000000000001:0x20}: SSStopping -> SSOffline
Service{0x7300000000000001:0x24}: SSStopping -> SSOffline
Progress: 11.54% -> 42.31%
Process{0x7200000000000001:0x1b}: PSStopping -> PSOffline
Service{0x7300000000000001:0x1c}: SSStopping -> SSOffline
Service{0x7300000000000001:0x1d}: SSStopping -> SSOffline
Progress: 42.31% -> 53.85%
Cluster stop failed: StopProcessesOnNodeFailed (Node nid://192.168.223.202:9070:0) "halon:m0d service stop timed out"
12.37user 31.32system 5:41.16elapsed 12%CPU (0avgtext+0avgdata 60392maxresident)k

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 5 years ago

Created by: vvv

Jenkins CI is unhappy about the addition of the .local suffix; the hostname no longer resolves, so every retry of the wait loop fails the same way:

+ [cluster_bootstrap:44] _wait_for_m0t1fs mero_single_02_test-domain.local
+ [_wait_for_m0t1fs:108] [[ 1 > 0 ]]
+ [_wait_for_m0t1fs:109] local steps_left=30
+ [_wait_for_m0t1fs:110] set +x
Waiting for m0t1fs to mount...mero_single_02_test-domain: ssh: Could not resolve hostname mero_single_02_test-domain.local: Name or service not known

pdsh@mero_single_02_test-domain: mero_single_02_test-domain: ssh exited with exit code 255
.mero_single_02_test-domain: ssh: Could not resolve hostname mero_single_02_test-domain.local: Name or service not known
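
In the trace above, the wait loop keeps retrying (steps_left=30) even though the failure is a name-resolution error that no amount of waiting will fix. A minimal sketch of one way to fail fast in that case follows. It is not the actual `_wait_for_m0t1fs` from the ST scripts: the function names `host_resolves`, `wait_until` and `wait_for_mount` are hypothetical, plain `ssh` stands in for the `pdsh` call seen in the log, and checking `/proc/mounts` for `m0t1fs` is an assumption about how the mount would be detected.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names and the mount check are assumptions,
# not taken from the Halon ST scripts.

# host_resolves HOST: succeed if HOST resolves via the system resolver.
host_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# wait_until STEPS DELAY CMD...: run CMD until it succeeds,
# at most STEPS times, sleeping DELAY seconds after each failure.
wait_until() {
    local steps_left=$1 delay=$2
    shift 2
    while [ "$steps_left" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        steps_left=$((steps_left - 1))
        sleep "$delay"
    done
    return 1
}

# wait_for_mount HOST: return 2 immediately if HOST cannot be resolved,
# otherwise poll HOST for an m0t1fs entry in /proc/mounts (assumed check).
wait_for_mount() {
    local host=$1
    if ! host_resolves "$host"; then
        echo "Cannot resolve $host; not retrying" >&2
        return 2
    fi
    wait_until 30 1 ssh "$host" 'grep -q m0t1fs /proc/mounts'
}
```

With this shape, a call such as `wait_for_mount mero_single_02_test-domain.local` would report the unresolved name once and return, instead of printing the same ssh error on each of the 30 steps. Separating the distinct exit code (2) from an ordinary timeout (1) also lets the caller tell a configuration problem from a slow mount.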