LINBIT / linstor-server

High Performance Software-Defined Block Storage for container, cloud and virtualisation. Fully integrated with Docker, Kubernetes, Openstack, Proxmox etc.
https://docs.linbit.com/docs/linstor-guide/
GNU General Public License v3.0

VolumeDefinition UUID mismatch. #271

Open kvaps opened 2 years ago

kvaps commented 2 years ago

Same test as in https://github.com/LINBIT/linstor-server/issues/268#issuecomment-1022694859, but now on lvmthin. Some volumes are created in the Unconnected state:

linstor r l | grep pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde
┊ pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde ┊ hf-kubevirt-01 ┊ 7099 ┊ Unused ┊ Connecting(hf-kubevirt-02)  ┊   UpToDate ┊ 2022-01-27 10:01:47 ┊
┊ pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde ┊ hf-kubevirt-02 ┊ 7099 ┊ Unused ┊ Unconnected(hf-kubevirt-01) ┊   UpToDate ┊                     ┊
10:01:43.999 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' created for node 'hf-kubevirt-01'.
10:01:43.999 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' created for node 'hf-kubevirt-02'.
10:01:48.658 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Primary Resource pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde
10:01:48.658 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Primary bool set on Resource pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde
10:01:50.384 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' updated for node 'hf-kubevirt-01'.
10:01:50.384 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' updated for node 'hf-kubevirt-02'.
10:03:11.150 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' updated for node 'hf-kubevirt-01'.
10:03:11.150 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' updated for node 'hf-kubevirt-02'.

hf-kubevirt-02 logs:

09:59:18.600 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' created for node 'hf-kubevirt-02'.
10:00:27.316 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Primary Resource pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde
10:00:27.316 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Primary bool set on Resource pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde
10:00:40.566 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' created for node 'hf-kubevirt-01'.
10:00:40.566 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde' created for node 'hf-kubevirt-02'.
10:02:10.747 [MainWorkerPool-1] ERROR LINSTOR/Satellite - SYSTEM - VolumeDefinition UUID mismatch. Received '0d2bebfd-ccb7-492d-919a-4cac1edb6c3c' from controller, but had locally '90d3e264-ce35-4324-b72d-f3e8329e3ef9'. Id from local: 'Rsc: 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde', VlmNr: '0'', Id from remote: 'Rsc: 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde', VlmNr: '0''. Dropping connection and resyncing with controller [Report number 61F261CC-C4A8E-000418]
ERROR REPORT 61F261CC-C4A8E-000418

============================================================

Application:                        LINBIT® LINSTOR
Module:                             Satellite
Version:                            1.17.0
Build ID:                           7e646d83dbbadf1ec066e1bc8b29ae018aff1f66
Build time:                         2021-12-09T07:27:52+00:00
Error time:                         2022-01-27 10:02:10
Node:                               hf-kubevirt-02

============================================================

Reported error:
===============

Category:                           RuntimeException
Class name:                         DivergentUuidsException
Class canonical name:               com.linbit.linstor.core.DivergentUuidsException
Generated at:                       Method 'checkUuid', Source file 'StltRscApiCallHandler.java', Line #959

Error message:                      VolumeDefinition UUID mismatch. Received '0d2bebfd-ccb7-492d-919a-4cac1edb6c3c' from controller, but had locally '90d3e264-ce35-4324-b72d-f3e8329e3ef9'. Id from local: 'Rsc: 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde', VlmNr: '0'', Id from remote: 'Rsc: 'pvc-3bd95628-ff06-4d9f-a15c-3c48a2c3afde', VlmNr: '0''. Dropping connection and resyncing with controller

Call backtrace:

    Method                                   Native Class:Line number
    checkUuid                                N      com.linbit.linstor.core.apicallhandler.StltRscApiCallHandler:959
    checkUuid                                N      com.linbit.linstor.core.apicallhandler.StltRscApiCallHandler:922
    applyChanges                             N      com.linbit.linstor.core.apicallhandler.StltRscApiCallHandler:278
    applyChange                              N      com.linbit.linstor.core.apicallhandler.StltApiCallHandler$ApplyRscData:1259
    applyChangedData                         N      com.linbit.linstor.core.apicallhandler.StltApiCallHandler:746
    applyResourceChanges                     N      com.linbit.linstor.core.apicallhandler.StltApiCallHandler:642
    execute                                  N      com.linbit.linstor.api.protobuf.ApplyRsc:77
    executeNonReactive                       N      com.linbit.linstor.proto.CommonMessageProcessor:525
    lambda$execute$13                        N      com.linbit.linstor.proto.CommonMessageProcessor:500
    doInScope                                N      com.linbit.linstor.core.apicallhandler.ScopeRunner:147
    lambda$fluxInScope$0                     N      com.linbit.linstor.core.apicallhandler.ScopeRunner:75
    call                                     N      reactor.core.publisher.MonoCallable:91
    trySubscribeScalarMap                    N      reactor.core.publisher.FluxFlatMap:126
    subscribeOrReturn                        N      reactor.core.publisher.MonoFlatMapMany:49
    subscribe                                N      reactor.core.publisher.Flux:8343
    onNext                                   N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:188
    request                                  N      reactor.core.publisher.Operators$ScalarSubscription:2344
    onSubscribe                              N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:134
    subscribe                                N      reactor.core.publisher.MonoCurrentContext:35
    subscribe                                N      reactor.core.publisher.InternalFluxOperator:62
    subscribe                                N      reactor.core.publisher.FluxDefer:54
    subscribe                                N      reactor.core.publisher.Flux:8357
    onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:418
    drainAsync                               N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:414
    drain                                    N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:679
    onNext                                   N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:243
    drainFused                               N      reactor.core.publisher.UnicastProcessor:286
    drain                                    N      reactor.core.publisher.UnicastProcessor:329
    onNext                                   N      reactor.core.publisher.UnicastProcessor:408
    next                                     N      reactor.core.publisher.FluxCreate$IgnoreSink:618
    next                                     N      reactor.core.publisher.FluxCreate$SerializedSink:153
    processInOrder                           N      com.linbit.linstor.netcom.TcpConnectorPeer:373
    doProcessMessage                         N      com.linbit.linstor.proto.CommonMessageProcessor:218
    lambda$processMessage$2                  N      com.linbit.linstor.proto.CommonMessageProcessor:164
    onNext                                   N      reactor.core.publisher.FluxPeek$PeekSubscriber:177
    runAsync                                 N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:439
    run                                      N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:526
    call                                     N      reactor.core.scheduler.WorkerTask:84
    call                                     N      reactor.core.scheduler.WorkerTask:37
    run                                      N      java.util.concurrent.FutureTask:264
    run                                      N      java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask:304
    runWorker                                N      java.util.concurrent.ThreadPoolExecutor:1128
    run                                      N      java.util.concurrent.ThreadPoolExecutor$Worker:628
    run                                      N      java.lang.Thread:829

END OF ERROR REPORT.
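
For anyone collecting several of these reports, the two UUIDs can be pulled out of the satellite error line with a small script. This is only a sketch: the regular expression is modelled on the message quoted above and may not match other LINSTOR versions, and `parse_uuid_mismatch` is a hypothetical helper name, not part of LINSTOR.

```python
import re
import uuid

# Pattern modelled on the error line quoted above; the wording may differ
# between LINSTOR versions, so treat it as an assumption.
MISMATCH_RE = re.compile(
    r"(?P<kind>\w+) UUID mismatch\. "
    r"Received '(?P<remote>[0-9a-f-]{36})' from controller, "
    r"but had locally '(?P<local>[0-9a-f-]{36})'"
)


def parse_uuid_mismatch(line):
    """Return (object kind, controller UUID, local UUID), or None if no match."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return (
        match.group("kind"),
        uuid.UUID(match.group("remote")),
        uuid.UUID(match.group("local")),
    )


sample = (
    "10:02:10.747 [MainWorkerPool-1] ERROR LINSTOR/Satellite - SYSTEM - "
    "VolumeDefinition UUID mismatch. Received "
    "'0d2bebfd-ccb7-492d-919a-4cac1edb6c3c' from controller, but had locally "
    "'90d3e264-ce35-4324-b72d-f3e8329e3ef9'."
)

kind, remote, local = parse_uuid_mismatch(sample)
# Print the object kind, both UUIDs and their RFC 4122 version numbers.
print(kind, remote, local, remote.version, local.version)
```

Both UUIDs in this report parse as version 4 (random) UUIDs, so neither can be derived from the other. That fits two independently created VolumeDefinition objects with the same resource name and volume number, though the log alone does not prove that is what happened.
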
kvaps commented 2 years ago

Another one:

linstor r l | grep pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa
| pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa | hf-kubevirt-01 | 7098 | Unused | Connecting(hf-kubevirt-02) |   UpToDate | 2022-01-27 10:02:29 |
| pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa | hf-kubevirt-02 | 7098 |        |                            |    Unknown |                     |
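
To spot rows like these without grepping for one PVC at a time, the table rows can be parsed and filtered. A rough sketch follows. The column names are assumed from the usual `linstor resource list` header, which is not included in the grep output above, and treating `Ok` as the healthy connection value is also an assumption. `parse_resource_rows` and `suspicious_rows` are hypothetical helper names.

```python
def parse_resource_rows(text):
    """Parse data rows of a `linstor r l` table into dictionaries."""
    # Assumed column order; the header row is not part of the quoted output.
    columns = ["resource", "node", "port", "usage", "conns", "state", "created_on"]
    rows = []
    for line in text.splitlines():
        # The tables quoted in this thread use both '┊' and '|' as borders.
        cells = [cell.strip() for cell in line.replace("┊", "|").split("|")]
        cells = cells[1:-1]  # drop the empty strings outside the outer borders
        if len(cells) == len(columns):
            rows.append(dict(zip(columns, cells)))
    return rows


def suspicious_rows(rows):
    """Keep rows whose connection column is not 'Ok' or whose state is Unknown."""
    return [
        row for row in rows
        if row["conns"] not in ("", "Ok") or row["state"] == "Unknown"
    ]


sample = """\
| pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa | hf-kubevirt-01 | 7098 | Unused | Connecting(hf-kubevirt-02) |   UpToDate | 2022-01-27 10:02:29 |
| pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa | hf-kubevirt-02 | 7098 |        |                            |    Unknown |                     |
"""

rows = parse_resource_rows(sample)
for row in suspicious_rows(rows):
    print(row["resource"], row["node"], row["conns"] or "-", row["state"])
```

Both sample rows are flagged: the first because of `Connecting(hf-kubevirt-02)`, the second because its state is `Unknown`.
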

hf-kubevirt-01 logs:

10:00:24.657 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-01'.
10:01:40.228 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-01'.
10:01:40.228 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-02'.
10:01:41.364 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-01'.
10:01:41.364 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-02'.
10:01:48.659 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-01'.
10:01:48.659 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-02'.
10:01:51.757 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-01'.
10:01:51.757 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-02'.
10:02:28.445 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-01'.
10:02:29.466 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Primary Resource pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa
10:02:29.466 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Primary bool set on Resource pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa
10:02:29.468 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-01'.
10:02:31.710 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-02'.
10:02:31.710 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-01'.

The hf-kubevirt-02 logs are clean.
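
One thing that stands out in the hf-kubevirt-01 logs is that the same resource is reported as "created" several times for the same node. A small sketch for counting those events in a satellite log is below. The regular expression is modelled on the log lines quoted in this thread, `count_created_events` is a hypothetical helper name, and whether repeated creation is related to the UUID mismatch is only a guess.

```python
import re
from collections import Counter

# Modelled on the "created for node" lines quoted in this thread.
CREATED_RE = re.compile(
    r"Resource '(?P<rsc>[^']+)' created for node '(?P<node>[^']+)'"
)


def count_created_events(log_text):
    """Count 'created' log events per (resource, node) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CREATED_RE.search(line)
        if match:
            counts[(match.group("rsc"), match.group("node"))] += 1
    return counts


sample_log = """\
10:00:24.657 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-01'.
10:01:40.228 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-01'.
10:01:41.364 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' updated for node 'hf-kubevirt-01'.
10:02:28.445 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-01'.
10:02:31.710 [MainWorkerPool-1] INFO  LINSTOR/Satellite - SYSTEM - Resource 'pvc-8d44d3a7-5edc-4ef5-85b9-433a3e0d4faa' created for node 'hf-kubevirt-02'.
"""

counts = count_created_events(sample_log)
# Report only the pairs that were "created" more than once.
for (rsc, node), n in counts.items():
    if n > 1:
        print(f"{rsc} on {node}: created {n} times")
```

On this excerpt only the hf-kubevirt-01 entry is reported, with three "created" events; "updated" lines are ignored.
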