LINBIT / linstor-server

High Performance Software-Defined Block Storage for container, cloud and virtualisation. Fully integrated with Docker, Kubernetes, Openstack, Proxmox etc.
https://docs.linbit.com/docs/linstor-guide/
GNU General Public License v3.0

Attempt to replace an active transMgr #410

Open benedikt-bartscher opened 3 months ago

benedikt-bartscher commented 3 months ago

Deploying a new node fails. System information:

uname -a
Linux redacted 6.1.0-21-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64 GNU/Linux

The satellite produces this error report:

ERROR REPORT 6648D334-8F914-000366

============================================================

Application:                        LINBIT® LINSTOR
Module:                             Satellite
Version:                            1.27.1
Build ID:                           c6f8ceed9d50da2c4d37ae8ce20d09daf3046464
Build time:                         2024-04-25T11:12:13+00:00
Error time:                         2024-05-18 16:33:37
Node:                               redacted
Thread:                             MainWorkerPool-3
Access context information

Identity:                           PUBLIC
Role:                               PUBLIC
Domain:                             PUBLIC

Peer:                               redacted:49700

============================================================

Reported error:
===============

Category:                           Error
Class name:                         ImplementationError
Class canonical name:               com.linbit.ImplementationError
Generated at:                       Method 'setConnection', Source file 'AbsTransactionObject.java', Line #62

Error message:                      attempt to replace an active transMgr

Error context:
        Unhandled error executing API call 'RequestVlmAllocated'.
Asynchronous stage backtrace:

    Error has been observed at the following site(s):
        *__checkpoint ⇢ Query volume allocated capacity
        *__checkpoint ⇢ Fallback error handling wrapper
    Original Stack Trace:

Call backtrace:

    Method                                   Native Class:Line number
    setConnection                            N      com.linbit.linstor.transaction.AbsTransactionObject:62

Suppressed exception 1 of 1:
===============
Category:                           RuntimeException
Class name:                         OnAssemblyException
Class canonical name:               reactor.core.publisher.FluxOnAssembly.OnAssemblyException
Generated at:                       Method 'setConnection', Source file 'AbsTransactionObject.java', Line #62

Error message:
Error has been observed at the following site(s):
    *__checkpoint ⇢ Query volume allocated capacity
    *__checkpoint ⇢ Fallback error handling wrapper
Original Stack Trace:

Error context:
        Unhandled error executing API call 'RequestVlmAllocated'.
Call backtrace:

    Method                                   Native Class:Line number
    setConnection                            N      com.linbit.linstor.transaction.AbsTransactionObject:62
    activateTransMgr                         N      com.linbit.linstor.transaction.AbsTransactionObject:159
    set                                      N      com.linbit.linstor.transaction.TransactionSimpleObject:47
    setAllocatedSize                         N      com.linbit.linstor.storage.data.AbsVlmData:115
    updateInfo                               N      com.linbit.linstor.layer.storage.file.FileProvider:267
    updateStates                             N      com.linbit.linstor.layer.storage.file.FileProvider:174
    updateVolumeAndSnapshotStates            N      com.linbit.linstor.layer.storage.AbsStorageProvider:219
    prepare                                  N      com.linbit.linstor.layer.storage.AbsStorageProvider:205
    getVlmAllocatedCapacities                N      com.linbit.linstor.core.apicallhandler.StltApiCallHandlerUtils:207
    executeInScope                           N      com.linbit.linstor.api.protobuf.ReqVlmAllocated:84
    lambda$executeReactive$0                 N      com.linbit.linstor.api.protobuf.ReqVlmAllocated:69
    doInScope                                N      com.linbit.linstor.core.apicallhandler.ScopeRunner:149
    lambda$fluxInScope$0                     N      com.linbit.linstor.core.apicallhandler.ScopeRunner:76
    call                                     N      reactor.core.publisher.MonoCallable:72
    trySubscribeScalarMap                    N      reactor.core.publisher.FluxFlatMap:127
    subscribeOrReturn                        N      reactor.core.publisher.MonoFlatMapMany:49
    subscribe                                N      reactor.core.publisher.Flux:8759
    onNext                                   N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:195
    request                                  N      reactor.core.publisher.Operators$ScalarSubscription:2545
    onSubscribe                              N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:141
    subscribe                                N      reactor.core.publisher.MonoJust:55
    subscribe                                N      reactor.core.publisher.MonoDeferContextual:55
    subscribe                                N      reactor.core.publisher.Flux:8773
    trySubscribeScalarMap                    N      reactor.core.publisher.FluxFlatMap:200
    subscribeOrReturn                        N      reactor.core.publisher.MonoFlatMapMany:49
    subscribe                                N      reactor.core.publisher.Flux:8759
    onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:427
    slowPath                                 N      reactor.core.publisher.FluxArray$ArraySubscription:127
    request                                  N      reactor.core.publisher.FluxArray$ArraySubscription:100
    onSubscribe                              N      reactor.core.publisher.FluxFlatMap$FlatMapMain:371
    subscribe                                N      reactor.core.publisher.FluxMerge:70
    subscribe                                N      reactor.core.publisher.Flux:8773
    onComplete                               N      reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber:258
    subscribe                                N      reactor.core.publisher.FluxConcatArray:78
    subscribe                                N      reactor.core.publisher.InternalFluxOperator:62
    subscribe                                N      reactor.core.publisher.FluxDefer:54
    subscribe                                N      reactor.core.publisher.Flux:8773
    onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:427
    drainAsync                               N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:453
    drain                                    N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:724
    onNext                                   N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:256
    drainFused                               N      reactor.core.publisher.SinkManyUnicast:319
    drain                                    N      reactor.core.publisher.SinkManyUnicast:362
    tryEmitNext                              N      reactor.core.publisher.SinkManyUnicast:237
    tryEmitNext                              N      reactor.core.publisher.SinkManySerialized:100
    processInOrder                           N      com.linbit.linstor.netcom.TcpConnectorPeer:415
    doProcessMessage                         N      com.linbit.linstor.proto.CommonMessageProcessor:227
    lambda$processMessage$2                  N      com.linbit.linstor.proto.CommonMessageProcessor:164
    onNext                                   N      reactor.core.publisher.FluxPeek$PeekSubscriber:185
    runAsync                                 N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:440
    run                                      N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:527
    call                                     N      reactor.core.scheduler.WorkerTask:84
    call                                     N      reactor.core.scheduler.WorkerTask:37
    run                                      N      java.util.concurrent.FutureTask:264
    run                                      N      java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask:304
    runWorker                                N      java.util.concurrent.ThreadPoolExecutor:1136
    run                                      N      java.util.concurrent.ThreadPoolExecutor$Worker:635
    run                                      N      java.lang.Thread:840

END OF ERROR REPORT.

ghernadi commented 3 months ago

Sorry for the late response. Does this issue still persist? Have you already tried restarting this satellite?

Is there any way to reproduce this issue?
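
For readers unfamiliar with the message in the error report above, the following is a minimal sketch of the kind of guard that can produce "attempt to replace an active transMgr". It is not LINSTOR's actual code: the class names `TxManager` and `TxObject` are hypothetical, and the sketch throws `IllegalStateException` where the report shows `com.linbit.ImplementationError`. It only assumes what the backtrace suggests, namely that `setConnection` refuses to bind an object to a different transaction manager while another one is still active.

```java
public class TransMgrGuardSketch {
    /** Hypothetical stand-in for a transaction manager. */
    static final class TxManager {
        final String name;

        TxManager(String name) {
            this.name = name;
        }
    }

    /** Hypothetical transactional object that remembers the manager it is bound to. */
    static final class TxObject {
        private TxManager active; // null while no transaction is in progress

        void setConnection(TxManager next) {
            // Rebinding to the same manager, or clearing the binding with null, is allowed.
            // Switching to a different manager mid-transaction is treated as a programming error.
            if (active != null && next != null && next != active) {
                throw new IllegalStateException("attempt to replace an active transMgr");
            }
            active = next;
        }
    }

    /** Returns true if binding a second manager while the first is still active is rejected. */
    static boolean secondManagerRejected() {
        TxObject obj = new TxObject();
        obj.setConnection(new TxManager("first"));
        try {
            obj.setConnection(new TxManager("second"));
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondManagerRejected());
    }
}
```

In this sketch the second `setConnection` call is rejected because the first manager was never released. Whether the satellite reaches this state through two API calls touching the same volume data is not established by the report.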