AbsaOSS / spline

Data Lineage Tracking And Visualization Solution
https://absaoss.github.io/spline/
Apache License 2.0
400 Arango error on inserting to executionPlan #1300

Closed wajda closed 1 week ago

wajda commented 5 months ago

Discussed in https://github.com/AbsaOSS/spline/discussions/1299

Originally posted by **vishalag001** January 25, 2024

Hi @wajda, I am getting the following error on inserting to executionPlan. It looks like the transaction is unable to update the counter due to a `_rev` attribute mismatch. There are too many inserts happening in parallel. Can we ignore the `_rev` match validation while updating, or is there anything else that I am missing? How can I resolve this issue? Appreciate your help and guidance. Thanks in advance!

```
Service "/spline" encountered error 400 while handling POST vst://:::8529/_db/spline/spline/execution-plans via ArangoError: conflict, _rev values do not match
    at t.TxManagerImpl.nextTxNumber (/arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:201:3710)
    at t.TxManagerImpl.startWrite (/arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:201:3871)
    at t.SubscribableTxManagerDecorator.startWrite (/arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:201:2745)
    at /arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:71:1045
    at t.withTimeTracking (/arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:209:422)
    at t.storeExecutionPlan (/arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:71:981)
    at Route._handler (/arangodb/data/coordinator8529/apps/_db/spline/spline/APP/index.js:1:5279)
    at next (/arangodb/data/coordinator8529/data/js/server/modules/@arangodb/foxx/router/tree.js:419:15)
    at next (/arangodb/data/coordinator8529/data/js/server/modules/@arangodb/foxx/router/tree.js:417:7)
    at next (/arangodb/data/coordinator8529/data/js/server/modules/@arangodb/foxx/router/tree.js:417:7)
```

wajda commented 5 months ago

@vishalag001

Oh, there must be a really high concurrency if that happens :)

> Can we ignore the _rev match validation

No, the `_rev` validation is essential. The `nextTxNumber()` function is supposed to behave as an atomic increment-and-get. The issue is that, for some unknown reason, it is missing a loop that should repeat the update attempt until it either succeeds or exceeds the maximum attempts threshold, and only then fail. Since you are running a development version, I assume you have the source code and know how to build it. Could you please try modifying the `nextTxNumber()` function in `tx-manager-impl.ts` to something like the code below and test it again? That would be helpful, as I don't have a highly concurrent setup to run a decent test on.

Try playing with it, and when it works, please create a pull request.

    private nextTxNumber(): TxNum {
        let attempts: number = 10 // Max attempts to atomically increment the counter
        while (attempts-- > 0) {
            try {
                const curCnt: Counter = store.getDocByKey(CollectionName.Counter, 'tx')

                // as of time of writing the '@types/arangodb:3.5.13' was the latest version,
                // and it was not up-to-date with ArangoDB 3.10+ JavaScript API
                // @ts-ignore
                const updResult: UpdateResult<Counter> = db._update(
                    curCnt,
                    {
                        curVal: curCnt.curVal + 1
                    },
                    // @ts-ignore
                    {
                        overwrite: false, // check _rev
                        returnNew: true   // return an updated document in the `new` attribute
                    }
                )

                const newCnt: Counter = updResult.new

                return newCnt.curVal
            }
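            catch (e) {
                // 1200 is ArangoDB's conflict error (_rev mismatch): retry; rethrow anything else
                if (e['errorNum'] !== 1200) throw e
            }
        }
        // Every attempt hit a _rev conflict: fail explicitly instead of returning undefined
        throw new Error('Failed to increment the tx counter: max attempts exceeded')
    }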
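
For anyone who wants to try the retry logic without a running ArangoDB, here is a minimal, self-contained sketch of the same increment-and-get pattern against an in-memory stand-in. Everything in it (`InMemoryCounterStore`, `incrementAndGet`, the `beforeUpdate` hook) is a hypothetical name made up for illustration and is not part of Spline or the ArangoDB API; only the error number 1200 is taken from the snippet above. It is a sketch of the idea, not the actual fix.

```typescript
// Hypothetical in-memory stand-in for the counter document and its revision.
interface CounterDoc {
    curVal: number
    rev: number
}

class InMemoryCounterStore {
    private doc: CounterDoc = { curVal: 0, rev: 0 }

    read(): CounterDoc {
        return { ...this.doc }
    }

    // Applies the update only if the caller saw the latest revision,
    // otherwise throws an error carrying errorNum 1200 (conflict).
    update(expectedRev: number, curVal: number): CounterDoc {
        if (expectedRev !== this.doc.rev) {
            throw Object.assign(
                new Error('conflict, _rev values do not match'),
                { errorNum: 1200 }
            )
        }
        this.doc = { curVal, rev: this.doc.rev + 1 }
        return { ...this.doc }
    }
}

// Retries the read-then-update cycle until it succeeds or attempts run out.
function incrementAndGet(
    store: InMemoryCounterStore,
    maxAttempts: number,
    beforeUpdate: () => void = () => {} // test hook simulating a concurrent writer
): number {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const cur = store.read()
        beforeUpdate()
        try {
            return store.update(cur.rev, cur.curVal + 1).curVal
        } catch (e) {
            // Only a revision conflict is retried; anything else is rethrown.
            if ((e as { errorNum?: number }).errorNum !== 1200) throw e
        }
    }
    throw new Error(`counter update failed after ${maxAttempts} attempts`)
}

// Usage: a simulated concurrent writer wins the race twice, then backs off.
const demoStore = new InMemoryCounterStore()
let interference = 2
const demoResult = incrementAndGet(demoStore, 10, () => {
    if (interference-- > 0) {
        const d = demoStore.read()
        demoStore.update(d.rev, d.curVal + 1)
    }
})
console.log(demoResult) // 3: two increments by the other writer, then ours
```

The hook is only there to force conflicts deterministically; in the real service the conflicts come from parallel requests, so a highly concurrent test is still needed to confirm the change.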