codenotary / immugw

Apache License 2.0

Multi-Database `data is corrupted` issue #21

Closed nweave03 closed 1 year ago

nweave03 commented 1 year ago

What happened

When I run verified_set, verified_get, and zadd against multiple databases simultaneously, immugw eventually gets into a state where it always returns 409 with {'code': 10, 'message': 'data is corrupted'}

The only fix at this point is to restart immugw.

It appears to be related to state management in immugw.

Immudb is not reporting any errors.
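
To illustrate how state management alone could produce this symptom, here is a toy model. It is not immugw's code: the `State`, `SharedStateVerifier`, and `PerDatabaseVerifier` names are hypothetical, and the "a newer state must not have a lower transaction id than the cached one" check is a simplified stand-in for real proof verification. It only shows that one cached state shared across databases can fail as soon as requests switch databases, while a cache keyed by database name does not.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class State:
    """Toy stand-in for the last verified state of one database."""
    db: str
    tx_id: int


class CorruptedError(Exception):
    """Raised when a new state does not extend the cached one."""


def _check(cached: Optional[State], new: State) -> None:
    # Simplified rule (an assumption): the new state may not go backwards.
    if cached is not None and new.tx_id < cached.tx_id:
        raise CorruptedError("data is corrupted")


class SharedStateVerifier:
    """Hypothetical gateway that caches ONE state for all databases."""

    def __init__(self) -> None:
        self._last: Optional[State] = None

    def verify(self, new: State) -> None:
        _check(self._last, new)
        self._last = new


class PerDatabaseVerifier:
    """Hypothetical gateway that caches one state PER database."""

    def __init__(self) -> None:
        self._last: Dict[str, State] = {}

    def verify(self, new: State) -> None:
        _check(self._last.get(new.db), new)
        self._last[new.db] = new
```

In this model, verifying a state of `db1` at transaction 10 and then a state of `db2` at transaction 3 raises `CorruptedError` with the shared verifier, and keeps raising for `db2` because the cached state is never replaced. The per-database verifier accepts the same sequence.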

What you expected to happen

I expected it to keep working as I switch between multiple databases for the set, get, and zadd operations.

How to reproduce it (as minimally and precisely as possible)

The following setup is needed:

- 1 immudb instance with multiple databases inside it (I used 3)
- 1 immugw instance that handles all 3 databases
- 3 clients, each trying to access a different database through this single immugw instance

Environment

immudb is on its own VM, running in a docker container as the only thing the VM does. It is version 1.3.2: https://hub.docker.com/layers/codenotary/immudb/1.3.2/images/sha256-3614dc7f7a11566e38993ea83b3567d5ea3f2cc9dea42a057be662bb6c2b457c?context=explore

immugw is on its own VM, running in a docker container as the only thing the VM does. It is built from the branch distroless-image, as master was not building for me. This appears to be based on 1.3.2 (judging by the commit comments).

Additional info (any other context about the problem)

This was reproduced by @arriqaaq in the support channel: https://github.com/arriqaaq/immugwtest

nweave03 commented 1 year ago

Is there any update on this? Is a fix planned?

arriqaaq commented 1 year ago

Hi @nweave03

I'll get back to you shortly on the dates regarding this.

arriqaaq commented 1 year ago

@nweave03 could you let us know if this is a high priority (or a blocker for using immudb) for you at the moment? We are planning to fix this in the December release.

nweave03 commented 1 year ago

@arriqaaq it is not preventing a prod deployment, we can probably wait until December, but if it isn't fixed by then, we will have to explore alternatives.

arriqaaq commented 1 year ago

@nweave03 we will prioritize fixing this, rest assured.

arriqaaq commented 1 year ago

@nweave03 I've started working on the solution for this. Would it be possible for you to test the RC branch (post completion) to verify before we make a release?

nweave03 commented 1 year ago

Sure can, though I'm off for Thanksgiving, so I will check it on Monday.

arriqaaq commented 1 year ago

@nweave03 thank you, it should be ready for test by Monday

arriqaaq commented 1 year ago

@nweave03 Hope you are doing well

There is a breaking change in this implementation: the endpoints have changed slightly, from

/db/x/y -------> /db/{database_name}/x/y
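
For clients that need to migrate, the route change above can be sketched as a small path rewrite. This is only an illustration: `to_multidb_path` is a hypothetical helper, `x/y` stands for whatever route segments followed `/db/` before, and percent-encoding the database name is an assumption rather than something the gateway is known to require.

```python
from urllib.parse import quote


def to_multidb_path(old_path: str, database_name: str) -> str:
    """Rewrite an old-style /db/x/y path to /db/{database_name}/x/y.

    Hypothetical helper for illustration; it does not know which
    routes the gateway actually exposes.
    """
    prefix = "/db/"
    if not old_path.startswith(prefix):
        raise ValueError(f"not a /db/ path: {old_path!r}")
    if not database_name:
        raise ValueError("database_name must not be empty")
    rest = old_path[len(prefix):]
    # Encode the name so it always occupies exactly one path segment.
    return f"{prefix}{quote(database_name, safe='')}/{rest}"
```

With this sketch, `to_multidb_path("/db/x/y", "defaultdb")` gives `/db/defaultdb/x/y`.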

You can find more info on the documentation here

There are a bunch of handler routes still to be changed, so it will require around 1-2 days. Just wanted to keep you updated.

arriqaaq commented 1 year ago

APIs look to be working with the new format, so it should be ready for test. This is the PR https://github.com/codenotary/immugw/pull/25

Branch: feat/multidb-handler

Meanwhile, I will be adding more tests, so it should not be a blocker for your testing.

nweave03 commented 1 year ago

@arriqaaq Got it, I will start programming the changes. I will do local testing first, to make sure my changes to the API calls are working before I deploy this to the testing environment. Will keep you in the loop.

arriqaaq commented 1 year ago

Thank you @nweave03, please feel free to let me know if you find any difficulty/issues.

arriqaaq commented 1 year ago

In case you have not built yet, please build from the latest commit in this branch.

nweave03 commented 1 year ago

@arriqaaq sorry, i tried to ping on discord, but I cannot get the container started.

I'm getting:

    2022/11/28 16:38:47 ERROR: unable to instantiate client: mkdir state-defaultdb: permission denied

Something is off there. I just reverted to the old 1.3.2 containers and they started just fine. Maybe I screwed up the 1.4.0 build.

it doesn't look like the dockerfile changed

Okay, I tried recreating it: same issue. I've tried pulling the volumes out of the container and mapping them onto the Linux file system. It did create folders with permissions root:root as expected, but it did not work. I tried chowning them to 3322:3322, which I usually have to do to get immudb working that way: no dice. I also tried 3323:3323, which I had to do in the past to get immugw working that way: also no dice.

Is this in a different location? My volumes are mapped (in this test) as:

    volumes:
        # in order to get these working, it appears immugw creates a user / group
        # of 3323:3323 .  when it creates the following directories, it creates them
        # as systemd-coredump:root, so a chown 3323:3323 of the relevant directories
        # is necessary.  It may take a second (or second docker-compose up)
        # for the issue to resolve (at least it did for me, but it did resolve)
        #       ../data/immugw/
        #       ../data/immugw/data/
        #       ../data/immugw/logs/
        #       ../data/immugw/home/
        - ../data/immugw/data/:/var/lib/immudb
        - ../data/immugw/logs/:/var/log/immugw
        - ../data/immugw/home/:/home/immu/

arriqaaq commented 1 year ago

Thanks for your comment @nweave03, I will check why this is happening

I will sync with you on discord so that it is faster

arriqaaq commented 1 year ago

I have moved the creation of the client state folders into a directory that immugw owns. Previously they were written to the current directory where the binary is run, so with this change you should no longer face this issue with docker builds. Please feel free to let me know.
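
The idea can be sketched roughly as follows. This is not the actual immugw change: `state_dir_for` and its `base_dir` parameter are hypothetical, and the `state-<database>` folder name is only inferred from the `mkdir state-defaultdb` error reported earlier in this thread.

```python
from pathlib import Path
from typing import Union


def state_dir_for(base_dir: Union[str, Path], database_name: str) -> Path:
    """Create and return the state folder for one database.

    The folder is anchored under base_dir (a directory the process is
    assumed to own) instead of the current working directory, so the
    result does not depend on where the binary happens to be started.
    """
    path = Path(base_dir) / f"state-{database_name}"
    # parents=True creates base_dir if missing; exist_ok makes restarts safe.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Anchoring the path under a known, writable directory avoids the `permission denied` failure that a relative `mkdir` can hit when the working directory inside a container is not writable by the service user.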

arriqaaq commented 1 year ago

Marking this as closed; the pre-release was tested by the user.

https://github.com/codenotary/immugw/releases/tag/v1.3.0-RC1