dmwm / WMCore

Core workflow management components for CMS.
Apache License 2.0

Upgrade couchDB to 3.1.x #8853

Closed amaltaro closed 2 years ago

amaltaro commented 5 years ago

CouchDB 1.6.1 is no longer supported (see, for example, https://issues.apache.org/jira/browse/COUCHDB-1869), and we should eventually upgrade to the latest stable CouchDB release (2.3.x). Since I always have trouble finding the CouchDB documentation, here are the release history and notes: http://docs.couchdb.org/en/stable/whatsnew/index.html

I vaguely remember Antanas mentioning some unfortunate changes related to Futon (or Fauxton) that would be a problem for us; this still needs to be investigated.

vkuznet commented 5 years ago

Alan, the main issue with the migration is converting/adapting the Erlang scripts that do authentication. Last time it took Diego and me quite some time to fix that. But it would be nice to figure out which changes were made to Futon. Valentin

amaltaro commented 4 years ago

For the record, I just read about this admin panel, which can be integrated with any CouchDB: https://github.com/ermouth/couch-photon

It works in a similar way to Futon, which AFAIK is not available in CouchDB 2.3.x.

amaltaro commented 4 years ago

It also looks like we will first have to migrate to CouchDB 2.x (or convert our databases to the new format); only then can we migrate to 3.x. Reference: https://docs.couchdb.org/en/stable/install/upgrading.html
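The ordering above can be sketched as a small helper. This is only an illustration: the function name `upgrade_hops` is hypothetical, and the only rule it encodes is the one stated in this comment (a 1.x server goes through 2.x before 3.x). The current version would typically come from the `version` field of the JSON welcome document that CouchDB serves at its root URL.

```python
from typing import List


def upgrade_hops(current_version: str, target_major: int = 3) -> List[str]:
    """Return the ordered major-version hops from current_version to target_major.

    Hypothetical helper: it encodes only the rule quoted above, i.e. that
    1.x databases must be migrated to 2.x before moving to 3.x.
    """
    major = int(current_version.split(".")[0])
    if major > target_major:
        raise ValueError(
            f"cannot downgrade from {current_version} to {target_major}.x"
        )
    # One hop per major version strictly above the current one.
    return [f"{m}.x" for m in range(major + 1, target_major + 1)]


print(upgrade_hops("1.6.1"))  # ['2.x', '3.x']
print(upgrade_hops("2.3.1"))  # ['3.x']
```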

amaltaro commented 3 years ago

Status update on this GH issue, which is being done in collaboration with a Summer Student (similar to what was done for the MariaDB issue in https://github.com/dmwm/WMCore/issues/9241).

With some extremely important information gathered by Max and priceless help from Shahzad, we managed to build CouchDB 3.1.1 and a new Erlang package (moving away from 1.6.1). Here is the cmsdist PR: https://github.com/cms-sw/cmsdist/pull/7218

With the RPMs created (Py3 WMAgent specs upgraded to 1.5.2.pre3, and Requires updated from couchdb16 to couchdb31), I uploaded them to my cmsrep area: http://cmsrep.cern.ch/cgi-bin/repos/comp.amaltaro/slc7_amd64_gcc630?C=M;O=D

Last but not least, a new wmagentpy3 docker image has been created and uploaded to my gitlab registry: https://gitlab.cern.ch/amaltaro/Docker/container_registry

It can be pulled with:

docker pull gitlab-registry.cern.ch/amaltaro/docker:wmcorepy3_tests

NOTE: in order to build these private docker images, I had to re-use the following (from MariaDB private packaging):

With the new image, Max can now proceed to investigate what breaks, what needs to be updated to comply with the new CouchDB version, and so on.

NOTE 2: gitlab logs are already reporting two possible problems with the WMAgent manage script:

/home/dmwm/unittestdeploy/wmagent/current/config/wmagentpy3/manage: line 505: couchdb: command not found

and

/home/dmwm/unittestdeploy/wmagent/current/config/wmagentpy3/manage: line 520: couchdb: command not found

UPDATE on 30/Aug: some basic - and non-final - changes required to properly start CouchDB within the container: https://github.com/dmwm/deployment/pull/1088

klannon commented 3 years ago

Awesome! Thanks for the update. I look forward to hearing more! 🎉

amaltaro commented 2 years ago

For the record, CouchDB 3.1.2 has just been released (a security release) and we should target that version.

vkuznet commented 2 years ago

I would like to provide a full working recipe for setting up CouchDB 3.2.0 with an auth proxy server (APS), which provides CMS credentials and authentication without any modifications to CouchDB itself. I tested this setup on two k8s clusters, cmsweb-test3 and cmsweb-test6, where I installed a recent version of CouchDB and APS.

CouchDB setup

The stock CouchDB is available on all major OSes. For my setup I used the docker image:

docker pull couchdb
# to run it locally from this image, start it as follows; the official CouchDB 3.x
# image refuses to start without admin credentials, so pass them as environment variables
docker run -p 5984:5984 -d -e COUCHDB_USER=admin -e COUCHDB_PASSWORD=admin couchdb

The new CouchDB requires an admin account. It can be set up through the CouchDB web interface (Fauxton in recent releases) or using environment variables.

CouchDB on k8s

I used the following manifest file to deploy CouchDB to the k8s infrastructure. Here are the exact commands:

# create couchdb namespace
kubectl create ns couchdb
# deploy couchdb.yaml provided in https://gist.github.com/vkuznet/a17884a46b72215fec2b3340417c1789
kubectl apply -f couchdb.yaml

As you can see from the gist, it uses the COUCHDB_USER and COUCHDB_PASSWORD credentials for authentication. So far I used admin and admin, respectively.

With this setup the new CouchDB is up and running in the couchdb namespace:

kubectl -n couchdb get pods
NAME                       READY   STATUS    RESTARTS   AGE
couchdb-54c4c75756-5k9qz   1/1     Running   0          43m

It can be accessed within k8s cluster via the following URI:

http://admin:admin@couchdb.couchdb.svc.cluster.local:5984

where couchdb credentials precede its URL.
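As a minimal sketch, a URI of this shape can be assembled with the Python standard library. The helper name `with_credentials` is hypothetical and not part of WMCore or CouchDB; percent-encoding the credentials is a precaution for passwords containing characters such as `@` or `:`.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(base_url: str, user: str, password: str) -> str:
    """Return base_url with user:password placed before the host.

    Assumes base_url does not already carry credentials.
    """
    parts = urlsplit(base_url)
    # Percent-encode so characters such as '@' or ':' in a password
    # cannot be mistaken for URL delimiters.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


print(with_credentials(
    "http://couchdb.couchdb.svc.cluster.local:5984", "admin", "admin"
))
# http://admin:admin@couchdb.couchdb.svc.cluster.local:5984
```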

Setting up auth proxy server

The auth proxy server (APS) is a reverse proxy that provides capabilities similar to the apache frontends and supports various authentication methods, including OAuth (running on the default port 443) and X509 (running on port 8443). APS servers are installed on all cmsweb-testX clusters, as we are testing them as a replacement for our apache FEs. I only adjusted the configuration with the following ingress rules:

...
  "ingress": [
    {
      "path": "/couch/.*",
      "service_url": "http://admin:admin@couchdb.couchdb.svc.cluster.local:5984",
      "old_path": "/couch",
      "new_path": ""
    },
    {
      "path": "/couch",
      "service_url": "http://admin:admin@couchdb.couchdb.svc.cluster.local:5984",
      "old_path": "/couch",
      "new_path": ""
    },
...

The rules read as follows:

At this point, I can access CouchDB, which sits behind APS, using an IAM token.
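One possible reading of the two ingress rules shown earlier can be sketched in Python. Everything about the matching semantics here is an assumption made for illustration (that `path` is a regular expression matched against the whole request path, and that the leading `old_path` is replaced by `new_path` before the request is forwarded to `service_url`); it is not taken from the APS source code, and `backend_url` is a hypothetical name.

```python
import re
from typing import Optional

SERVICE_URL = "http://admin:admin@couchdb.couchdb.svc.cluster.local:5984"

# The two rules from the configuration fragment, in the same order.
INGRESS = [
    {"path": "/couch/.*", "service_url": SERVICE_URL,
     "old_path": "/couch", "new_path": ""},
    {"path": "/couch", "service_url": SERVICE_URL,
     "old_path": "/couch", "new_path": ""},
]


def backend_url(request_path: str, rules=INGRESS) -> Optional[str]:
    """Return the backend URL for request_path, or None if no rule matches."""
    for rule in rules:
        # ASSUMPTION: "path" is a regex that must match the full request path.
        if re.fullmatch(rule["path"], request_path):
            # ASSUMPTION: only the leading old_path is swapped for new_path.
            rewritten = request_path.replace(
                rule["old_path"], rule["new_path"], 1
            )
            return rule["service_url"] + rewritten
    return None


print(backend_url("/couch/testdb/_all_docs"))
print(backend_url("/couch"))
print(backend_url("/other"))
```

Under these assumptions, a request for `/couch/testdb/_all_docs` on the frontend reaches CouchDB as `/testdb/_all_docs`, which is why the CouchDB APIs can be used unchanged behind the `/couch` prefix.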

Obtaining IAM tokens

The new OAuth authentication requires a valid IAM token. The token can be obtained either via the host's token end-point (which uses CERN SSO) or via the oidc-token tool. The former can be accessed on a cmsweb-testX cluster by visiting https://cmsweb-test3.cern.ch/token (here I used the cmsweb-test3 cluster as an example; replace it with your cluster URL).

Another way to obtain an IAM token is to use the oidc-token tool. For that you need to set up oidc-agent and the oidc tools on your machine. Please follow this guide for details. Once it is set up and your device is registered, you can obtain a token as follows:

# I set up oidc-agent on the vocms0181 VM and registered my device under the name vocms0181-test
oidc-token vocms0181-test
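Assuming the IAM token is a standard JWT (three base64url segments separated by dots), its claims can also be inspected locally instead of pasting the token into a web page. The sketch below only decodes the payload and does not verify the signature; `jwt_claims` is a hypothetical helper name, and the demo token is built on the spot rather than being a real IAM token.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload (second segment) of a JWT without verifying it."""
    try:
        _header, payload, _signature = token.split(".")
    except ValueError:
        raise ValueError("expected three dot-separated JWT segments") from None
    # base64url segments drop their '=' padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(obj: dict) -> str:
    """Encode a dict the way JWT segments are encoded (demo only)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Demo token assembled locally; the signature segment is a placeholder.
demo = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"sub": "someone", "exp": 1700000000}),
    "sig",
])
print(jwt_claims(demo))  # {'sub': 'someone', 'exp': 1700000000}
```

With a real token, the same function could be fed the output of `oidc-token`, for example to check the `exp` claim before starting a long replication.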

Testing full setup

Now we are ready to test the CouchDB APIs (e.g. create a DB, inject documents and set up replication) using token-based authentication. Please follow this set of instructions:

# obtain valid token
token=`oidc-token vocms0181-test`

# if necessary you may get its info by visiting jwt.io and pasting your token there

# query couch
curl -H "Authorization: bearer $token" https://cmsweb-test3.cern.ch/couch

# create DB
curl -k -X PUT -H "Authorization: bearer $token" https://cmsweb-test3.cern.ch/couch/testdb

# create new document
curl -k -X PUT -H "Authorization: bearer $token" https://cmsweb-test3.cern.ch/couch/testdb/"001" -d '{"foo":1}'

# list all docs
curl -k -H "Authorization: bearer $token" https://cmsweb-test3.cern.ch/couch/testdb/_all_docs

# create replication document; you'll need to replace xxxx with your actual token
cat > replication.json << EOF
{ "_id": "replication_job",
    "source": {
       "url": "https://cmsweb-test3.cern.ch/couch/testdb",
       "headers": { "Authorization": "Bearer xxxx" }
    },
    "target": {
       "url": "https://cmsweb-test6.cern.ch/couch/testdb",
       "headers": { "Authorization": "Bearer xxxx" }
    },
    "create_target": true,
    "continuous": false
}
EOF

# setup replication
curl -k -X POST -H "Authorization: bearer $token" -H "Content-Type: application/json" -d@./replication.json https://cmsweb-test3.cern.ch/couch/_replicate

# compare results of two CouchDBs
# first, query cmsweb-test3 and obtain all documents from testdb
curl -k -H "Authorization: bearer $token" https://cmsweb-test3.cern.ch/couch/testdb/_all_docs
{"total_rows":1,"offset":0,"rows":[
{"id":"001","key":"001","value":{"rev":"1-4a7e4ae49c4366eaed8edeaea8f784ad"}}
]}

# now, query test6 and obtain all documents from testdb
curl -k -H "Authorization: bearer $token" https://cmsweb-test6.cern.ch/couch/testdb/_all_docs
{"total_rows":1,"offset":0,"rows":[
{"id":"001","key":"001","value":{"rev":"1-4a7e4ae49c4366eaed8edeaea8f784ad"}}
]}

As you can see, creation of databases, data injection and replication work like a charm with token-based authentication.
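The replication document and the final comparison can also be scripted, which avoids editing the token into the JSON by hand. This is a minimal sketch: the helper names `replication_doc`, `revisions` and `in_sync` are hypothetical, the document layout simply mirrors the one posted to `_replicate` in the test sequence, and the sample response is the `_all_docs` output shown there.

```python
import json
from typing import Dict


def replication_doc(source_url: str, target_url: str, token: str) -> dict:
    """Build the replication body, filling in the bearer token."""
    auth = {"Authorization": f"Bearer {token}"}
    return {
        "_id": "replication_job",
        "source": {"url": source_url, "headers": dict(auth)},
        "target": {"url": target_url, "headers": dict(auth)},
        "create_target": True,
        "continuous": False,
    }


def revisions(all_docs: dict) -> Dict[str, str]:
    """Map document id -> revision from an _all_docs response."""
    return {row["id"]: row["value"]["rev"] for row in all_docs["rows"]}


def in_sync(source_all_docs: dict, target_all_docs: dict) -> bool:
    """True when both databases list the same documents at the same revisions."""
    return revisions(source_all_docs) == revisions(target_all_docs)


doc = replication_doc(
    "https://cmsweb-test3.cern.ch/couch/testdb",
    "https://cmsweb-test6.cern.ch/couch/testdb",
    "xxxx",  # placeholder, as in the original instructions
)
print(json.dumps(doc, indent=2))

# The _all_docs response shown for both clusters.
response = json.loads(
    '{"total_rows":1,"offset":0,"rows":['
    '{"id":"001","key":"001",'
    '"value":{"rev":"1-4a7e4ae49c4366eaed8edeaea8f784ad"}}]}'
)
print(in_sync(response, response))  # True
```

Comparing id-to-revision maps rather than raw response text keeps the check independent of row ordering and whitespace.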

The proposed solution has the following set of benefits in the long run:

vkuznet commented 2 years ago

@amaltaro, @todor-ivanov, @goughes, @klannon I strongly suggest you examine the proposal I outlined above in this ticket, as it solves all issues with CouchDB maintenance, including upgrades, authentication and maintenance on VMs/k8s, and moves us towards token-based authentication. I would advise NOT spending time on patching CouchDB itself and instead moving towards this solution. We can discuss separately how to perform the upgrade and setup. So far I have tested all components required for successful operation, and I don't see any obstacles to separating CouchDB and authentication. They can and should co-exist as separate stacks. This will let us avoid the very hard process of patching CouchDB with CMS custom authentication layers, completely eliminate CouchDB authentication maintenance in the long run, and use the stock version of CouchDB in the future.

amaltaro commented 2 years ago

Just to update this issue: we had a discussion today and decided to explore both options, the CMS-custom setup and vanilla CouchDB+token. To reduce the number of major changes being integrated into the system, we will continue with the CMS-custom setup first (specs/builds/jenkins are already available), and once we are comfortable using CouchDB 3.x, we will start addressing CouchDB as a service and token-based authentication.

amaltaro commented 2 years ago

I had some upgrade notes in my local editor, so I decided to share them in the following wiki page: https://github.com/dmwm/WMCore/wiki/CouchDB-Diagram#relevant-changes-in-couchdb-3x

They need to be carefully considered for this upgrade.