Closed dianabarsan closed 1 year ago
I've added the code for this temporarily in a repo: https://github.com/medic/couchdb-migration I'm keeping the bulk of it in a pull request to ease code review.
This is ready for AT. The code is available in the repo linked above.
We have put together documentation for users to follow to achieve this migration. This can be found in this PR: https://github.com/medic/cht-docs/pull/866 Since we want both the documentation and the software to be correct and easy to use, please follow the steps in the documentation to AT this migration container.
There are a couple of test cases that should be covered:
Additionally, it would be helpful if we could assess the quality of the instructions for users that might be hosting on AWS without using medic-os (if there are such cases), who might have CouchDb data saved in some type of AWS storage volumes.
Ideally, we will improve the documentation to such quality that migrating is easy. Feedback is very welcome!
Thanks!
Hi @dianabarsan, this is still a work in progress, but I would like to write my findings when testing this. Thank you so much for this work. It is going to be super valuable.
For the scenario of "migrating from medic-os to single node 4.x", I was successful only with an online user using the Chrome web app, after changing the port to 443. With the two offline users that I had connected from a phone to the previous 3.x instance, I couldn't sync. This is because of the URL used to connect to the instance: the port changes from 8443 on the 3.x medic-os instance to 443 on the 4.x instance. I tried changing the port to 8443 in the 4.x cht-core.yml but could not make this work. It would be helpful to be prepared for this in advance.
I couldn't complete the scenario of "migrating from medic-os to multi-node 4.x" successfully. I will keep working on this.
Other documentation suggestions we can discuss:
Env variables COUCH_URL and CHT_NETWORK must be set in two places: before "2. Prepare CHT-Core 3.x installation for upgrading" and again before "5. Launch 4.x CouchDB installation". Can we add this information to the documentation? Or explain where it is better to use a clean set of environment variables and not mix the two. I know @ngaruko is also working on this. He may have more suggestions. cc: @andrablaj
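One way to make the "set in two places" requirement harder to miss is to check both variables right before each of those steps. This is only a minimal sketch: `require_env` is a hypothetical helper, not part of couchdb-migration, and the values in the comments are examples taken from this thread that you would replace with your own.

```sh
# Hypothetical helper (not part of couchdb-migration): fail early when a
# required environment variable is missing or empty.
require_env() {
  for name in "$@"; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required environment variable: $name" >&2
      return 1
    fi
  done
}

# Example usage before each step that needs the variables.
# The values below are examples only; use the ones for your installation.
# export COUCH_URL=http://medic:password@couchdb-cluster-couchdb.1-1:5984
# export CHT_NETWORK=cht-net
# require_env COUCH_URL CHT_NETWORK || exit 1
```

Running the same check in every shell session used during the migration keeps a half-configured environment from reaching the migration tool.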
Thanks a lot for the feedback @lorerod.
It is not only a "data migration" set of instructions.
It's supposed to only cover data migration, though. The data migration does indeed require that no further changes are made in the data. What title would you suggest?
It would be interesting to have some rollback instructions in case the happy path doesn't work. To go back to your 3.x instance. It would give me more confidence. Do you think this could work?
I think the backup of data that we instruct to save should be enough of a "rollback". Do you think that would suffice?
Env variables COUCH_URL and CHT_NETWORK must be set in two places
Are you referring to the environment variables that you need to pass to CouchDb and to the migration tool?
@dianabarsan
If CHT_NETWORK is not set, the user gets a generic error: "network cht-net declared as external, but could not be found". So for this one, besides mentioning that it is required (it seems it is), we could also suggest ways to find it: docker network ls or otherwise.
Error while getting membership
Error when getting config FetchError: request to http://medic:password@localhost:5984/_membership failed, reason: connect ECONNREFUSED 127.0.0.1:5984
at ClientRequest.<anonymous> (/app/node_modules/node-fetch/lib/index.js:1461:11)
at ClientRequest.emit (node:events:513:28)
at Socket.socketErrorListener (node:_http_client:494:9)
at Socket.emit (node:events:513:28)
at emitErrorNT (node:internal/streams/destroy:157:8)
at emitErrorCloseNT (node:internal/streams/destroy:122:3)
at processTicksAndRejections (node:internal/process/task_queues:83:21) {
type: 'system',
errno: 'ECONNREFUSED',
code: 'ECONNREFUSED'
}
An unexpected error occurred Error: Error when getting config
at Object.getConfig (/app/src/utils.js:181:11)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async getEnv (/app/src/get-env.js:5:18)
at async /app/bin/get-env.js:7:5
Thanks for the feedback @ngaruko
For both 1 and 2: localhost is scoped to the docker container, not the host machine.
For 1, your suggestion of checking docker network ls is good. How do you suggest we instruct further if there are multiple networks?
For 2, do you think changing the example of the URL that needs to be passed would help?
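To illustrate the localhost point, here is a rough sketch of a check that could run before calling the migration tool. Both functions are hypothetical helpers written for this comment, not something the tool provides, and the parsing is deliberately simple (it does not handle IPv6 addresses or passwords that contain "/" or "@").

```sh
# Hypothetical helper: print the host part of a URL such as
# http://user:password@host:5984/path
couch_url_host() {
  url=$1
  rest=${url#*://}    # drop the scheme
  rest=${rest%%/*}    # drop any path
  rest=${rest##*@}    # drop user:password@
  printf '%s\n' "${rest%%:*}"   # drop the port
}

# Succeeds when the URL points at the loopback interface, which inside a
# container means the container itself and not the host machine.
is_loopback_url() {
  case "$(couch_url_host "$1")" in
    localhost|127.0.0.1) return 0 ;;
    *) return 1 ;;
  esac
}

# Warn when COUCH_URL would resolve inside the migration container.
if is_loopback_url "${COUCH_URL:-}"; then
  echo "COUCH_URL points to localhost; use the CouchDb container name instead" >&2
fi
```

With the URL from the error above (http://medic:password@localhost:5984) the warning is printed, while a container hostname on the docker network passes the check.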
@dianabarsan
it's supposed to only cover data migration, though. The data migration does indeed require that no further changes are made in the data. What title would you suggest?
I will suggest maybe something like this:
Main title: Migration from CHT 3.x to CHT 4.x
Subtitle: Guide to migrate existing data from CHT 3.x to CHT 4.x
Some observations before point 1: "This guide will present the required steps while using a migration helping tool, called couchdb-migration" and add "by the end of this guide you will end up with your CHT-Core 3.x instance down and CHT-Core 4.x ready to be used."
I think the backup of data that we instruct to save should be enough of a "rollback". Do you think that would suffice?
I think we can add some basic instructions on how to get your CHT 3.x up again with the backup data, or link to related documentation if there is any. Maybe in the same "Some observations" part before point 1 we can add: "If you encounter any problems executing the instructions of this guide, you can always get your CHT 3.x instance up again with the backup data. See link for further instructions." Or put the instructions in the same guide at the end. Please let me know if you think this is too much.
Are you referring to the environment variables that you need to pass to CouchDb and to the migration tool?
I'm referring to the environment variables that I need to pass to the migration tool.
Thanks a lot for the feedback @lorerod. I'll include it in the docs PR.
@dianabarsan I saw that we have a new version for couchdb-migration. Can we continue testing this? cc: @ngaruko
@lorerod correct, there is a new version that fixes some interactions between the migration software and self-hosted medic-os. If you continue testing, please use the new version. @mrjones-plip volunteered to help with testing as well.
Environment: macOS 13.1 (22C65); Docker Desktop 4.15.0 (93002); Docker engine: 20.10.21
CHT 3.17: Local using docker helper script from master branch
config: standard
data: upload data from scalability test
CHT 4.1.0: Local using docker compose files cht-core.yml and cht-couchdb.yml
couchdb-migration branch: main
cht-docs branch: 828-4.x-upgrade
Phones:
@dianabarsan I was able to migrate successfully, but this is still a work in progress. I will continue with the migration of 3.17 medic-os to 4.1 clustered. This scenario is ok. I left some comments highlighted with an ⚠️ icon. Please let me know what you think.
Environment: Same as in previous comment with the difference in the couchdb compose file: cht-couchdb-clustered.yml
@dianabarsan I wasn't able to migrate successfully. Am I missing something? @ngaruko Did you try this also? Can you share your experience?
Hi @lorerod
Thanks a lot for the extensive detail that you've provided! This is of great value!
I tried using
(in my case couchdb.1) in COUCH_URL as in the doc, but checking couchdb is up. I got:
It seems you have unlocked yourself, what URL did you end up using?
Started CHT-Core 4.1 using curl -s -o ./docker-compose.yml https://staging.dev.medicmobile.org/_couch/builds_4/medic:medic:4.1.0/docker-compose/cht-core.yml and COUCHDB_SERVERS=couchdb.1,couchdb.2,couchdb.3 docker-compose up and I got these errors in the log:
Can you please try:
- stopping the previous CouchDb build that you used for the migration
- starting CHT 4.1 using both docker-compose files
My guess is that there is some docker network mismatch and haproxy can't reach the pre-existing CouchDb. Please make sure that you're using the same environment variables when you start 4.1 CouchDb along with the other services.
It seems you have unlocked yourself, what URL did you end up using?
@dianabarsan I ended up using COUCH_URL=http://medic:password@couchdb-cluster-couchdb.1-1:5984
Can you please try:
- stopping the previous CouchDb build that you used for the migration
- starting CHT 4.1 using both docker-compose files
@dianabarsan I stopped the previous CouchDb build and started CHT 4.1 using:
COUCHDB_SERVERS=couchdb.1,couchdb.2,couchdb.3 docker-compose -f cht-couchdb-clustered.yml -f cht-core.yml up -d
The result is not successful. I need two pairs of fresh eyes for this. Running out of ideas :)
cc: @andrablaj
Hi @lorerod
I'll try to replicate this locally.
Hi @lorerod I've replicated the issue locally, I'll be back soon with a fix.
Hi @lorerod
I've made an update to the documentation to add an additional step to the migration for clustered.
It involves running an additional command after the move-shards command.
For some reason, my haproxy process had trouble starting up every time, but it restarted itself automatically after 1 minute.
rsyslog startup failure, child did not respond within startup timeout (60 seconds)
This doesn't have anything to do with the migration, but it can be a nuisance.
Could you please try to integrate the node deletion step and wait for 1 minute after starting the 4.1 instance to check if it works?
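Since haproxy may need about a minute to come back, a small retry loop can replace a fixed wait. This is a sketch only: `retry_until` is a hypothetical helper, and the curl call in the comment assumes that COUCH_URL is reachable from the shell where you run it and uses CouchDB's /_up endpoint as the health check.

```sh
# Hypothetical helper: run a command until it succeeds, up to a maximum
# number of attempts, sleeping between attempts.
# Usage: retry_until <attempts> <delay-seconds> <command> [args...]
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep when another attempt will follow.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (assumption: COUCH_URL is reachable from this shell):
# try 12 times, 10 seconds apart, before giving up.
# retry_until 12 10 curl -fsS -o /dev/null "$COUCH_URL/_up"
```

The attempt count and delay are arbitrary; anything that covers the 60 second rsyslog timeout mentioned above should do.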
Thanks a lot!
Hi @dianabarsan It worked! I will leave the details tomorrow but I wanted to let you know.
Environment: Same as in previous comment with the difference in the couchdb compose file: cht-couchdb-clustered.yml
@dianabarsan I was able to migrate successfully!
I have merged the documentation PR. Closing this as completed.
Create a container that contains all the necessary scripts, exposed as commands, that edit CouchDB node and database metadata, to facilitate data migration from a 3.x instance to a 4.x instance.
The scripts should cover: