Closed gaurav closed 1 month ago
It might be possible to import an RDB file into an existing Redis cluster by running:
rdb --command protocol /data/dump.rdb | redis-cli --pipe -h 10.0.1.90 -p 6379
(h/t https://www.velotio.com/engineering-blog/installing-redis-cluster-on-mesosphere-dcos via @YaphetKG)
We should check to see if this is significantly faster than the JSON loader we currently use.
This has been substantially addressed in PR https://github.com/helxplatform/translator-devops/pull/768. I'll close this once we successfully deploy this to ITRB and know exactly how long that takes.
So this does work pretty well: on RENCI the load takes about an hour, plus download time (15 mins or so). On ITRB, however, the Redis databases are in a cluster, and redis-cli --pipe doesn't support cluster mode (https://github.com/redis/redis/issues/6098 and https://github.com/redis/redis/issues/6294). We also can't read the list of clusters when creating the nn-loader job. So we have to do the hacky thing of having a single pod assigned to a particular database (e.g. db1, which has three shards), and then looping over it three times to load each shard individually.
We're experimenting with single-shard single-node Redis databases, which would eliminate this issue. But if we can't pull this off, I'll track improving this over on translator-devops (https://github.com/helxplatform/translator-devops/issues/813). We have reduced downtime during ITRB Prod load (and have discussed the other options with NCATS), so I'll go ahead and close this.
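For anyone picking this up later, the per-shard loop described above could look roughly like the sketch below. This is not the actual nn-loader job from translator-devops. The shard hostnames, the SHARD_HOSTS, RDB_FILE and REDIS_PORT variables, and the DRY_RUN switch are all made up for illustration. It also assumes each shard can be reached directly by hostname, and that pointing the same pipe command from the top of this issue at each shard in turn is enough.

```shell
#!/usr/bin/env bash
# Sketch only: run the RDB -> redis-cli --pipe load once per shard.
# All variable names and hostnames below are hypothetical placeholders.
set -euo pipefail

RDB_FILE="${RDB_FILE:-/data/dump.rdb}"
REDIS_PORT="${REDIS_PORT:-6379}"
# Space-separated list of shard hostnames (assumed to be known up front).
SHARD_HOSTS="${SHARD_HOSTS:-db1-shard-0 db1-shard-1 db1-shard-2}"
# DRY_RUN=1 (the default here) only prints the command for each shard.
DRY_RUN="${DRY_RUN:-1}"

load_shard() {
  local host="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "rdb --command protocol $RDB_FILE | redis-cli --pipe -h $host -p $REDIS_PORT"
  else
    # Same command as in the issue description, aimed at one shard.
    rdb --command protocol "$RDB_FILE" | redis-cli --pipe -h "$host" -p "$REDIS_PORT"
  fi
}

# One pass per shard, sequentially, from a single pod.
for host in $SHARD_HOSTS; do
  load_shard "$host"
done
```

Running it with DRY_RUN=0 would perform the loads for real. The loop is sequential on purpose, to match the single-pod setup described above. With a single-shard single-node database the loop collapses to one pass.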
NodeNorm ITRB Prod currently takes around eight hours to load, which results in significant downtime to UI Prod.
Probably the simplest way to do this would be to have multiple ITRB Prod instances, one of which can be loaded while the other is being used.