broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

Migrating and uploading #1440

Closed graham1034 closed 3 years ago

graham1034 commented 4 years ago

Hello,

We are trying to migrate a seqr database from one server to another. I have saved the database as recommended with

$ pg_dump -U postgres seqrdb | gzip -c - > backup.gz

and installed seqr on a different machine as described in LOCAL_INSTALL.md. (So, on the new server, seqr runs within Docker, which I am not very familiar with. The older server had an old-school command-line installation.)

I've ftp'ed backup.gz across, but how do I load it into the new instance? The commands in seqr/deploy/MIGRATE.md (python -m manage makemigrations ... etc) don't seem to have been updated for the new containerized environment.

I have tried starting a shell within docker but I can't see backup.gz.

Also, for uploading new samples, the example you give in https://github.com/macarthur-lab/seqr/blob/master/deploy/LOCAL_INSTALL.md#annotating-and-loading-vcf-callsets---option-2-annotate-and-load-on-prem shows annotation and upload to Elasticsearch in a single command, python3 -m seqr_loading SeqrMTToESTask ... However, we will want to annotate on a different machine and just upload the annotated VCFs to the seqr server. What would be the commands for separate annotation and upload?

Any help very gratefully received

Best regards,

Graham


Dr Graham R Smith

Experimental Scientific Officer, Bioinformatics Support Unit, Faculty of Medical Sciences, Framlington Place, Newcastle University, Newcastle NE2 4HH, U.K.

hanars commented 4 years ago

Hi Graham,

The migration instructions assume you are redeploying seqr in place and that the database doesn't get reset entirely (the reason for running the backup is just in case anything goes wrong :/). Since you are moving from non-Docker to Docker, you will need to restore the backup to the new db before running the manage migrate commands:

psql -U postgres postgres -c "drop database seqrdb"
psql -U postgres postgres -c "create database seqrdb"
psql -U postgres seqrdb < <(gunzip -c backup.gz)
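
On the earlier point about backup.gz not being visible from a shell inside Docker: one possible way around it is to leave the file on the host and pipe it into psql running in the container. The following is only a sketch and does not come from the seqr docs. It assumes the database service in docker-compose.yml is named postgres and that docker-compose is on the PATH; psql_in_container and restore_seqrdb are hypothetical helper names. It also adds "if exists" to the drop statement so that a fresh install without seqrdb does not fail at that step.

```shell
# Run psql inside the database container.
# ASSUMPTION: the docker-compose service is named "postgres";
# check docker-compose.yml and adjust if it differs.
# -T disables pseudo-TTY allocation so data can be piped in from the host.
psql_in_container() {
    docker-compose exec -T postgres psql -U postgres "$@"
}

# Recreate seqrdb and load a gzipped pg_dump that lives on the host.
# restore_seqrdb is a hypothetical helper name, not part of seqr.
restore_seqrdb() {
    local backup="$1"
    # Stop early if the archive is missing or corrupt.
    gzip -t "$backup" || return 1
    psql_in_container postgres -c "drop database if exists seqrdb" || return 1
    psql_in_container postgres -c "create database seqrdb" || return 1
    # Decompress on the host and stream the SQL to psql in the container.
    gunzip -c "$backup" | psql_in_container seqrdb
}
```

Run from the directory containing docker-compose.yml, the call would be restore_seqrdb backup.gz, followed by the manage migrate commands.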

For the pipeline, I don't think we support annotating and uploading separately anymore. I'm going to loop in @bw2, as he worked on the dockerized pipeline and may have some ideas.

graham1034 commented 4 years ago

Hi Hanars,

Thanks for your helpful reply. I haven't got round to testing what you suggest yet, as I've had to prioritize other things, but I wanted to acknowledge your suggestions. If annotating and uploading must be done together, that suggests that the new server (luckily a VM!) will need upgrading before we can proceed so we have enough CPU and disk space.