MolSSI / QCFractal

A distributed compute and database platform for quantum chemistry.
https://molssi.github.io/QCFractal/
BSD 3-Clause "New" or "Revised" License
144 stars 47 forks source link

Cannot restore database #697

Open lilyminium opened 2 years ago

lilyminium commented 2 years ago

Describe the bug

I would like to be able to move my databases around, but I can't restore them due to different roles/usernames on each computer.

I also can't restore template databases, due to the schema already existing.

To Reproduce

>>> server = FractalSnowflakeHandler()
>>> server._storage.psql.restore_database("database.psql")

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/rv/j6lbln6j0kvb5svxj8wflc400000gn/T/ipykernel_77308/1281549261.py in <module>
----> 1 server._storage.psql.restore_database("database.psql")

~/anaconda3/envs/snowflake/lib/python3.7/site-packages/qcfractal/postgres_harness.py in restore_database(self, filename)
    442         if ret["retcode"] != 0:
    443             self.logger(ret["stderr"])
--> 444             raise ValueError("\nFailed to restore the database.\n")
    445 
    446     def database_size(self) -> str:

ValueError: 
Failed to restore the database.

If I print out stderr:

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 664; 1247 16387 TYPE compressionenum lilyw7
pg_restore: error: could not execute query: ERROR:  type "compressionenum" already exists
Command was: CREATE TYPE public.compressionenum AS ENUM (
    'none',
    'gzip',
    'bzip2',
    'lzma'
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TYPE public.compressionenum OWNER TO lilyw7;

pg_restore: from TOC entry 676; 1247 16422 TYPE driverenum lilyw7
pg_restore: error: could not execute query: ERROR:  type "driverenum" already exists
Command was: CREATE TYPE public.driverenum AS ENUM (
    'energy',
    'gradient',
    'hessian',
    'properties'
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TYPE public.driverenum OWNER TO lilyw7;

pg_restore: from TOC entry 670; 1247 16406 TYPE managerstatusenum lilyw7
pg_restore: error: could not execute query: ERROR:  type "managerstatusenum" already exists
Command was: CREATE TYPE public.managerstatusenum AS ENUM (
    'active',
    'inactive'
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TYPE public.managerstatusenum OWNER TO lilyw7;

pg_restore: from TOC entry 673; 1247 16412 TYPE recordstatusenum lilyw7
pg_restore: error: could not execute query: ERROR:  type "recordstatusenum" already exists
Command was: CREATE TYPE public.recordstatusenum AS ENUM (
    'complete',
    'incomplete',
    'running',
    'error'
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TYPE public.recordstatusenum OWNER TO lilyw7;

pg_restore: from TOC entry 667; 1247 16396 TYPE taskstatusenum lilyw7
pg_restore: error: could not execute query: ERROR:  type "taskstatusenum" already exists
Command was: CREATE TYPE public.taskstatusenum AS ENUM (
    'running',
    'waiting',
    'error',
    'complete'
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TYPE public.taskstatusenum OWNER TO lilyw7;

pg_restore: from TOC entry 203; 1259 16446 TABLE access_log lilyw7
pg_restore: error: could not execute query: ERROR:  relation "access_log" already exists
Command was: CREATE TABLE public.access_log (
    id integer NOT NULL,
    access_date timestamp without time zone,
    access_method character varying NOT NULL,
    access_type character varying NOT NULL,
    extra_params character varying,
    ip_address character varying,
    user_agent character varying,
    city character varying,
    country character varying,
    country_code character varying,
    ip_lat character varying,
    ip_long character varying,
    postal_code character varying,
    subdivision character varying
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.access_log OWNER TO lilyw7;

pg_restore: from TOC entry 202; 1259 16444 SEQUENCE access_log_id_seq lilyw7
pg_restore: error: could not execute query: ERROR:  relation "access_log_id_seq" already exists
Command was: CREATE SEQUENCE public.access_log_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.access_log_id_seq OWNER TO lilyw7;

pg_restore: from TOC entry 241; 1259 16884 TABLE alembic_version lilyw7
pg_restore: error: could not execute query: ERROR:  relation "alembic_version" already exists
Command was: CREATE TABLE public.alembic_version (
    version_num character varying(32) NOT NULL
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.alembic_version OWNER TO lilyw7;

pg_restore: from TOC entry 226; 1259 16609 TABLE base_result lilyw7
pg_restore: error: could not execute query: ERROR:  relation "base_result" already exists
Command was: CREATE TABLE public.base_result (
    result_type character varying,
    id integer NOT NULL,
    manager_name character varying,
    hash_index character varying,
    procedure character varying(100) NOT NULL,
    version integer,
    protocols jsonb NOT NULL,
    extras bytea,
    stdout integer,
    stderr integer,
    error integer,
    status public.recordstatusenum NOT NULL,
    created_on timestamp without time zone,
    modified_on timestamp without time zone,
    provenance json
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.base_result OWNER TO lilyw7;

pg_restore: from TOC entry 225; 1259 16607 SEQUENCE base_result_id_seq lilyw7
pg_restore: error: could not execute query: ERROR:  relation "base_result_id_seq" already exists
Command was: CREATE SEQUENCE public.base_result_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.base_result_id_seq OWNER TO lilyw7;

pg_restore: from TOC entry 201; 1259 16433 TABLE collection lilyw7
pg_restore: error: could not execute query: ERROR:  relation "collection" already exists
Command was: CREATE TABLE public.collection (
    id integer NOT NULL,
    collection_type character varying,
    collection character varying(100) NOT NULL,
    lname character varying(100) NOT NULL,
    name character varying(100) NOT NULL,
    tags json,
    tagline character varying,
    description character varying,
    "group" character varying(100) NOT NULL,
    visibility boolean NOT NULL,
    view_url_hdf5 character varying,
    view_url_plaintext character varying,
    view_metadata json,
    view_available boolean NOT NULL,
    provenance json,
    extra json
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.collection OWNER TO lilyw7;

pg_restore: from TOC entry 200; 1259 16431 SEQUENCE collection_id_seq lilyw7
pg_restore: error: could not execute query: ERROR:  relation "collection_id_seq" already exists
Command was: CREATE SEQUENCE public.collection_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.collection_id_seq OWNER TO lilyw7;

pg_restore: from TOC entry 220; 1259 16554 TABLE contributed_values lilyw7
pg_restore: error: could not execute query: ERROR:  relation "contributed_values" already exists
Command was: CREATE TABLE public.contributed_values (
    collection_id integer NOT NULL,
    name character varying NOT NULL,
    "values" bytea NOT NULL,
    index bytea NOT NULL,
    values_structure json NOT NULL,
    theory_level json NOT NULL,
    units character varying NOT NULL,
    theory_level_details json,
    citations json,
    external_url character varying,
    doi character varying,
    comments character varying
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.contributed_values OWNER TO lilyw7;

pg_restore: from TOC entry 221; 1259 16567 TABLE dataset lilyw7
pg_restore: error: could not execute query: ERROR:  relation "dataset" already exists
Command was: CREATE TABLE public.dataset (
    default_benchmark character varying,
    default_keywords json,
    default_driver character varying,
    default_units character varying,
    alias_keywords json,
    default_program character varying,
    history_keys json,
    history json,
    id integer NOT NULL
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.dataset OWNER TO lilyw7;

pg_restore: from TOC entry 227; 1259 16644 TABLE dataset_entry lilyw7
pg_restore: error: could not execute query: ERROR:  relation "dataset_entry" already exists
Command was: CREATE TABLE public.dataset_entry (
    dataset_id integer NOT NULL,
    molecule_id integer NOT NULL,
    name character varying NOT NULL,
    comment character varying,
    local_results json
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.dataset_entry OWNER TO lilyw7;

pg_restore: from TOC entry 238; 1259 16833 TABLE grid_optimization_association lilyw7
pg_restore: error: could not execute query: ERROR:  relation "grid_optimization_association" already exists
Command was: CREATE TABLE public.grid_optimization_association (
    grid_opt_id integer NOT NULL,
    key character varying NOT NULL,
    opt_id integer
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.grid_optimization_association OWNER TO lilyw7;

pg_restore: from TOC entry 235; 1259 16780 TABLE grid_optimization_procedure lilyw7
pg_restore: error: could not execute query: ERROR:  relation "grid_optimization_procedure" already exists
Command was: CREATE TABLE public.grid_optimization_procedure (
    program character varying(100) NOT NULL,
    keywords json,
    qc_spec json,
    id integer NOT NULL,
    initial_molecule integer NOT NULL,
    optimization_spec json,
    starting_molecule integer,
    final_energy_dict json,
    starting_grid json
);

pg_restore: error: could not execute query: ERROR:  role "lilyw7" does not exist
Command was: ALTER TABLE public.grid_optimization_procedure OWNER TO lilyw7;

pg_restore: from TOC entry 213; 1259 16505 TABLE keywords lilyw7
pg_restore: error: could not execute query: ERROR:  relation "keywords" already exists
Command was: CREATE TABLE public.keywords (
    id integer NOT NULL,
    hash_index character varying NOT NULL,
    "values" json,
    lowercase boolean,
    exact_floats boolean,
    comments character varying
);

and so on (truncated).

Expected behavior

I can restore the database.

Additional context

If I add the following options to pg_restore, it works. If the privileges/ownership is a problem, perhaps they could be contingent on keyword arguments that the user can pass in?

https://github.com/MolSSI/QCFractal/blob/867b1bc9c87ae4be37d9656d95a7bd308b7b79a3/qcfractal/postgres_harness.py#L416-L437

    def restore_database(self, filename) -> None:

        # Reasonable check here
        self._check_psql()

        self.create_database(self.config.database.database_name)

        # fmt: off
        cmds = [
            shutil.which("pg_restore"),
            f"--port={self.config.database.port}",
            f"--dbname={self.config.database.database_name}",
+            "--clean",  # drop database objects before recreating
+            "--no-privileges",  # don't restore access privileges
+            "--no-owner",  # don't restore ownership
            filename
        ]
        # fmt: on

        self.logger(f"pg_backup command: {'  '.join(cmds)}")
        ret = self._run(cmds)

        if ret["retcode"] != 0:
            self.logger(ret["stderr"])
            raise ValueError("\nFailed to restore the database.\n")

This makes the full command

['/Users/lily/anaconda3/envs/snowflake/bin/pg_restore', '--port=52198', '--dbname=qcfractal_default', '--clean', '--no-privileges', '--no-owner', 'database.psql', '-h', '127.0.0.1']