timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
16.89k stars 853 forks source link

Backup/Restore schema only #1916

Closed AxelTheGerman closed 1 year ago

AxelTheGerman commented 4 years ago

Hi, I'm new to TimescaleDB and I'm trying to use it in my Rails application. So far almost everything works great (since it's just built on PG) except for running my test suite.

Rails uses pg_dump to dump the schema of your development DB and loads it into your test DB. The important thing is that we only care about the schema and the table schema_migrations. All other data doesn't matter.

Rails SQL dump: something like pg_dump --schema-only --no-privileges --no-owner --file db/structure.sql my_app_development Rails SQL load: psql -v ON_ERROR_STOP=1 -q -X -f db/structure.sql my_app_test

However this only exports my application's schema and not the timescale related schema. See more details here https://stackoverflow.com/questions/61874638/unable-to-use-timescaledb-in-rails-test-environment

Question

https://docs.timescale.com/latest/using-timescaledb/backup shows how to backup/restore via --format=custom - Is there any way to backup and restore the schema for timescaleDB in the more friendly SQL format?

That way I can make it work with Rails. Even wrapping it in SELECT timescaledb_pre_restore(); and SELECT timescaledb_post_restore(); is doable.

Thanks!

k-rus commented 4 years ago

Thank you for the question and request. TimescaleDB maintains internal catalogs, which should be in sync with the schema. Thus dumping just schema is not enough. TimescaleDB doesn't provide a tool to dump the schema state without data. So this issue seems to be a feature request. The current work around is to execute the same SQL DDL statements, which were used to create hypertables in the main database.

AxelTheGerman commented 4 years ago

Thank you for the quick reply! It seems like TimescaleDB is not a popular choice for Ruby on Rails applications judging by the lack of gems as well as this issue with how Rails handles database schemas and migrations. So I understand if this is not a huge priority for the community.

However, if someone with more knowledge could layout some information on the feasibility of this that'd be greatly appreciated.

The main constraints are that we would have to work with a SQL file and not a custom dump format. Even if there was just a way to dump the DDL statements to recreate the hypertables (not sure about other more advanced features of TimescaleDB) into that SQL - even via a function similar to SELECT timescaledb_pre_restore(); that might be good enough.

This is what currently already gets dumped from my applications schema:

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

CREATE EXTENSION IF NOT EXISTS timescaledb WITH SCHEMA public;

SET default_tablespace = '';

SET default_table_access_method = heap;

CREATE TABLE public.events (
    metric_id bigint,
    "time" timestamp without time zone NOT NULL,
    value numeric
);

CREATE INDEX events_time_idx ON public.events USING btree ("time" DESC);

CREATE TRIGGER ts_insert_blocker BEFORE INSERT ON public.events FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker();

-- 
-- More schema stuff - foreign keys, indexes etc
--

--
-- PostgreSQL database dump complete
--

SET search_path TO "$user", public;

INSERT INTO "schema_migrations" (version) VALUES
('20200515043708'),
('20200515043902'),
('20200515053402'),
('20200515040422');

Note how Rails automatically adds the values for performed schema migrations into that table. If there was a way to describe TimescaleDB schema with SQL - even by inserting data - that'd be great!

bilby91 commented 2 years ago

We are also very interested in this feature.

As a workaround, we have patched ActiveRecord so that it dumps custom SQL statements (create_hypertable calls) to an independent structure.sql file that we then load when we need to restore the schema.

Is there any place I can dig in the timescale src in order to understand more about this?

Thanks in advance!

jonatas commented 1 year ago

I'm the creator of the timescaledb gem. We have now a hook for the schema dumper to accomplish with this integration. If you have any questions or feedback, feel free to open an issue in the gem as well.

jonatas commented 1 year ago

I closed the issue thinking that it resolves the issue for the Ruby ecosystem, feel free to reopen if you're thinking on a more language/framework agnostic solution.

stalkerg commented 4 weeks ago

@jonatas can you reopen this ticket? We need a CLI tool (native, C/Rust...) or for Python ecosystem.

jonatas commented 4 weeks ago

Hey @stalkerg, my comment offering expired 😄 Just kidding, I can reopen but this is a quite old issue. I'd encourage you to lead a new discussion and create a new one. Make reference of the actual one but go straight for your ideas, your context and why that would be great as part of the core extension. The question is, do we really need it as part of the core extension?

I also see it more as a tooling that depends on the ecosystem that you're using and would recommend you to take a look on the actual integration we already have with python and maybe help us there to make the python library great as the ruby covering all the ground at this point.

@jamessewell is doing the python work with django here not sure if it's on your stack, but also a possible check.

stalkerg commented 4 weeks ago

I'd encourage you to lead a new discussion and create a new one. Make reference of the actual one but go straight for your ideas, your context and why that would be great as part of the core extension.

thank, I will do.

The question is, do we really need it as part of the core extension?

yes

I also see it more as a tooling that depends on the ecosystem that you're using and would recommend you to take a look on the actual integration we already have with python and maybe help us there to make the python library great as the ruby covering all the ground at this point.

it's can be good for developers but not for DevOps. Also we are using aiohttp/asyncpg (or sanic or tornado) directly, maybe with Alembic migrations. Honestly framework tooling not really work for Python because Django have no same dominate position as RoR in Ruby (and it's true for other languages)

jonatas commented 4 weeks ago

Thanks for the details! Looking forward to check the new issue 🚀