timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/
Other
17.61k stars 883 forks source link

Possible bug: `ERROR: tried calling catalog_get when extension isn't loaded` even though the extension is loaded #1682

Open shirshendubhowmick opened 4 years ago

shirshendubhowmick commented 4 years ago

Relevant system information:

Describe the bug Suddenly started getting error ERROR: tried calling catalog_get when extension isn't loaded while creating hypertable. There was no version upgrades made to the server. NOTE: this issue is only happening on a particular DB, another DB on the same server is working fine

To Reproduce Omiting this as this occured suddenly, not sure how to reproduce

Postgres logs Here is some postgres logs that might be helpful

2020-02-08 16:25:52.546 UTC [21575] LOG:  received fast shutdown request
2020-02-08 16:25:52.548 UTC [21575] LOG:  aborting any active transactions
2020-02-08 16:25:52.548 UTC [21586] LOG:  terminating TimescaleDB job scheduler due to administrator command
2020-02-08 16:25:52.548 UTC [21586] FATAL:  terminating connection due to administrator command
2020-02-08 16:25:52.554 UTC [21582] LOG:  terminating TimescaleDB background worker launcher due to administrator command
2020-02-08 16:25:52.554 UTC [21582] FATAL:  terminating connection due to administrator command
2020-02-08 16:25:52.556 UTC [21575] LOG:  background worker "logical replication launcher" (PID 21583) exited with exit code 1
2020-02-08 16:25:52.556 UTC [21575] LOG:  background worker "TimescaleDB Background Worker Scheduler" (PID 21586) exited with exit code 1
2020-02-08 16:25:52.556 UTC [21575] LOG:  background worker "TimescaleDB Background Worker Launcher" (PID 21582) exited with exit code 1
2020-02-08 16:25:52.556 UTC [21577] LOG:  shutting down
2020-02-08 16:25:52.578 UTC [21575] LOG:  database system is shut down
2020-02-08 16:25:52.742 UTC [25149] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2020-02-08 16:25:52.744 UTC [25149] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-02-08 16:25:52.783 UTC [25150] LOG:  database system was shut down at 2020-02-08 16:25:52 UTC
2020-02-08 16:25:52.792 UTC [25149] LOG:  database system is ready to accept connections
2020-02-08 16:25:52.794 UTC [25156] LOG:  TimescaleDB background worker launcher connected to shared catalogs
2020-02-08 16:25:52.810 UTC [25159] ERROR:  tried calling catalog_get when extension isn't loaded
2020-02-08 16:25:52.813 UTC [25149] LOG:  background worker "TimescaleDB Background Worker Scheduler" (PID 25159) exited with exit code 1
2020-02-08 16:25:53.329 UTC [25161] [unknown]@[unknown] LOG:  incomplete startup packet
2020-02-08 16:26:01.394 UTC [25197] postgres@smallcase_rum_production ERROR:  tried calling catalog_get when extension isn't loaded
2020-02-08 16:26:01.394 UTC [25197] postgres@smallcase_rum_production STATEMENT:  select create_hypertable('test', 'last_updated');

Running a \dx on the problematic DB gives:

                                      List of installed extensions
    Name     | Version |   Schema   |                            Description
-------------+---------+------------+-------------------------------------------------------------------
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
 timescaledb | 1.5.1   | public     | Enables scalable inserts and complex queries for time-series data
 uuid-ossp   | 1.1     | public     | generate universally unique identifiers (UUIDs)
(3 rows)

Please let me know whatever additional logs / information required.

Thanks

bboule commented 4 years ago

Hey @shirshendubhowmick thanks for the report, you indicate that you have this running without issue on another database within the same instance of PG and it is running without issue? Just to confirm when you connected to the "new" database you ran CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE; while connected to the new database? I just wanted to confirm this before we marched ahead with additional troubleshooting!!

shirshendubhowmick commented 4 years ago

@bboule Yes another database on the same instance is working perfectly fine.

I tried running CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE; in both the problematic DB and the working DB.

Both the DBs produced same output:

NOTICE:  extension "timescaledb" already exists, skipping
CREATE EXTENSION
shirshendubhowmick commented 4 years ago

Update: upgraded version to 1.6.0 the problem still persists.

shirshendubhowmick commented 4 years ago

Update: Tried a workaround to solve the issue, and it worked. Created a backup of the problematic DB using pg_dump dropped the database. Ran a vacuum. Created a new database, restored data using pg_restore from the backup and it worked.

cevian commented 4 years ago

We've run into this error message before with having timescaledb.restoring on you can double check this with show timescaledb.restoring; which should return off. Otherwise you did not complete backup/restore correctly. We will work on improving the error message.

mkindahl commented 4 years ago

This looks like it is related to #1844, which is caused when timescaledb.restoring is set for the database in question. Try ALTER DATABASE whatever SET timescaledb.restoring = off for the offending database before running pg_dump (or pg_restore).

svenklemm commented 3 years ago

Closing this since there was no activity. Feel free to reopen if the issue is still relevant.

fgielow commented 3 years ago

We've run into this error message before with having timescaledb.restoring on you can double check this with show timescaledb.restoring; which should return off. Otherwise you did not complete backup/restore correctly. We will work on improving the error message.

I've attempted a database migration twice following the steps:

And while no errors have occurred there, the restoring flag is still on. I cannot change that flag since my new DB is on Azure and I do not have permission to do so.

Is there anything I can do to solve this? Note that while the flag is on, I can ingest new data without issues at least on first glance. But my logs report:

2021-05-16 16:49:18 UTC-60a14d0d.10c-ERROR: tried calling catalog_get when extension isn't loaded 2021-05-16 16:49:18 UTC-60a14d0a.24-LOG: background worker "TimescaleDB Background Worker Scheduler" (PID 268) exited with exit code 1 2021-05-16 16:49:18 UTC-60a14d0e.114-ERROR: tried calling catalog_get when extension isn't loaded 2021-05-16 16:49:18 UTC-60a14d0a.24-LOG: background worker "TimescaleDB Background Worker Scheduler" (PID 276) exited with exit code 1 2021-05-16 16:49:18 UTC-60a14d0e.148-ERROR: tried calling catalog_get when extension isn't loaded 2021-05-16 16:49:18 UTC-60a14d0a.24-LOG: background worker "TimescaleDB Background Worker Scheduler" (PID 328) exited with exit code 1

akashagarwal7 commented 3 years ago

It seems like timescaledb_post_restore() is actually setting the value for timescaledb.restoring to 'on':

mydb=# set timescaledb.restoring to 'off';
SET
mydb=# 
mydb=# 
mydb=# show timescaledb.restoring;
 timescaledb.restoring 
-----------------------
 off
(1 row)

mydb=# 
mydb=# 
mydb=# select timescaledb_post_restore();
 timescaledb_post_restore 
--------------------------
 t
(1 row)

mydb=# 
mydb=# 
mydb=# show timescaledb.restoring;
 timescaledb.restoring 
-----------------------
 on
(1 row)

We're using Timescale 2.3.0 with PG 11, Docker.

Edit: I think this is happening under some certain circumstances. I just retried the above, and timescaledb_post_restore() is correctly setting the value for timescaledb.restoring to 'off'.

miohtama commented 2 years ago

I run into this error message while restoring a backup.

2022-03-31 17:58:15.109 CEST [75720] ERROR:  tried calling catalog_get when extension isn't loaded

However if you dig deeper, there is a related entry in TimescaleDB logs, but it was some lines before the final error message:

2022-03-31 17:56:03.187 CEST [63102] PANIC:  could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device

Just as a hint for anyone who stumbles upon this on the future.

jnidzwetzki commented 9 months ago

Hi @ilmagher,

Thanks for making us aware of this problem. The error message indicates that the extension is not loaded correctly in your environment. Could you check that the setting timescaledb.disable_load is set to off (SHOW timescaledb.disable_load;). In addition, could you check that the loader library of TimescaleDB is preloaded by PostgreSQL? The PostgreSQL configuration needs to contain a line similar to this:

shared_preload_libraries = 'timescaledb'
jnidzwetzki commented 9 months ago

Hi @ilmagher,

Thanks for getting back. Could you try to create an empty database on that database server and run a create extension timescaledb; in this database? Using this command, you could check if there is a problem specific to the restored database or if there is a more general installation problem with the TimescaleDB extension.

jnidzwetzki commented 9 months ago

Hello @ilmagher,

Thank you for trying this. It means that the TimescaleDB extension works on your system. You mentioned that the problem occurred after you had restored a backup. Which TimescaledB version was installed on the system where you did the backup? Was it also version 2.12.2, or did you create the backup from another TimescaleDB version?

jnidzwetzki commented 9 months ago

Hi @ilmagher,

Thanks for the steps to reproduce the issue. I tried to reproduce it in my local environment, and I was able to backup and restore an instance without any errors. You will find the steps I took attached.

However, as this doesn’t seem to be a bug report but rather a question regarding a specific usecase, I would also recommend asking about this in the community slack or forum, which are better platforms for this type of question and you will get more help there.

# Prepare environment
mkdir /tmp/issue1682
mkdir /tmp/issue1682/backups

docker run --name timescaledb \
  -d \
  --restart unless-stopped \
  --shm-size="128M" \
  -p 5423:5432 \
  -e POSTGRES_PASSWORD=abcd \
  -e POSTGRES_DB=testdb \
  -v /tmp/issue1682/backups:/backups \
  timescale/timescaledb:2.12.2-pg15 \
  -c "temp_file_limit=1GB" \
  -c "log_min_duration_statement=5000" \
  -c "log_temp_files=0" \
  -c "log_statement=none" \
  -c "log_duration=off" \
  -c "log_line_prefix=%m [%p] %q%u@%d " \
  -c "max_connections=100" \
  -c "effective_cache_size=41472MB" \
  -c "checkpoint_completion_target=0.9" \
  -c "wal_buffers=16MB" \
  -c "default_statistics_target=100" \
  -c "random_page_cost=1.1" \
  -c "effective_io_concurrency=200" \
  -c "shared_preload_libraries=pg_prewarm,pg_stat_statements,timescaledb" \
  -c "track_io_timing=on"

# Load extension and insert some dummy data
echo 'CREATE EXTENSION timescaledb;' | docker exec -i timescaledb psql -U postgres
echo 'CREATE TABLE temperature (time timestamptz NOT NULL, value float);' | docker exec -i timescaledb psql -U postgres
echo "SELECT create_hypertable('temperature', 'time');" | docker exec -i timescaledb psql -U postgres
echo "INSERT INTO temperature SELECT time, ceil(random() * 100)::int FROM generate_series('2000-01-01 0:00:00+0'::timestamptz, '2000-01-01 23:59:59+0','1m') time;" | docker exec -i timescaledb psql -U postgres

# Create a backup
docker exec -i timescaledb /bin/bash -c "pg_dump -U postgres -W -F t --no-password postgres > /backups/pg_backup.dump"

# Delete docker instance
docker stop timescaledb
docker rm timescaledb

# Setup a new instance
docker run --name timescaledb \
  -d \
  --restart unless-stopped \
  --shm-size="128M" \
  -p 5423:5432 \
  -e POSTGRES_PASSWORD=abcd \
  -e POSTGRES_DB=testdb \
  -v /tmp/issue1682/backups:/backups \
  timescale/timescaledb:2.12.2-pg15 \
  -c "temp_file_limit=1GB" \
  -c "log_min_duration_statement=5000" \
  -c "log_temp_files=0" \
  -c "log_statement=none" \
  -c "log_duration=off" \
  -c "log_line_prefix=%m [%p] %q%u@%d " \
  -c "max_connections=100" \
  -c "effective_cache_size=41472MB" \
  -c "checkpoint_completion_target=0.9" \
  -c "wal_buffers=16MB" \
  -c "default_statistics_target=100" \
  -c "random_page_cost=1.1" \
  -c "effective_io_concurrency=200" \
  -c "shared_preload_libraries=pg_prewarm,pg_stat_statements,timescaledb" \
  -c "track_io_timing=on"

# Restore the data
echo 'SELECT timescaledb_pre_restore();' | docker exec -i timescaledb psql -U postgres
docker exec -i timescaledb /bin/bash -c "pg_restore -U postgres -d postgres --no-password < /backups/pg_backup.dump"
echo 'SELECT timescaledb_post_restore();' | docker exec -i timescaledb psql -U postgres

# Check that restored data is there and TimescaleDB works
echo 'SELECT * from temperature' | docker exec -i timescaledb psql -U postgres
jnidzwetzki commented 9 months ago

Hello @ilmagher,

Thanks for getting back. This is correct and missing in my simple example. You should call timescaledb_pre_restore before the backup is imported and timescaledb_post_restore after the restore is complete.

Omitting these calls could work in simple cases like the above. However, for more complex datasets, it is required to invoke these functions. I changed the example above, it now includes these function calls.

jnidzwetzki commented 9 months ago

Hello @ilmagher,

Thanks for getting back. Does the script work in your environment using the simple test data, or do you get the same error?

If this works and only your dataset causes this problem, is it possible that you dump smaller parts of the original dataset (i.e., table per table) and restore it also table per table? This would help to narrow down the problem.

Is there anything specific in the logfile snipped you are worried about? It looks like a background worker has failed (I assume due to the extension problem) and a shutdown of the database server. Do you see any further error messages in the logfile of the database server?

jnidzwetzki commented 9 months ago

Hello @ilmagher,

The error message operation not supported on chunk tables usually occurs when timescaledb_pre_restore() was not executed before the dump was restored. Could you double-check that you executed this command on the destination database server before you restore any data? The SQL statement SHOW timescaledb.restoring; should return on afterward.

Does the destination server contain any data or is /Users/mabarbaro/pg/data empty before you start the PostgreSQL server for the first time?

mkindahl commented 9 months ago

@ilmagher Since we cannot reproduce it but it is generated by your installation, we should look into why the error is printed and if there are any internal settings that are wonky. The error can be printed for a few different reasons, so let's explore this.

First reason is that timescaledb.restoring is set to true, so let's check that. Here is how it looks for me.

mats=# show timescaledb.restoring;
 timescaledb.restoring 
-----------------------
 off
(1 row)

The other case is if the extension state is not valid for some reason, so let's check that. It is stored internally so we can access it by defining a function to read it. Below I read the version from the pg_available_extension_versions table using a procedure.

create procedure install_extension_state() as $body$
declare
  ver text;
begin
   select version into ver
     from pg_available_extension_versions
    where name = 'timescaledb' and installed;

   execute format($$
       create function extension_state() returns text
       as '$libdir/timescaledb-%s', 'ts_extension_get_state'
       language c
   $$, ver);
end
$body$ language plpgsql;

After this, the following should work:

mats=# call install_extension_state() ;
CALL
mats=# select extension_state();
 extension_state 
-----------------
 created
(1 row)

The value should typically be created, but it can be other states and still be correct.

You can drop the function after you're done.

You should also check that the variable timescaledb.update_script_stage is not set to anything:

mats=# show timescaledb.update_script_stage;
ERROR:  unrecognized configuration parameter "timescaledb.update_script_stage"

The error is expected, which just means that you have not done an update (yet). If it shows a value, it would be good to know what value that is.

mkindahl commented 8 months ago

Hello @mkindahl many thanks for your support. I executed the above commands but in my env the only working is the first one. Am I missing any update?

Hmm... it seems that symbol is not compiled in, so no I do not think that you're missing an update. I managed to reproduce it by accident here so maybe I have a reproduction case. In my case, the state is "unknown", which would cause this error.

One thing that caused an issue in my build here was also that I had timescaledb.restoring set for the database, which could be the reason for your issue. Could you run show timescaledb.restoring? It should show as off.

mkindahl commented 8 months ago
# psql -U postgres -c 'SHOW timescaledb.restoring;'
 timescaledb.restoring
-----------------------
 off
(1 row)

Ok, then we know this is not the value of timescaledb.restoring.

Sorry, but I think you have to follow the suggestion above to reinstall the extension for now. We do not have an easy way to "fix" the extension state.

If I am able to create a reliable reproduction I will add information about it here.

mkindahl commented 8 months ago

Hello @mkindahl I tried the suggestion #1682 (comment) to reinstall the extension but got same error n missing extension. Correct sequence should be (after pg_dump) : drop database; vacuum; restore?

Probably restarting the server after the database is dropped as well, otherwise looks correct.

eSlider commented 2 weeks ago

Steps to reproduce(on my machine (v2.16.1) and get fix the subject error:


-- Getting chunks works
SELECT show_chunks('hypertable_name'); 

-- Prepapring for restore
SELECT timescaledb_pre_restore();

-- Throws subject error 
SELECT show_chunks('hypertable_name'); 

-- End up restore mode(I guess)
SELECT timescaledb_post_restore();

-- Getting chunks works again
SELECT show_chunks('hypertable_name')

System

  {
    "postgresql_version": "PostgreSQL 16.4 (Debian 16.4-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit",
    "timescaledb_version": "2.16.1",
    "sysname": "Linux",
    "version": "#202408250733 SMP PREEMPT_DYNAMIC Sun Aug 25 07:54:49 UTC 2024",
    "release": "6.11.0-061100rc5-generic",
    "version_pretty": "Debian GNU/Linux 12 (bookworm)"
  }

[!CAUTION] I'm not expert, so please be careful with my advice.