Open AntoniJakubiak opened 7 months ago
Dear @AntoniJakubiak , Thank you for your question about retention policies in InfluxDB FDW. Currently, InfluxDB FDW does not support custom retention policies for InfluxDB v1 due to limitations in the cxx_client library's GetV1() function, which connects to the database without specifying a retention policy. We're open to suggestions and would appreciate any thoughts you might have on this or other features.
Hi,
I am sure, that the retention policies could be implemented without the change in influx cxx client library.
Important queries are:
SELECT * FROM rp.measurement;
SHOW FIELD KEYS ON db FROM rp.measurement;
SHOW TAG KEYS ON db FROM rp.measurement;
Basic implementation idea starts here:
static struct InfluxDBFdwOption valid_options[] =
{
///...
{"table", ForeignTableRelationId},
{"retention_policy", ForeignTableRelationId}, // new line
};
This will allow you to define another options
for foreign tables
, for example:
influx=# \det+ "30d".cpu ;
List of foreign tables
Schema | Table | Server | FDW options | Description
--------+-------+--------------+----------------------------------------------------------+-------------
30d | cpu | influxdb_svr | ("table" 'cpu', retention_policy '30d', tags 'cpu,host') |
Reading retention_policy
name from options could look like:
influxdb_get_table_name(Relation rel)
{
//
foreach(lc, table->options)
{
DefElem *def = (DefElem *) lfirst(lc);
if (strcmp(def->defname, "table") == 0)
rptn.table_name = defGetString(def);
// new lines + read retenion_policy name from options
if (strcmp(def->defname, "retention_policy") == 0)
rptn.retention_policy = defGetString(def);
}
Having options, there is a need to modify measurement name. It's quite simply:
static void
influxdb_deparse_relation(StringInfo buf, Relation rel)
{
... = influxdb_get_table_name(rel);
appendStringInfo(buf, "%s.%s", retention_policy, table_name);
}
There is also a need to update import schema.
/*
* Import a foreign schema
*/
static List *
influxdbImportForeignSchema(ImportForeignSchemaStmt *stmt,
Oid serverOid)
{
// ...
#ifndef CXX_CLIENT
// ...
#else
options->svr_retention_policy = stmt->remote_schema; // new line
server = GetForeignServer(serverOid);
mapping = GetUserMapping(GetUserId(), server->serverid);
ret = InfluxDBSchemaInfo(mapping, options);
#endif
// ...
appendStringInfo(&buf, "\n) SERVER %s\nOPTIONS (table ",
quote_identifier(stmt->server_name));
influxdb_deparse_string_literal(&buf, info[table_idx].measurement);
// 2 new lines
appendStringInfo(&buf, ", retention_policy ");
influxdb_deparse_string_literal(&buf, stmt->remote_schema);
}
/*
* InfluxDBSchemaInfo
* Returns information of table if success
*/
extern "C" struct InfluxDBSchemaInfo_return
InfluxDBSchemaInfo(UserMapping *user, influxdb_opt *opts)
{
// ...
char *db = opts->svr_database;
char *rp = opts->svr_retention_policy; // new line
/*
* Get tagkeys for each foreign table
*/
query = "SHOW TAG KEYS ON " + influxdb_quote_identifier(std::string(db)) + " FROM " + influxdb_quote_identifier(std::string(rp), tables_info[table_idx].measurement); // changed line
// ...
/*
* Get fieldkeys for each foreign table
*/
query = "SHOW FIELD KEYS ON " + influxdb_quote_identifier(std::string(db)) + " FROM " + influxdb_quote_identifier(std::string(rp), tables_info[table_idx].measurement); // changed line
// ...
}
Kind regards Antoni Jakubiak
@AntoniJakubiak Thank you for your suggestion regarding the FDW modification. I appreciate your effort to improve its functionality.
I noticed that the proposed modification seems tailored for the Go client, which supports only InfluxDB v1. However, this modification might not extend the retention policy support to users of the cxx-client, as GetV1 in the cxx-client does not include a retention policy parameter. Here's the link of the code for reference: link You can easily find out the differance between GetV1 and GetV2.
Given this, the modification might only be feasible for Go client users. Could we explore ways to also accommodate cxx-client users? Your input is highly valued, and I look forward to any further suggestions you might have.
Hi,
I need a permission from my company to contribute to open-source, so I cannot submit a patch right now. However, I will apply for it.
This modification works with InfluxDB CXX client. I have not tested it with Go client.
Kind regards Antoni Jakubiak
Hello,
I am sending a patch for #50 and #51
0001-niffler-to-pgspider.patch.gz
If you are interested, I can join Your project, create a branch and pull request.
Kind regards Antoni Jakubiak
Hello, @antonijakubiak-samsung ! Please look at my PRs in https://github.com/pgspider/sqlite_fdw/commits/master/ In pgspider project usually there are only 20-25% of C/C++ code, 60-70% of tests and 10-15% of discussing about code. If you can provide tests for your changes this will be great and you'll have a 3-4 month perspective of merge. Did I scare your work?
Hi @mkgrgis.
I am not scared. Test code is obvious.
What is the contribution work-flow, e.g.: repository fork or feature branch?
Best Antoni Jakubiak
What is the contribution work-flow, e.g.: repository fork or feature branch?
Fork of repository -> creating a branch for new feature -> one full initial commit (code changes + documentation changes + tests) -> PR for pgspider team -> review ... other commits by review ...
Testing scripts for getting multi-versional PostgreSQL C environment is here.
Hi @mkgrgis
Sadly, compiling and testing your project is complicated for me. I think, it could depend on my system. Do you have more detailed instruction for compiling and testing?
Instead of writing the instruction we could use Dockerfile.
For example, here is a Dockerfile which I am using to compile InfluxDB FDW.
FROM postgres:15.6-bookworm
RUN apt-get update
RUN apt-get install -y postgresql-server-dev-15
RUN apt-get install -y bash vim git
RUN apt-get install -y build-essential cmake
RUN apt-get install -y libcurl4-openssl-dev
ARG INSTALL_DIR=/src
RUN mkdir $INSTALL_DIR
RUN \
cd $INSTALL_DIR; \
git clone https://github.com/pgspider/influxdb-cxx.git; \
cd influxdb-cxx; \
mkdir build; \
cd build; \
cmake .. -DINFLUXCXX_WITH_BOOST=OFF -DINFLUXCXX_TESTING=OFF; \
make; \
make install
RUN \
cd $INSTALL_DIR; \
git clone https://github.com/pgspider/influxdb_fdw.git; \
cd influxdb_fdw; \
make USE_PGXS=1 with_llvm=no CXX_CLIENT=1; \
make install USE_PGXS=1 with_llvm=no CXX_CLIENT=1
RUN ldconfig
Then:
docker build -t influxdb_fdw .
docker run --name influxdb_fdw -e POSTGRES_PASSWORD=secret --publish 5431:5432 -ti influxdb_fdw
psql -h localhost -p 5431 -U postgres -c 'create extension influxdb_fdw'
Could you teach me how to compile and test influxdb_fdw
, or could you provide a Dockerfile?
Kind regards Antoni Jakubiak
Do you have more detailed instruction for compiling and testing?
Yes. There is special common testing environment scripts https://github.com/pgspider/fdw_testing_scripts for all of FDWs of pgspider
project. This scripts allows to get multiversional PostgreSQL source code environment. Preconditions was described at this lines of a script. Hence you should read https://wiki.postgresql.org/wiki/Compile_and_Install_from_source_code and get InfluxDB environment for tests. I think you doesn't need docker
for compiling or for testing environment, but docker
is used for InfluxDB testing installation themself, see https://github.com/pgspider/influxdb_fdw/blob/master/init.sh
Also you can add GitHub auto testing by copying https://github.com/pgspider/sqlite_fdw/tree/master/GitHubActions and adopting this scripts to InfluxDB instead of SQLite.
Could you teach me how to compile and test influxdb_fdw, or could you provide a Dockerfile?
I hope this will enough for you, @antonijakubiak-samsung Any questions will be appreciate.
Hi,
I am trying to run the system tests.
I have the following results:
# +++ regress check in contrib/influxdb_fdw +++
# using temp instance on port 61696 with PID 12670
ok 1 - 16.0/aggregate 105 ms
ok 2 - 16.0/influxdb_fdw 1791 ms
not ok 3 - 16.0/selectfunc 853 ms
ok 4 - 16.0/extra/join 1220769 ms
ok 5 - 16.0/extra/limit 1180 ms
ok 6 - 16.0/extra/aggregates 1234906 ms
ok 7 - 16.0/extra/insert 5318 ms
ok 8 - 16.0/extra/prepare 305 ms
not ok 9 - 16.0/extra/select_having 50383 ms
ok 10 - 16.0/extra/select 694 ms
ok 11 - 16.0/extra/influxdb_fdw_post 123801 ms
ok 12 - 16.0/schemaless/aggregate 109 ms
ok 13 - 16.0/schemaless/influxdb_fdw 1151 ms
not ok 14 - 16.0/schemaless/selectfunc 942 ms
ok 15 - 16.0/schemaless/schemaless 110 ms
ok 16 - 16.0/schemaless/extra/join 1151438 ms
ok 17 - 16.0/schemaless/extra/limit 2065 ms
ok 18 - 16.0/schemaless/extra/aggregates 1287630 ms
ok 19 - 16.0/schemaless/extra/prepare 309 ms
not ok 20 - 16.0/schemaless/extra/select_having 40954 ms
ok 21 - 16.0/schemaless/extra/insert 12898 ms
ok 22 - 16.0/schemaless/extra/select 707 ms
ok 23 - 16.0/schemaless/extra/influxdb_fdw_post 168791 ms
not ok 24 - 16.0/schemaless/add_fields 719 ms
not ok 25 - 16.0/schemaless/add_tags 925 ms
not ok 26 - 16.0/schemaless/add_multi_key 1166 ms
It seems, that the test results depends on the environment settings. For example regr.diff: PostgreSQL sort order depends on the locale:
diff -U3 /workspace/PostgreSQL source/REL_16_0/postgresql/contrib/influxdb_fdw/expected/16.0/extra/select_having.out /workspace/PostgreSQL source/REL_16_0/postgresql/contrib/influxdb_fdw/results/16.0/extra/select_having.out
--- "/workspace/PostgreSQL source/REL_16_0/postgresql/contrib/influxdb_fdw/expected/16.0/extra/select_having.out" 2024-05-08 09:24:28.204016490 +0000
+++ "/workspace/PostgreSQL source/REL_16_0/postgresql/contrib/influxdb_fdw/results/16.0/extra/select_having.out" 2024-05-08 10:07:03.882337549 +0000
@@ -46,8 +46,8 @@
GROUP BY b, c HAVING b = 3 ORDER BY b, c;
b | c
---+----------
- 3 | bbbb
3 | BBBB
+ 3 | bbbb
The sort order depends on the locale settings:
root@87f204cadebb:/workspace/PostgreSQL source# echo -e "bbbb\nBBBB" | LC_ALL=C sort
BBBB
bbbb
root@87f204cadebb:/workspace/PostgreSQL source# echo -e "bbbb\nBBBB" | LC_ALL=en_US.utf8 sort
bbbb
BBBB
It means, that I should run the get_environment.sh
script with some environment settings.
Could you help me, to create such environment?
In order to reduce environment dependencies, we could create a containerized build environment. In such containerized environment, we could get a clean Linux distribution, get dependencies, build and test the software in 100% repeatable form. I'm sending example docker files which could be executed with docker compose up --build
my-docker.tar.gz.
Kind regards Antoni Jakubiak
Hello, @antonijakubiak-samsung !
- how about execution time, is it normal on a modern PC?
Tests for most of pgspider
FDWs is not optimized for fast executing. Testing time for sqlite_fdw
was longer than 20 min before moving SQLite DBs in /tmp
mounted as tmpfs. There is 1..1.5 min for all versions after this change. Unfortunately I have no InfluxDB or influxdb_fdw
experience...
- what is the default locale?
Look like this is C
locale as for other PostgreSQL-like tests. Please note, most of pgspider
FDWs yet not support PostgreSQL text encodings. Look like, text sorting and transforming additionally depends on system locale.
The support for encodings in sqlite_fdw
was added in https://github.com/pgspider/sqlite_fdw/commit/90dad0a3ae7b8ffd129149088e094276c439ed61 , but the tests was detached to separate file later. I think you can copy 70-80% of the test and adopt other for InfluxDB FDW context.
About SQL_ASCII
+ en_US
in your diff I doesn't understand a conceptual source of problem. For SQLite FDW no en_US
locale/collation is recommended or necessary, see https://github.com/pgspider/sqlite_fdw/blob/master/GitHubActions/install_locales.sh Maybe there is not normal FDW deparsing logic with InfluxDB?
- which golang version you are using for influxdb_fdw compilation?
This is a question to pgspider team. @t-kataym _, could you please ask some of authors of influxdbfdw about this?
- on which InfluxDB you are testing?
There is 3 supported InfluxDB versions, but README doesn't describe details of the testing environment
- do you have any special InfluxDB configuration.
This is also a question to pgspider team. I know nothing more about than docker file content.
@weiting1lee , maybe you can help about previous comment ?
Thank you, @mkgrgis
@AntoniJakubiak, I greatly appreciate your efforts in testing and your suggestion regarding the testing environment. Using a containerized environment is a great idea, and we are also considering employing GitHub Actions for testing, similar to what sqlite_fdw uses, to facilitate easier testing for all developers.
Which golang version are you using for influxdb_fdw compilation?
For support up to PostgreSQL 16.0, we are using Golang 1.20.4.
On which InfluxDB are you testing?
In our environment, we use InfluxDB v1: 1.8.10 and InfluxDB v2: 2.2.0.
Do you have any special InfluxDB configuration?
There are no special configurations in our setup. If you continue to encounter difficulties with this setup, please do share more details so we can assist you better. Looking forward to your contributions and further insights!
Does influxdb_fdw support retention policies?
It seems, that it reads only from the default retention policy on InfluxDB v1.