influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0
29.06k stars 3.56k forks source link

SELECT * INTO does not copy all measurements #18132

Open vially opened 4 years ago

vially commented 4 years ago

Steps to reproduce:

The commands below were run in the influx shell connected to a new InfluxDB server instance.

> create database playground
> use playground
Using database playground
> insert temperature,sensor=foo value=17.2
> insert temperature,sensor=foo value=14.6
> insert state,sensor=foo value="pending"
> insert state,sensor=foo value="complete"
> select * from temperature
name: temperature
time                sensor value
----                ------ -----
1589808323070934007 foo    17.2
1589808329269537200 foo    14.6
> select * from state
name: state
time                sensor value
----                ------ -----
1589808336872365263 foo    pending
1589808349599548530 foo    complete
> create database playground_copy
> SELECT * INTO playground_copy..:MEASUREMENT FROM /.*/ GROUP BY *
name: result
time written
---- -------
0    2
> use playground_copy
Using database playground_copy
> show measurements
name: measurements
name
----
temperature

These commands are supposed to duplicate a database containing two measurements (with two data points each):

It looks like the SELECT * INTO command only copied the temperature measurement to the new database. The state measurement was not copied over.

For some reason it looks like this might have something to do with the value field of the state measurement being of type string. I noticed a similar behavior with boolean fields too.

I'm not sure if this is a bug or a known issue, but I wasn't able to find anything related to this behavior in the documentation.

Expected behavior:

I was expecting both the temperature and the state measurements to be copied over to the playground_copy database.

Actual behavior:

Only the temperature measurement was actually copied. The state measurement was missing from the playground_copy database.

Environment info:

Config:

The config was not modified in any way. The server was running using the default config that comes with the 1.8.0 docker image.

iwittkau commented 4 years ago

@vially do you have any updates on this? I might be experiencing a similar issue where not all values are selected into a new measurement via the Go client. If I run the same query via the influx tool, all values get copied.

iwittkau commented 4 years ago

@vially do you have any updates on this? I might be experiencing a similar issue where not all values are selected into a new measurement via the Go client. If I run the same query via the influx tool, all values get copied.

I think I found the answer to my problem in the faq.

lafrech commented 3 years ago

I think I'm facing the same issue with influxd 1.8.3.

use "source-db"
show measurements
select * into "destination-db"..:MEASUREMENT from /.*/ GROUP BY *
use "destination-db"
show measurements

The measurement list in the destination database is incomplete and from a quick comparison, only the measurements with numerical values are copied.

I'm afraid this is a serious issue as AFAIK

use src_db
SELECT * INTO dest_db..:MEASUREMENT FROM /.*/ GROUP BY *

is the recommended way to copy all data from one database to another (here for instance).

lafrech commented 3 years ago

@vially I've been investigating and I think this is the consequence of an issue with FROM regex.

See #20365.

vially commented 3 years ago

@lafrech It does look like that could be the root cause for this issue. Great find!