InfluxCommunity / influxdb3-java

The Java Client that provides a simple and convenient way to interact with InfluxDB 3.
https://InfluxCommunity.github.io/influxdb3-java/apidocs/com/influxdb/v3/client/package-summary.html
MIT License

Odd Timestamp Precision Behaviour #154

Closed jeffreyssmith2nd closed 1 month ago

jeffreyssmith2nd commented 1 month ago

Specifications

Code sample to reproduce problem

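import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import com.influxdb.v3.client.InfluxDBClient;
import com.influxdb.v3.client.PointValues;

public class TimezoneTest {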
    public static void main(String[] args) throws Exception {
        String host = "http://localhost:8082";
        char[] token = "my-token".toCharArray();
        String database = "new_db";

        try (InfluxDBClient client = InfluxDBClient.getInstance(host, token, database)) {
            String sql = "select time from mem order by time desc limit 10";
            try (Stream<PointValues> stream = client.queryPoints(sql)) {
                List<PointValues> values = stream.collect(Collectors.toList());
                values.forEach(p -> {
                    System.out.println(p.getTimestamp());
                });
            }
            // ...
        }

        try (InfluxDBClient client = InfluxDBClient.getInstance(host, token, database)) {
            String sql = "select time from mem order by time desc limit 10";
            try (Stream<Object[]> stream = client.query(sql)) {
                stream.forEach(row -> {
                    System.out.println(row[0]);
                });
            }
            // ...
        }
    }
}

Expected behavior

Whether or not timezone support is enabled, I would expect all times to be treated with nanosecond precision. When printing the raw number, I would expect a value like 1591894470000000000. When pretty-printing the time, I would expect a string like 2020-06-11T16:54:30.123456789 (where the digits after the decimal point could be omitted if they are all zeros).

If timezone support is enabled, I would expect the pretty printed format to also have a Z at the end. The Flight Python libraries are a good reference for the behaviour I would expect.
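For illustration, the rendering described above can be sketched with java.time alone. The class and method names here are hypothetical and not part of the client; the sketch only assumes the raw value is nanoseconds since the Unix epoch.

```java
import java.math.BigInteger;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, not part of influxdb3-java.
public class NanoTimestampFormat {
    private static final BigInteger NANOS_PER_SECOND = BigInteger.valueOf(1_000_000_000L);

    /** Converts nanoseconds since the Unix epoch to an Instant without losing precision. */
    public static Instant toInstant(BigInteger epochNanos) {
        // Split into whole seconds and the nanosecond remainder.
        BigInteger[] parts = epochNanos.divideAndRemainder(NANOS_PER_SECOND);
        return Instant.ofEpochSecond(parts[0].longValueExact(), parts[1].longValueExact());
    }

    public static void main(String[] args) {
        Instant instant = toInstant(new BigInteger("1591894470123456789"));
        // Timezone-aware form: ISO-8601 with a trailing Z.
        System.out.println(instant);
        // Timezone-naive form: the same fields without a zone suffix (UTC assumed).
        System.out.println(LocalDateTime.ofInstant(instant, ZoneOffset.UTC));
    }
}
```

Both Instant and LocalDateTime drop the fractional part from their string form when it is zero, which matches the "digits could be omitted" expectation.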

Actual behavior

We recently (Mon July 8th) enabled timezone support in one serverless environment (us-east-1) and have noticed unexpected behaviour in the Java client. My test code calls both queryPoints and query with the same SQL.

If I run without timezones enabled (the default behaviour before Monday) I get the following results:

queryPoints:
1591894470000000000

query:
2020-06-11T16:54:30

If I run with timezones enabled (the behaviour now) I get the following results:

queryPoints:
1591894470000000000000000

query:
1591894470000000000

Note that in both cases (timezones enabled or disabled), the server always returns the times in nanoseconds, but the Java client does not always display them in nanoseconds.
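As a quick arithmetic check, the queryPoints value printed with timezones enabled appears to be the nanosecond value multiplied by 10^6. That would be consistent with a value already in nanoseconds being rescaled as if it were milliseconds, although that cause is an assumption and is not confirmed here. The helper below is hypothetical and uses only BigInteger.

```java
import java.math.BigInteger;

// Hypothetical check, not part of influxdb3-java.
public class PrecisionFactorCheck {
    /** Returns observed / reference when the division is exact, otherwise null. */
    public static BigInteger exactFactor(BigInteger observed, BigInteger reference) {
        BigInteger[] quotientAndRemainder = observed.divideAndRemainder(reference);
        return quotientAndRemainder[1].signum() == 0 ? quotientAndRemainder[0] : null;
    }

    public static void main(String[] args) {
        // Values copied from the output above.
        BigInteger nanos = new BigInteger("1591894470000000000");
        BigInteger withTimezones = new BigInteger("1591894470000000000000000");

        System.out.println("factor = " + exactFactor(withTimezones, nanos));
        // What a nanosecond value looks like after a millisecond-to-nanosecond rescale.
        System.out.println("nanos * 10^6 = " + nanos.multiply(BigInteger.TEN.pow(6)));
    }
}
```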

Additional info

No response

bednar commented 1 month ago

Hi @jeffreyssmith2nd,

Thank you for testing our implementation and for your detailed feedback.

Let me clarify the intended use and behavior of the queryPoints and query functions within our system:

  1. query(): This function returns a stream of rows, with each value mapped to the Java object that Arrow's ValueVector.getObject(int) returns for its vector. The purpose of query is to convert data to values according to their Arrow Flight type, so if the underlying Flight type changes, the returned values may change accordingly. For example, the formatted string 2020-06-11T16:54:30 is the toString value of the object from the original Arrow Flight type, probably DateMilliVector.

  2. queryPoints(): This function returns a stream of points (com.influxdb.v3.client.PointValues), which are structured for potential re-ingestion into the database after operations like downsampling. The PointValues are structured to store values correctly for database ingestion, e.g., timestamps in nanoseconds.
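Because the Java type of a query() value follows the Arrow type, code that needs a stable representation can normalize it itself. The following is a minimal sketch under stated assumptions: the class and method are hypothetical (not part of the client), zone-less LocalDateTime values are assumed to be UTC, and integer values are assumed to already be nanoseconds since the epoch, as in the output reported above.

```java
import java.math.BigInteger;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, not part of influxdb3-java.
public class RowTimestampNormalizer {
    private static final BigInteger NANOS_PER_SECOND = BigInteger.valueOf(1_000_000_000L);

    /** Normalizes a time value from a query() row to nanoseconds since the Unix epoch. */
    public static BigInteger toEpochNanos(Object value) {
        if (value instanceof LocalDateTime) {
            // Assumption: a zone-less value represents UTC wall-clock time.
            return fromInstant(((LocalDateTime) value).toInstant(ZoneOffset.UTC));
        }
        if (value instanceof Instant) {
            return fromInstant((Instant) value);
        }
        if (value instanceof BigInteger) {
            return (BigInteger) value;
        }
        if (value instanceof Long || value instanceof Integer) {
            // Assumption: integer values are already epoch nanoseconds.
            return BigInteger.valueOf(((Number) value).longValue());
        }
        throw new IllegalArgumentException("Unsupported time value: " + value);
    }

    private static BigInteger fromInstant(Instant instant) {
        return BigInteger.valueOf(instant.getEpochSecond())
                .multiply(NANOS_PER_SECOND)
                .add(BigInteger.valueOf(instant.getNano()));
    }

    public static void main(String[] args) {
        // The two shapes seen in the report: a date-time object and a raw long.
        System.out.println(toEpochNanos(LocalDateTime.of(2020, 6, 11, 16, 54, 30)));
        System.out.println(toEpochNanos(1591894470000000000L));
    }
}
```

With queryPoints() this normalization should not be needed, since PointValues is meant to carry timestamps in nanoseconds.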

Regarding your observations and the discrepancy when timezones are enabled:

For pretty printing:

Thank you once again for your engagement! 👍

Best

bednar commented 1 month ago

The results were parsed as follows:

Using our fixed version of the client (#153) in conjunction with queryPoints will ensure that the results are always correct timestamps in nanoseconds.