influxdata / influxdb-client-python

InfluxDB 2.0 python client
https://influxdb-client.readthedocs.io/en/stable/
MIT License

Misleading warning in query "buckets()" #568

Open kakila opened 1 year ago

kakila commented 1 year ago

Specifications

  1. Create a client to a local influx server.
  2. Query the buckets

The last step prints a warning message that suggests a non-working solution: the suggested pivot() call does not apply to the returned table.

Code sample to reproduce problem

>>> from influxdb_client import InfluxDBClient
>>> c = InfluxDBClient(url="http://localhost:8086", org=..., token=...)
>>> r_api = c.query_api()
>>> r_api.query_data_frame(query='buckets()')
/home/juanpi/virtual_enviroments/wabesense-data/lib/python3.10/site-packages/influxdb_client/client/warnings.py:31: MissingPivotFunction: The query doesn't contains the pivot() function.

The result will not be shaped to optimal processing by pandas.DataFrame. Use the pivot() function by:

    buckets() |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")

You can disable this warning by:
    import warnings
    from influxdb_client.client.warnings import MissingPivotFunction

    warnings.simplefilter("ignore", MissingPivotFunction)

For more info see:
    - https://docs.influxdata.com/resources/videos/pivots-in-flux/
    - https://docs.influxdata.com/flux/latest/stdlib/universe/pivot/
    - https://docs.influxdata.com/flux/latest/stdlib/influxdata/influxdb/schema/fieldsascols/

  warnings.warn(message, MissingPivotFunction)
    result  table         name                id    organizationID retentionPolicy  retentionPeriod
0  _result      0  _monitoring  9ff20ba199aff20d  1f4f3678a832d418            None  604800000000000
1  _result      0       _tasks  7de84af1deb77ca4  1f4f3678a832d418            None  259200000000000
2  _result      0      scratch  c7902563eb5f9007  1f4f3678a832d418            None  604800000000000

Expected behavior

Either the suggested solution should be correct, or no warning should be emitted.

Actual behavior

A solution is provided that does not apply to the returned table.
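
To make the mismatch concrete, here is a minimal sketch that compares the columns the suggested pivot() call relies on (`_time`, `_field`, `_value`) with the columns actually returned by `buckets()` in the output above. The helper name `missing_pivot_columns` is made up for illustration and is not part of influxdb_client.

```python
# Columns named by the pivot() call that the warning suggests.
PIVOT_COLUMNS = ("_time", "_field", "_value")


def missing_pivot_columns(columns):
    """Return the pivot columns that are absent from a result's columns.

    Hypothetical helper for illustration; not part of influxdb_client.
    """
    return [name for name in PIVOT_COLUMNS if name not in columns]


# Column names copied from the DataFrame printed for buckets() above.
buckets_columns = [
    "result", "table", "name", "id",
    "organizationID", "retentionPolicy", "retentionPeriod",
]

print(missing_pivot_columns(buckets_columns))
# → ['_time', '_field', '_value']
```

None of the three columns exists in the `buckets()` result, so the suggested pivot has nothing to work on.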

Additional info

No response

chintal commented 1 year ago

The same issue arises for certain other queries where pivot is meaningless. For example, from the schema module:

import "influxdata/influxdb/schema"

schema.measurementTagValues(
  bucket: "monitors",
  measurement: "temperature",
  tag: "identifier",
)

 |> distinct(column: "_value")

produces the warning:

MissingPivotFunction: The query doesn't contains the pivot() function.
The result will not be shaped to optimal processing by pandas.DataFrame. Use the pivot() function by:

    import "influxdata/influxdb/schema"

schema.measurementTagValues(
  bucket: "monitors",
  measurement: "temperature",
  tag: "identifier",
)

 |> distinct(column: "_value")

 |> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")

Here:

If the warning does serve a purpose elsewhere in the codebase, then this problem defeats it. Personally, I don't think blindly checking for a pivot is the way to go here. Regular queries can also be constructed in other ways that return data-frame-friendly results, and the warning gives pause when none is needed. I think leaving it to the user to look at the dataframe and reshape it as needed is perfectly acceptable, instead of the two choices the library presently gives:

a) forcing a pivot
b) forcing a potentially new user to take explicit measures to suppress a particularly loud warning
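
For option b), a less intrusive variant than a global `warnings.simplefilter` is to silence the warning only around the call that triggers it. The sketch below uses only the standard library; the `suppress` helper is a made-up name, not something influxdb_client provides.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def suppress(category):
    """Ignore warnings of `category` inside the with-block only.

    Hypothetical helper; the previous warning filters are restored on exit.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category)
        yield
```

Assuming `MissingPivotFunction` is imported as shown in the warning text, usage would look like `with suppress(MissingPivotFunction): df = r_api.query_data_frame(query='buckets()')`. This still leaves the burden on the user, which is the point of the complaint above.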

For reference, this is what the (unpivoted) response looks like:


    result  table     _value
0  _result      0     acpitz
1  _result      0     amdgpu
2  _result      0        cpu
3  _result      0        gpu
4  _result      0  iwlwifi_1
5  _result      0    k10temp
6  _result      0       nvme
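
As for checking less blindly, one possible direction is to warn only for queries that read from a bucket with `from(` and contain neither `pivot(` nor `fieldsAsCols`. This is a rough sketch of that idea under my own assumptions, not the library's implementation; the function name is hypothetical, and a plain substring test like this is still crude (it would misfire on `from(` inside a string literal, for instance).

```python
def should_warn_about_pivot(query: str) -> bool:
    """Decide whether a pivot hint is plausibly relevant to `query`.

    Hypothetical heuristic for illustration; not the library's code.
    """
    # Only queries that read measurement data from a bucket can be pivoted
    # on _time/_field/_value in the way the warning suggests.
    reads_from_bucket = "from(" in query
    # The query already reshapes its output.
    already_shaped = "pivot(" in query or "fieldsAsCols" in query
    return reads_from_bucket and not already_shaped


print(should_warn_about_pivot("buckets()"))
# → False
print(should_warn_about_pivot('from(bucket: "monitors") |> range(start: -1h)'))
# → True
```

With this rule, neither `buckets()` nor the `schema.measurementTagValues(...)` query above would trigger the warning.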