Mathematics across measurements

srfraser commented 9 years ago

Apologies if this is a duplicate, I had a look and couldn't see a relevant issue.

I can see from the documentation how to select from multiple measurements (although it still calls them series): https://influxdb.com/docs/v0.9/query_language/data_exploration.html

For example, with data inserted by telegraf, you can do: select * from disk_used,disk_total where host = 'myhostname' and path = '/'

How would you express that as a percentage? I've tried variations of the following, and none seem to work:

select disk_used.value/disk_total.value from disk_used, disk_total where host = 'myhostname' and path='/'

The "mydb"."retentionpolicy"."measurement" syntax doesn't work there, either.

Would it be a good idea to add aggregation functions for cases like diff(value1, value2) from m1, m2 and divide(value, value) from m1, m2, or are the arithmetic operators supposed to work here?
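In the meantime, the percentage can be computed client-side. Below is a minimal sketch in Python, assuming each measurement has already been fetched separately as a list of (timestamp, value) pairs. The function name and the sample points are made up for illustration; nothing here is an InfluxDB API.

```python
def percent_used(used_points, total_points):
    """Join two series on timestamp and return (timestamp, percent) pairs."""
    totals = dict(total_points)
    result = []
    for ts, used in used_points:
        total = totals.get(ts)
        # Skip timestamps missing from the totals series, and avoid dividing by zero.
        if total:
            result.append((ts, 100.0 * used / total))
    return result


# Hypothetical sample points: (timestamp, value)
disk_used = [(1, 40.0), (2, 45.0), (3, 50.0)]
disk_total = [(1, 100.0), (2, 100.0), (4, 100.0)]
print(percent_used(disk_used, disk_total))  # [(1, 40.0), (2, 45.0)]
```

This only pairs points whose timestamps match exactly, so it would work for data written at the same instant (as telegraf appears to do here) but not for series sampled independently.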

Also, while experimenting I noticed that it's not possible to divide one derivative by another. For example, if I have two counters, bytes transferred and API calls made, both of which are constantly going up, how would I calculate the mean bytes per API call?
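To make the counter case concrete: over a time window, the mean bytes per API call is the increase in the byte counter divided by the increase in the call counter. Here is a rough client-side sketch in Python, assuming both counters were fetched as time-ordered (timestamp, value) lists for the same window and that neither counter reset during it. The function name and numbers are hypothetical.

```python
def mean_ratio_of_deltas(numerator_points, denominator_points):
    """Divide the growth of one counter by the growth of another over a window.

    Both inputs are time-ordered lists of (timestamp, counter_value) pairs.
    Returns None when the denominator counter did not change.
    """
    num_delta = numerator_points[-1][1] - numerator_points[0][1]
    den_delta = denominator_points[-1][1] - denominator_points[0][1]
    if den_delta == 0:
        return None
    return num_delta / den_delta


# Hypothetical counters sampled at the same three instants.
bytes_transferred = [(0, 1000), (60, 4000), (120, 9000)]
api_calls = [(0, 10), (60, 20), (120, 50)]
print(mean_ratio_of_deltas(bytes_transferred, api_calls))  # 200.0
```

A counter reset inside the window would make the delta wrong, so a real version would need to detect decreases and handle them.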

hexluthor commented 9 years ago

:+1: I work with sensor networks and find this limitation frustrating. For example, I wish to compute weighted averages like this:

SELECT sum(oxygen_percentage.value * flow_rate.value) / sum(flow_rate.value) FROM oxygen_percentage, flow_rate WHERE site_id = '3'

But InfluxDB returns nothing. Even SELECT oxygen_percentage.value FROM oxygen_percentage doesn't work. Using 0.9.3-rc1 master (0163945).
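For reference, this is the calculation the query is meant to express, written as a small client-side Python sketch. It assumes both measurements were fetched separately as (timestamp, value) lists and that readings are paired by exact timestamp; the function name and sample readings are invented for illustration.

```python
def weighted_average(value_points, weight_points):
    """Compute sum(value * weight) / sum(weight) over matching timestamps."""
    weights = dict(weight_points)
    weighted_sum = 0.0
    weight_sum = 0.0
    for ts, value in value_points:
        # Only use readings that have a weight recorded at the same timestamp.
        if ts in weights:
            weighted_sum += value * weights[ts]
            weight_sum += weights[ts]
    return weighted_sum / weight_sum if weight_sum else None


# Hypothetical readings: (timestamp, value)
oxygen_percentage = [(1, 20.0), (2, 22.0)]
flow_rate = [(1, 1.0), (2, 3.0)]
print(weighted_average(oxygen_percentage, flow_rate))  # 21.5
```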

ghost commented 9 years ago

Same here. I'd also like to calculate values across different series like:

select * from mysql_value where type='mysql_commands' and type_instance='show_tables' + select * from mysql_value where type='mysql_commands' and type_instance='show_databases'

Cheers, Szop

bbinet commented 9 years ago

Same as @hexluthor, I feel this is very limiting: if we need to correlate data coming from various sensors, we currently have to write all data as fields in the same measurement. But would it be a good idea in terms of data structure to have a single measurement with more than 50 fields? Will it impact query performance? Also, this sensor data does not always get logged with the same sampling frequency, so it is not always possible to combine data in the same measurement if we want to keep data with a high sampling frequency.

I'm not comfortable with distorting the data structure (dropping natural data organization) because of technical limitations. In the sysadmin world, it would be like putting all the cpu, ram, disk, and apache response time metrics in the same measurement for the sole purpose of being able to correlate apache response time with cpu, ram, or disk metrics.

bbinet commented 9 years ago

Also, what are the actual technical issues that prevent InfluxDB from supporting queries with simple math operations across measurements?