liyichao commented 9 years ago

Does this work with influxdb 0.9?

Dieterbe commented 9 years ago

no, i was planning to. but 0.9 is shaping up to be quite incompatible with how graphite does things (graphite's metrics can be aggregated at will; in influxdb you have to bundle up front into series and use tags), so i'm not sure yet how this will work out. it will depend on how well aggregations perform across different measurements, i think.

mnuessler commented 9 years ago

Would really be interested in support for 0.9 too! Any updates on that?

alvaromorales commented 9 years ago

Any updates/plans after today's influxdb 0.9 release?

go2planb commented 9 years ago

+1 on this request. Looks like queries with graphite-style metric names that include dots (select * from "hits.test.count") work if the name_separator is uncommented and left blank. Not sure if this is the ideal method, but it would be amazing to get graphite-web to read from the new influxdb version. Obviously this is only days old, so I am still testing, as are others per the link below.

https://github.com/influxdb/influxdb/issues/2561

BTW, I was able to query metrics by using the settings below in /etc/opt/influxdb/influxdb.conf

# Controls one or many listeners for Graphite data.
[[graphite]]
enabled = true
bind-address = ":2003"
protocol = "tcp"
consistency-level = "one"
name-separator = " "
name-position = "last"

abhiofdoon commented 9 years ago

Hi Dieter,

Do we have support for influxdb 0.9 yet? This support would be very helpful. Please let us know.

Thanks, Abhishek

Dieterbe commented 9 years ago

they broke their graphite input when building 0.9 (https://github.com/influxdb/influxdb/issues/2102). that, combined with their new un-support for graphite's metric organisation and the lack of ways to reconcile it with their new tag system, means you can't really use 0.9 as a graphite backend.

abhiofdoon commented 9 years ago

Another related question.

Influxdb has now removed JOINs in 0.9.

In Influxdb 0.8.8 they supported joining two series (you could not join more than two). With this limitation in Influxdb 0.8.8, how did graphite-influxdb support graphite queries that worked across more than 2 series?

For example: &target=summarize(sum(xxx.collectd.memory.memory-{free,used,cached}),"1hours","avg"). This query sums (point by point) across 3 series and then calculates the average in 1 hour buckets.

How did you translate this graphite query to an equivalent influxdb query? How did the sum across 3 series work?

Dieterbe commented 9 years ago

it would just query the series separately and do the processing in python.
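To illustrate what "processing in python" means for the summarize(sum(...)) example above, here is a minimal sketch. It is not the graphite-api code; the function names and sample values are hypothetical, and it assumes each series has already been fetched separately and aligned to the same step, with None marking a missing point.

```python
from statistics import mean


def sum_series(series_list):
    """Sum equally spaced series point by point, skipping missing (None) values."""
    summed = []
    for points in zip(*series_list):
        present = [p for p in points if p is not None]
        summed.append(sum(present) if present else None)
    return summed


def summarize_avg(values, step, bucket_seconds):
    """Average consecutive values into buckets of bucket_seconds each."""
    per_bucket = bucket_seconds // step
    out = []
    for i in range(0, len(values), per_bucket):
        present = [v for v in values[i:i + per_bucket] if v is not None]
        out.append(mean(present) if present else None)
    return out


# Three series queried one at a time (hypothetical data, 30-minute step).
free = [1, 2, 3, 4]
used = [10, 20, 30, 40]
cached = [100, 200, None, 400]

total = sum_series([free, used, cached])
hourly = summarize_avg(total, step=1800, bucket_seconds=3600)
print(total)   # [111, 222, 33, 444]
print(hourly)  # [166.5, 238.5]
```

Because the combination happens after the fetch, the two-series JOIN limit in the database never comes into play.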

abhiofdoon commented 9 years ago

Is it in graphite_influxdb.py? Can you please point me to where this is implemented?

abhiofdoon commented 9 years ago

Seems like this processing is done in graphite-api and not in graphite-influxdb.

In my use case I have graphite metrics in an influxdb 0.9 database with no tags (each measurement is its own series). With this kind of arrangement I hoped that graphite-influxdb would still work with graphite-api and influxdb 0.9, but it does not. I think it is mainly because of the influxdb API change (for example, "list series" has been replaced by "show series").

How easy or difficult would it be to get graphite-influxdb working for such simple use cases?

Dieterbe commented 9 years ago

should be pretty easy, since influxdb has a new updated python client library for 0.9. you brought up a good point though. i forgot that graphite-api just queries the series individually, so the discrepancy in tree vs tag model shouldn't be a big deal. performance might suffer a lot though. also note that their graphite input protocol is still broken afaik. I propose we rename the current influxdbreader and influxdbfinder to influxdb08 reader and finder, and then also create a 09 version. i personally don't have time/interest to do this, but maybe someone can take this on in a PR.

abhiofdoon commented 9 years ago

with respect to performance it should not be any worse than 0.8 i think.

Dieterbe commented 9 years ago

influxdb team has mentioned that 0.9 is now optimized for accessing fewer series (by using tags), and hinted that performance of addressing individual series could be slower. i guess only 1 way to find out :)

pkittenis commented 9 years ago

Just to let you guys know, we are working on 0.9 support at our fork and have rudimentary influxdb 0.9 support working.

Still WIP and code has breakpoints in it so run it manually under flask if you want to test.

A PR will be made once it's all been checked.

Also, I am ripping out both cache and statsd as they're being imported from graphite_api.app which is causing cyclical dependencies.

Dieterbe commented 9 years ago

ok sounds good!

pkittenis commented 9 years ago

@Dieterbe - Perhaps you could help clarify something for me :)

Am wondering what fix_datapoints is actually doing. Is it just filling missing values with null?

In 0.9, influxdb has a fill(null) function, which is now being used in graphite_influxdb's queries. The currently working code at the fork therefore does not use fix_datapoints, and I wanted to confirm that this will not have any unintended side effects.

An 'it works' screenshot to whet your appetite :)

[Screenshot: Grafana with graphite-influxdb and influxdb 0.9]

pkittenis commented 9 years ago

Looks like its purpose is normalising data points according to step, so that, for example, multiple results for the same step interval are not counted as values for separate steps.

The latter cannot happen in carbon, where all results are already aligned to step/resolution.

Getting incorrect results without fix_datapoints so will re-work it for the new influxdb api.

Dieterbe commented 9 years ago

IIRC fix_datapoints quantizes datapoints and fills in gaps. I found the fill() function in influx 0.8 to be much slower than doing it in python. hopefully they addressed this by now. there should be little notes and comments throughout the implementation to explain what it does.
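As a rough illustration of "quantizes datapoints and fills in gaps", the sketch below snaps raw (timestamp, value) pairs onto a fixed step grid and leaves None where no point arrived. This is not the actual fix_datapoints implementation; the function name, the "last value wins" rule and the sample data are assumptions made for the example.

```python
def quantize_and_fill(points, start, end, step):
    """Return one value per step slot in [start, end); None marks a gap.

    points is an iterable of (timestamp, value) pairs in seconds.
    """
    n_slots = (end - start + step - 1) // step  # ceiling division
    slots = [None] * n_slots
    for ts, value in points:
        if not (start <= ts < end):
            continue  # ignore points outside the requested range
        # Several points in the same slot collapse into one: the last one wins.
        slots[(ts - start) // step] = value
    return slots


raw = [(1000, 1.0), (1009, 2.0), (1012, 3.0), (1031, 4.0)]
result = quantize_and_fill(raw, start=1000, end=1040, step=10)
print(result)  # [2.0, 3.0, None, 4.0]
```

The first two points share a slot, so only one value is kept for it, and the empty third slot becomes None instead of being dropped.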

pkittenis commented 9 years ago

Thank you - there are indeed notes, I just did not fully understand what the code was supposed to be doing :)

Am finding fill(null) and group by time ($step) to be much faster and to produce good results so going with that for the moment.
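For readers unfamiliar with the approach, a query using group by time and fill(null) could be assembled roughly as below. This is a sketch only, not the fork's code: build_query is a hypothetical helper, the field name value and the mean() aggregate are assumptions, and the exact InfluxQL syntax should be checked against the InfluxDB version in use.

```python
def build_query(measurement, start, end, step):
    """Build an InfluxQL query that buckets points by step and fills gaps with null.

    start and end are epoch seconds; step is the bucket width in seconds.
    """
    # Double-quote the measurement so dotted graphite names are treated as one identifier.
    escaped = measurement.replace('"', '\\"')
    return (
        'SELECT mean(value) FROM "{0}" '
        'WHERE time >= {1}s AND time < {2}s '
        'GROUP BY time({3}s) fill(null)'
    ).format(escaped, start, end, step)


query = build_query("hits.test.count", 1433116800, 1433120400, 60)
print(query)
```

With this shape the database returns one row per step, including empty ones, so little post-processing is left for the WSGI process.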

Ideally I want to avoid doing much in the python WSGI app, as the webapp thread would block for the duration of code execution. That holds as long as the influxdb native implementation is faster, of course.

Dieterbe commented 9 years ago

ok sounds good. glad to hear it works faster now

pkittenis commented 9 years ago

PR submitted and bears gifts :). Lots of changes, discussed at PR #48

Separately, we have a fork and a release candidate which goes further than InfluxDB 0.9 support and also makes the following changes:

Removed graphite-web support - only graphite-api supported.
Removed elasticsearch integration - caching of series names and other queries can be solved at a higher level, for example an HTTP cache in front of the graphite-api webapp which would cache series names and all other requests. We have been using exactly that successfully.
100% code test coverage
Package refactor - moved files into package directory, split into multiple modules et al.
Automatic versioning.
Simplified configuration - only InfluxDB database name for Graphite metric series is required.
Strict flake8 compatibility.
Python 2.6, 2.7 and 3.4 all fully supported with automated testing.
Schema-less design - no requirement for pre-configured schemas. Data point interval is calculated automatically based on the time range specified, à la Grafana.
Explicitly requiring a non-official Graphite-API release with multi-fetch support (latest official release of Graphite-API predates multi fetch support).
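The "interval calculated automatically from the time range" item can be pictured with a sketch like the following. It is not the fork's algorithm: calculate_step, max_points and min_step are hypothetical names and defaults chosen for illustration.

```python
def calculate_step(start, end, max_points=1000, min_step=10):
    """Pick a step (seconds) so that [start, end) yields at most max_points points.

    The result is rounded up to a multiple of min_step.
    """
    span = max(end - start, 0)
    step = -(-span // max_points)          # ceiling division
    step = max(step, min_step)
    return -(-step // min_step) * min_step  # round up to a multiple of min_step


print(calculate_step(0, 3600))    # 10
print(calculate_step(0, 604800))  # 610
```

A one-hour range keeps the finest step, while a one-week range is coarsened so the number of returned points stays bounded.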

Enjoy :)

Feel free to pull from master branch instead if you are happy with all the changes. #48 limits itself to influxdb 0.9 support, this issue, and #41.

vimeo / graphite-influxdb

support for influxdb 0.9 #43
