travelping / exometer_influxdb

Exometer reporter for InfluxDB
Mozilla Public License 2.0

Long HTTP request to InfluxDB #24

Closed netDalek closed 7 years ago

netDalek commented 8 years ago

Hello, and sorry for my English. I've found a really strange bug and can't figure out how to debug it. I measure scheduler usage every second using a gauge probe. For example, the scheduler usage of an idle app is 0.1. When I add some load to my application, exometer:get_value([metric, scheduler_usage]) shows a scheduler usage of 0.5, but the values arriving in InfluxDB stay at 0.1. Only after a few minutes do the values reported by exometer_influxdb become 0.5. With the graphite reporter everything is fine. I can't reproduce it in a simpler example for now.

exometer:new([metric, scheduler_usage], gauge, []),
spawn_link(fun scheduler_usage/0),

%% Requires erlang:system_flag(scheduler_wall_time, true) to have been set;
%% otherwise erlang:statistics(scheduler_wall_time) returns undefined.
scheduler_usage() ->
    Ts0 = lists:sort(erlang:statistics(scheduler_wall_time)),
    timer:sleep(1000),
    Ts1 = lists:sort(erlang:statistics(scheduler_wall_time)),
    %% Sum active time (A) and total time (T) deltas over all schedulers.
    {A, T} = lists:foldl(
        fun({{_, A0, T0}, {_, A1, T1}}, {Ai, Ti}) ->
            {Ai + (A1 - A0), Ti + (T1 - T0)}
        end,
        {0, 0}, lists:zip(Ts0, Ts1)),
    exometer:update([metric, scheduler_usage], A/T),
    scheduler_usage().

  {exometer, [
      {report, [
          {reporters, [
              {exometer_report_influxdb, [
                  {protocol, http},
                  {host, <<"some host">>},
                  {port, 8086},
                  {db, <<"dev">>}
              ]}
          ]},
          {subscribers, [
              {exometer_report_influxdb, [metric, scheduler_usage], value, 1000, true, []}
          ]}
      ]}
  ]}

netDalek commented 8 years ago

I've found the real reason: hackney:send_request takes 200-300 ms.

surik commented 8 years ago

Hi @netDalek. Thanks for your report. I think 200-300 ms for one hackney:send_request is a lot. I will take a look at it. To better understand what is happening, can you show me the result of exometer_report:list_subscriptions(exometer_report_influxdb)? And if you have the InfluxDB logs, it would be very useful to know how long each request takes on its side.

As a workaround, I suggest using UDP or batch sending.
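For illustration, batch sending might be configured roughly like this. This is only a sketch: the batch_window_size option name and its millisecond unit are assumed from the exometer_influxdb README and should be checked against the version in use, and the host and the 5000 ms window are placeholders.

```erlang
%% Sketch of a sys.config fragment; not taken from the reporter's setup.
{exometer, [
    {report, [
        {reporters, [
            {exometer_report_influxdb, [
                {protocol, http},
                {host, <<"localhost">>},      %% placeholder host
                {port, 8086},
                {db, <<"dev">>},
                %% Assumed option: collect measurements for 5000 ms
                %% and send them in a single request (0 would disable batching).
                {batch_window_size, 5000}
            ]}
        ]}
    ]}
]}
```

For UDP, the same fragment would presumably use {protocol, udp} together with whatever port the InfluxDB UDP listener is configured on; that listener has to be enabled on the InfluxDB side.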

GalaxyGorilla commented 8 years ago

@netDalek: In the problem description you mentioned that you have to wait minutes for new values. That does not really match the 200-300 ms. Maybe there is another bug as well :3

surik commented 7 years ago

I'm closing this. If you still have the problem, feel free to reopen the issue.