raintank / raintank-docker

raintank docker images and dev stack DEPRECATED / UNMAINTAINED
https://blog.raintank.io/docker-based-development-environment/

various issues getting the stack to work #55

Closed gugansankar closed 8 years ago

gugansankar commented 8 years ago

Hi,

I ran the build script and all the containers are shown there. But I'm unable to connect to grafana using http:// . When I check the logs, it reports a .bra.toml not found error.

```
root@grafana:/go/src/github.com/raintank/grafana# tailf /var/log/raintank/grafana-dev.log
[Bra] 01-21 11:34:25 [ INFO] App Version: 0.4.0.1229
[Bra] 01-21 11:34:25 [FATAL] .bra.toml not found in work directory
[Bra] 01-21 11:34:32 [ INFO] App Version: 0.4.0.1229
```

Please help me

Dieterbe commented 8 years ago

@gugansankar can you tell me which commit hash you're on in your grafana directory, and what does git status say? the .bra.toml file should be in there normally.

gugansankar commented 8 years ago

Not sure why it's not there in my docker container.

screenshot from 2016-01-22 11 20 09

Dieterbe commented 8 years ago

please show me the output of these two commands:

```
git status
git show --stat
```

gugansankar commented 8 years ago

Thanks for your continuous follow-up on this issue. Appreciate it.

raintank-docker

Dieterbe commented 8 years ago

i meant in the /go/src/github.com/raintank/grafana directory

gugansankar commented 8 years ago

```
root@grafana:/go/src/github.com/raintank/grafana# git status
fatal: Not a git repository (or any parent up to mount point /go/src/github.com/raintank/grafana)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
root@grafana:/go/src/github.com/raintank/grafana# git show --stat
fatal: Not a git repository (or any parent up to mount point /go/src/github.com/raintank/grafana)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
root@grafana:/go/src/github.com/raintank/grafana#
```

gugansankar commented 8 years ago

These are the containers started when I run the grafana bin file.

raintank1

gugansankar commented 8 years ago

I even tested the same on a fresh CentOS machine. Still the same issue. Not sure whether I made any mistake while installing it.

Steps I used to install :

  1. Installed docker using yum
  2. Installed docker-compose using: curl -L https://github.com/docker/compose/releases/download/1.1.0/docker-compose-$(uname -s)-$(uname -m) > /usr/local/bin/docker-compose
  3. Cloned this git repository: git clone https://github.com/raintank/raintank-docker.git
  4. cd raintank-docker/
  5. ./build_all.sh
  6. docker-compose -f fig-dev.yaml up -d
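
As an aside, the download URL in step 2 relies on shell command substitution to pick the platform-specific binary name. A quick way to see exactly which file curl would request on a given machine (just a sketch, nothing raintank-specific):

```shell
# docker-compose release binaries are named docker-compose-<OS>-<Arch>,
# e.g. docker-compose-Linux-x86_64; print the name the curl in step 2 requests
name="docker-compose-$(uname -s)-$(uname -m)"
echo "$name"
```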

Once done, the same set of docker containers are started as in the screenshot from my last reply, but I'm still getting the same error.

raintank2

Please help me to fix this issue

Dieterbe commented 8 years ago

ok our readme was confusing and out of date. sorry about that. the new instructions should clear it up, see https://github.com/raintank/raintank-docker/commit/227542aa305b4601059092108c2973814c2a4685

gugansankar commented 8 years ago

Thanks for updating the Readme file. Now all the containers are started.

But there is still no data shown on my dashboard. No issue creating probes; it seems the data is not being stored at the end path.

I think you are storing data in Graphite. Please correct me if I'm wrong.

Also, please give some input on this issue.

Screenshot for your reference, Graphite Log: raintank3

Probes status: raintank4

Dashboard: raintank5

Endpoints : with Nodata status raintank6

Dieterbe commented 8 years ago

That graphite instance is just for monitoring the dev stack itself. Your measurement data goes into a graphite-compatible system we built using graphite-api and a storage service called metric_tank, which uses Cassandra. We are right in the middle of a transition with it though: you have to clone the raintank/raintank-metric project in raintank_code, check out the metricDefsRefactor branch in it, and run ./build.sh in it (if it doesn't get the needed dependencies, run go get ./... before build.sh), and then restart the docker stack. We will iron this out shortly, but for now that should do it. If it still doesn't work, what always helps is going into the screen session, looking through the tabs for the different apps, and looking for any errors or crashes.
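
the steps above, sketched as a shell function (the raintank_code path is whatever checkout directory your stack uses — assumed here, adjust to your layout):

```shell
# sketch of the metric_tank rebuild steps described above; run it from
# the directory that contains raintank_code (path assumed, not verbatim)
rebuild_metric_tank() {
    cd raintank_code || return 1
    git clone https://github.com/raintank/raintank-metric &&
    cd raintank-metric &&
    git checkout metricDefsRefactor &&
    # go get is only needed if build.sh misses dependencies
    go get ./... &&
    ./build.sh
}
```

then restart the docker stack as usual.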

gugansankar commented 8 years ago

It's still not working, even though I followed the steps you mentioned in your last reply. This time I started debugging each container's logs, where I can see some error messages.

Error once I added the endpoints [log from grafana-dev.log]: raintank10

Graphite-api container [log from graphite-flask.log]: raintank11

Also, in the metric_tank container, the application is not started. After checking the supervisord log, it seems the bin file path is wrong: raintank12

Even if I set it to the correct one, it's not working: raintank13

Kindly check and help me to fix it.

Dieterbe commented 8 years ago

For some reason statsd is not up, which causes metric_tank to fail, which causes the graphite errors, which cause the grafana errors. Can you see an error in the statsd output or container? Try `docker logs <container id>`.

gugansankar commented 8 years ago

Hi, I couldn't see any error logs in the statsd container.

But I noticed one point here that may help you check this further:

  1. The statsd container's ports are mapped as 0.0.0.0:8126->8126/tcp, which means only TCP connections are accepted here.
  2. You may have noticed in the images from my last reply [especially the "StatsD not started" ones] that it's trying to connect to the statsd container's UDP port. I think this is why metric_tank is not running here. Please correct me if I'm wrong. Once again I've attached that metric_tank connection error log with statsd here for your reference: raintank22

Docker container Log: raintank20

Established status Inside the statsd container: raintaink21

gugansankar commented 8 years ago

Any chance to check the above? Thanks.

Dieterbe commented 8 years ago

You're right, the 8125 UDP mapping is missing from fig-dev.yaml; however, interestingly, it has always worked fine for me despite this. We did recently switch statsd clients to one that does the listening check at startup. For some reason netstat says the 8125 socket is not in listening state. That may be the problem.
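
if you want to poke at the udp path yourself: with udp, a send succeeds whether or not anything is listening, so firing a test datagram only proves the send path — pair it with netstat inside the statsdaemon container to see if anything is actually bound to 8125. a minimal probe using bash's /dev/udp device (host and port here are illustrative; adjust for in-container checks):

```shell
# send one statsd-format counter datagram to udp/8125 (bash-only /dev/udp);
# note: this succeeds even with no listener -- it only checks the send path
bash -c 'echo "raintank.test:1|c" > /dev/udp/127.0.0.1/8125' && echo sent
```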

gugansankar commented 8 years ago

Ok, could you please advise me how to move further on this? Also, are you going to apply the above-mentioned fix to the containers?

Dieterbe commented 8 years ago

I'm in a hospital with a broken leg. It'll take me at least a few days. Not much more I can do for now.

gugansankar commented 8 years ago

Please take care of your health. No hurry replying to me on this. And hats off to you :+1: for replying even in this situation.

As it's a great project, that's why I keep checking on it and updating you on the status. Once again, thank you so much for your kind replies, and take care of your health. :+1:

gugansankar commented 8 years ago

Hi , Any chance to check this ?.

Dieterbe commented 8 years ago

actually now that i think about it..i was about to commit:

```diff
--- a/fig-dev.yaml
+++ b/fig-dev.yaml
@@ -145,6 +145,7 @@ statsdaemon:
   image: raintank/statsdaemon
   hostname: statsdaemon
   ports:
+    - "8125:8125"
     - "8126:8126"
   links:
           - graphitemon:graphitemon
```

but then i realized this section is to expose ports to the host. the containers internally can talk to statsdaemon because they use the "links" feature.
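
to make that distinction concrete, in fig/docker-compose terms: `ports:` publishes a container port on the host, while `links:` wires containers together by hostname. a hypothetical consumer entry (service and alias names assumed for illustration, not copied from the real file) needs only the link to reach statsdaemon:8125/udp:

```yaml
# hypothetical fig-dev.yaml consumer entry -- no "ports:" needed for this;
# the link makes the hostname "statsdaemon" resolvable from this container
metric_tank:
  links:
    - statsdaemon:statsdaemon
```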

gugansankar commented 8 years ago

Hi, still no luck. Now I tried to re-install this in my environment from scratch. This time, I tested on Ubuntu 14.04.

Unfortunately, this time it failed during the build.sh process. I just cloned a fresh raintank-docker and ran the build script:

```
git clone https://github.com/raintank/raintank-docker
cd raintank-docker/
./build.sh
```

The process failed with the below error:

screenshot from 2016-02-26 15 55 31

screenshot from 2016-02-26 15 59 14

Dieterbe commented 8 years ago

the first one is ok. it's because grafana uses godep to pin the deps to the right versions, so during a regular build there can be api mismatches. that's why it skips on such errors.

for the second problem, try applying this patch and then try again:

```diff
diff --git a/grafana/Dockerfile b/grafana/Dockerfile
index b80319a..5b6d9ec 100644
--- a/grafana/Dockerfile
+++ b/grafana/Dockerfile
@@ -16,8 +16,8 @@ RUN go get github.com/Unknwon/bra
 RUN go get github.com/tools/godep
 WORKDIR /go/src/github.com/grafana/grafana
 RUN go run build.go setup
-RUN godep restore
-RUN go build .
+#RUN godep restore
+RUN godep go build .
 RUN npm install && npm install -g grunt-cli && grunt build

 EXPOSE 80
```

gugansankar commented 8 years ago

Thanks for the patch. Hopefully it will fix the reported issue; I'll get back to you if I hit any other errors.

gugansankar commented 8 years ago

Thanks. FYI, the given patch works for me to run ./build_all.sh.

gugansankar commented 8 years ago

I'm getting some error message on the dashboard after adding a probe. Could you please check and help me fix this issue?

11

gugansankar commented 8 years ago

Hi, the above reported problem is fixed.

Could you please help me with why the status is not updated on the below panels? (The heart symbol color is not updated; I'm not sure where it fetches the value from.) I couldn't find any error messages. Could you please direct me to the exact path to find the issue?

12

13

Dieterbe commented 8 years ago

@gugansankar in screen, go through all the tabs and make sure all processes are running fine. in particular, the hearts should be filled in by the alerting consumer (even when alerting is not enabled, that just means no notifications, the consumer should still process jobs and set the right state). check the grafana log to see if it's properly processing, or the included alerting dashboard to see if jobs are being created and executed.

Dieterbe commented 8 years ago

i updated my stack and see the same problem as you. i'm seeing errors in the grafana log like

```
2016/02/27 01:31:21 [D] job results - job:<Job> monitorId=3 generatedAt=2016-02-27 01:31:21.000969118 +0000 UTC lastPointTs=2016-02-27 01:30:51 +0000 UTC definition: <CheckDef> Crit: ''sum(t(streak(graphite("litmus.localhost.*.ping.error_state", "30s", "", "")) == 3 , "")) >= 3' -- Warn: '0' err:non-fatal: "graphite ParseErrors [expected >= 3 series. got 1]:\nTrace: {Request start:2016-02-27 01:30:21 +0000 UTC end:2016-02-27 01:30:51 +0000 UTC targets:[litmus.localhost.*.ping.error_state] url:http://graphite-api:8888/render?format=json&from=1456536621&target=litmus.localhost.%2A.ping.error_state&until=1456536651 Response:HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 127\r\nCache-Control: max-age=60\r\nContent-Type: application/json\r\nDate: Sat, 27 Feb 2016 01:31:21 GMT\r\nExpires: Sat, 27 Feb 2016 01:32:21 GMT\r\nLast-Modified: Sat, 27 Feb 2016 01:31:21 GMT\r\nServer: gunicorn/19.4.5\r\n\r\n[{\"target\": \"litmus.localhost.dev1.ping.error_state\", \"datapoints\": [[0.0, 1456536630], [0.0, 1456536640], [0.0, 1456536650]]}]}" res:Unknown
```

for some reason it's expecting 3 series but we only have 1 probe (and hence only 1 series). i'll figure this out and let you know once it's fixed.

Dieterbe commented 8 years ago

@gugansankar this is because of a UX bug that only manifests itself on installations with a very small number of collectors, such as the default configuration of this stack. see https://github.com/raintank/grafana/issues/480 for more details

what you can do to work around it is make sure that the number of probes in the alerting configuration is set to 1, like so: num_coll

you can also run ./launch_dev_collector.sh a few times after you did ./launch_dev.sh and change the footprint of your monitor so that it uses 3 or more collectors. but of course the first solution is simpler.
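
the second workaround, sketched as a guarded loop (the script name comes from this stack; the guard just makes it a no-op outside a raintank-docker checkout):

```shell
# start three extra dev collectors so a monitor can span >= 3 collectors;
# silently skip when launch_dev_collector.sh is absent from the cwd
if [ -x ./launch_dev_collector.sh ]; then
    for i in 1 2 3; do
        ./launch_dev_collector.sh
    done
fi
```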

i just tried both approaches and they solve the problem for me. lemme know how it works for you.

gugansankar commented 8 years ago

Hi, the above trick works awesome.

16

One more problem I faced: I tried to install and run a probe manually from a different region. It was added successfully on the probe panel, but it doesn't show up in the graph template section; it only shows the default collector "dev1", not my custom probe.

14

In the above log, you can see I added two additional probes, but I couldn't see the same on the dashboard. Not sure whether I need to add this setting somewhere else; please guide me.

15

Dieterbe commented 8 years ago

when you want to use the new probes, you should enable them for your endpoint in the endpoint configuration. once that has happened and they've submitted some data, they will be available to choose in your dashboard.

gugansankar commented 8 years ago

Thanks for your info. It's working now and I can see the different probe options in dashboards.

But I'm hitting one more error when I try to view the Events dashboard. [It returns a 500 error.]

17

When I check the elasticsearch container, I can see the data/indices are available there.

18

gugansankar commented 8 years ago

Any chance to check why the value can't be retrieved from Elasticsearch? When I query elasticsearch manually, I can see the records there.

gugansankar commented 8 years ago

I'm still unable to fix that issue. Could you please help me fix it?

Dieterbe commented 8 years ago

not sure. maybe @woodsaj has an idea.

Dieterbe commented 8 years ago

closing this since we just refactored the stack to work with grafana3 and the new WP app. we still have to work out a few kinks i think though, see #60