grafana-toolbox / grafana-wtf

Grep through all Grafana entities in the spirit of git-wtf.
GNU Affero General Public License v3.0

Doesn't work for simplest example #113

Closed interfan7 closed 2 months ago

interfan7 commented 10 months ago

I haven't put any effort into investigating this. I just followed the usage instructions in the simplest possible way.

$ grafana-wtf info --format=yaml
2024-01-25 17:03:36,895 [grafana_wtf.commands                ] INFO   : Using Grafana at {Grafana endpoint URL}
2024-01-25 17:03:36,905 [grafana_wtf.core                    ] INFO   : Response cache will expire after 3600 seconds
2024-01-25 17:03:36,908 [grafana_wtf.core                    ] INFO   : Response cache database location is /home/{user}/.cache/grafana-wtf.sqlite
Traceback (most recent call last):
  File "/home/{user}/.local/bin/grafana-wtf", line 8, in <module>
    sys.exit(run())
  File "/home/{user}/.local/lib/python3.8/site-packages/grafana_wtf/commands.py", line 322, in run
    response = engine.info()
  File "/home/{user}/.local/lib/python3.8/site-packages/grafana_wtf/core.py", line 268, in info
    version=health.get("version"),
AttributeError: 'str' object has no attribute 'get'

The Grafana endpoint and access is proven to work with this:

$ curl -H "Authorization: Bearer {token}" {Grafana endpoint}/api/dashboards/home
$

Grafana v8.3.3 (30bb7a93ca)

amotl commented 10 months ago

Dear Leonid,

thank you for writing in. That looks like a bug, we will look into it.

Could you support us by sharing what comes back when requesting the /api/health endpoint on your Grafana instance? This is the endpoint the relevant code sends its request to.

If all goes well, it receives a response like this:

{
  "commit": "<anything>",
  "database": "ok",
  "version": "8.3.3"
}

... which, when decoded, yields a dictionary, where accessing the version key should work as intended.
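As a minimal sketch of that decoding step, using the sample payload above: `extract_version` is a hypothetical helper for illustration only, not a function from grafana-wtf.

```python
import json

# Sample /api/health body, copied from the example response above.
SAMPLE_BODY = '{"commit": "<anything>", "database": "ok", "version": "8.3.3"}'


def extract_version(body: str):
    """Decode a /api/health body and return its "version" field (hypothetical helper)."""
    health = json.loads(body)  # a JSON object decodes to a Python dict
    return health.get("version")


print(extract_version(SAMPLE_BODY))
```

If the value handed to `.get()` were a plain `str` instead of a dict, Python would raise the same `AttributeError: 'str' object has no attribute 'get'` shown in the traceback.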

With kind regards, Andreas.

NB: Please also tell us about the version of grafana-wtf you are using.

amotl commented 10 months ago

Hi again,

my best guess is that the URL to the Grafana instance is slightly wrong. Maybe you can share its shape, while still redacting its content, so we can inspect if it is semantically correct?

If the client hits a URL which does not respond with an application/json response, I guess it can lead to the situation you are observing. That said, the error handling could certainly be improved to tell the user more clearly about this situation.

Thank you for reporting, we will implement that error handling improvement as part of the next development iteration.
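One way to detect a non-JSON response early is to inspect the Content-Type header before decoding. The following is only a sketch under that assumption; `is_json_content_type` is a hypothetical name, not part of grafana-wtf.

```python
def is_json_content_type(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes JSON (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


print(is_json_content_type("application/json; charset=UTF-8"))
print(is_json_content_type("text/html; charset=utf-8"))
```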

With kind regards, Andreas.

interfan7 commented 10 months ago

@amotl Thank you for those kind responses.

Edit: you may prefer to skip to my next comment. I've overcome the JSON issue by querying the API from within the GF 8 server itself. Although slightly less than ideal, that's great for me now.


I've tried to access GF from curl, grafana-wtf and a Python script. All have failed because the returned response is not JSON but has content-type: text/html. I agree the error handling could be more informative.

Mind you, when I changed the Python script to access GF 10 (not yet deployed to prod), it worked. The GF 8 URL initially returns code 302, which is a redirect, and then returns code 200, albeit with text rather than JSON. My curl command is: curl -L -H 'accept: application/json' -H 'content-type: application/json' -H "Authorization: Bearer {our token}" 'https://grafana-{company name shortcut}alpha.via.{company name}.com/api/health'

It seems that it doesn't matter which path (e.g. /api/...) I throw at it, or whether I even provide the bearer token. It returns that PHP/HTML text (pardon my ignorance, or my not being curious enough to confirm which).

NB: Please also tell us about the version of grafana-wtf you are using.

The very latest as of about a week ago. It's my first attempt to use this tool. I fell back to Python/curl only because I thought it might work for me where the tool didn't.

interfan7 commented 10 months ago

It seems to work from within the GF8 service itself.

However, there is still an issue. By running this:

grafana-wtf info --grafana-url=https://localhost:3000/?verify=no --grafana-token={my token} --format=yaml

I get this log which ends with an error:

2024-01-29 12:05:10,543 [grafana_wtf.commands                ] INFO   : Using Grafana at https://localhost:3000/?verify=no
2024-01-29 12:05:10,549 [grafana_wtf.core                    ] INFO   : Response cache will expire after 3600 seconds
2024-01-29 12:05:10,551 [grafana_wtf.core                    ] INFO   : Response cache database location is /home/{my name}/.cache/grafana-wtf.sqlite
2024-01-29 12:05:10,554 [grafana_wtf.core                    ] INFO   : Scanning dashboards
2024-01-29 12:05:10,557 [grafana_wtf.core                    ] INFO   : Found 553 dashboard(s)
0it [00:00, ?it/s]
2024-01-29 12:05:10,560 [grafana_wtf.core                    ] INFO   : Fetching dashboards in parallel with 5 concurrent requests        | 0/553 [00:00<?, ?it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 553/553 [00:00<00:00, 644.58it/s]
2024-01-29 12:05:13,117 [grafana_wtf.core                    ] INFO   : Scanning datasources█████████████████████████████▋     | 529/553 [00:00<00:00, 766.03it/s]
2024-01-29 12:05:13,122 [grafana_wtf.core                    ] INFO   : Found 132 data source(s)
2024-01-29 12:05:13,138 [grafana_wtf.core                    ] ERROR  : Computing basic statistics failed: Client Error 403: Permission denied

After the log there is also some output. I'm afraid it may be incomplete because of the error.

amotl commented 10 months ago

Hi again.

Mind you when I changed the Python script to access GF 10 (not yet deployed to prod) it has worked.

This sounds promising.

I get this log which ends with an error: Using Grafana at https://localhost:3000/?verify=no Found 553 dashboard(s) Found 132 data source(s) Computing basic statistics failed: Client Error 403: Permission denied

Weird. And this is only happening on your Grafana 8 instance against prod? Why localhost then?

It seems to work from within the GF8 service itself.

Ah, you are now running grafana-wtf on the same host, right?

I think you will still need to address it by using a full canonical URL, and not localhost, regardless of which host you are running grafana-wtf on. However, I never invoked grafana-wtf in such a scenario, so I am not 100% sure.

Maybe indeed grafana-wtf has a flaw in this regard, and would need a fix. It will be sweet to be able to resolve this problem in one way or another, to support your environment better. However, currently I don't have any straight idea how to approach this problem further, without getting access to your premises, or other kinds of remote debugging sessions.

Maybe grafana-wtf needs corresponding --verbose or --debug options to improve self-servicing on such occasions?

amotl commented 8 months ago

Hi again.

It seems to work from within the GF8 service itself.

However there is still an issue:

Computing basic statistics failed: Client Error 403: Permission denied

Have you been able to resolve this issue, or does it still haunt you?

interfan7 commented 8 months ago

Hi :-)

While GF was not returning a JSON structure, grafana-wtf failed with the reported error. Once I resolved that by accessing GF through a different network path (i.e. plain HTTP, not HTTPS), grafana-wtf seemed to work except for this error:

[grafana_wtf.core                    ] ERROR  : Computing basic statistics failed: Client Error 403: Permission denied

However, it wasn't of much use to me by then, because I already had a quick Python script which fetches panels for a given query for me, not just dashboards, so at that point I stopped trying grafana-wtf.

Bottom line: you could add a check that the returned value is indeed JSON, and otherwise exit with a clear message.
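A check along those lines could look roughly like this. It is a sketch only: `parse_health` and `InvalidHealthResponse` are hypothetical names, and the error wording is made up for illustration rather than taken from grafana-wtf.

```python
import json


class InvalidHealthResponse(Exception):
    """Raised when the health endpoint does not return a JSON object (hypothetical)."""


def parse_health(body: str) -> dict:
    """Decode a response body, failing with a readable message if it is not a JSON object."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Truncate the echoed body so an HTML page does not flood the terminal.
        raise InvalidHealthResponse(f"Expected JSON, got: {body[:200]!r}") from None
    if not isinstance(data, dict):
        # Valid JSON, but e.g. a bare string or list instead of an object.
        raise InvalidHealthResponse(f"Expected a JSON object, got: {type(data).__name__}")
    return data


try:
    parse_health("<html>login page</html>")
except InvalidHealthResponse as ex:
    print(f"CRITICAL: {ex}")
```

A caller would catch `InvalidHealthResponse`, print the message, and exit with a non-zero status instead of letting a traceback escape.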

amotl commented 8 months ago

Dear Leonid,

we looked a bit closer into the issue, and want to salute you for discovering an edge case. We have been able to reproduce the problem by whipping up a quick HTTP server which responds to /api/health requests with a basic string.

import responder

api = responder.API()

@api.route("/api/health")
async def greet_world(req, resp):
    resp.text = "Hello, world!"

if __name__ == '__main__':
    api.run(port=3000)

That exactly causes the problem you reported.

$ grafana-wtf info --format=yaml --grafana-url=http://localhost:3000/
Traceback (most recent call last):
  File "/Users/amo/dev/panodata/sources/grafana-wtf/.venv/bin/grafana-wtf", line 8, in <module>
    sys.exit(run())
  File "/Users/amo/dev/panodata/sources/grafana-wtf/grafana_wtf/commands.py", line 239, in run
    log.info(f"Grafana version: {engine.version}")
  File "/Users/amo/dev/panodata/sources/grafana-wtf/grafana_wtf/core.py", line 316, in version
    return self.health.get("version")
AttributeError: 'str' object has no attribute 'get'
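For readers without the responder package, a comparable stub can be sketched with the Python standard library alone. This is an illustrative variant, not the reproduction used above; `PlainTextHealth` and `fetch_health_from_stub` are hypothetical names, and the stub binds to an arbitrary free port rather than 3000.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PlainTextHealth(BaseHTTPRequestHandler):
    """Answer every GET with a plain string instead of JSON."""

    def do_GET(self):
        body = b"Hello, world!"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress per-request logging


def fetch_health_from_stub() -> str:
    """Start the stub, request /api/health once, and return the body."""
    server = HTTPServer(("127.0.0.1", 0), PlainTextHealth)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/api/health"
        # Bypass any proxy configured in the environment for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as resp:
            return resp.read().decode()
    finally:
        server.shutdown()
        server.server_close()


print(fetch_health_from_stub())
```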

On all other occasions, for example when given an obviously wrong URL, grafana-wtf complains more helpfully by stating concisely what the error is about:

$ grafana-wtf info --format=yaml --grafana-url=http://localhost:3000/foo
2024-03-31 00:43:00,734 [grafana_wtf.commands                ] INFO    : Grafana location: http://localhost:3000/foo
2024-03-31 00:43:00,744 [grafana_wtf.core                    ] INFO    : Response cache will expire after 3600 seconds
2024-03-31 00:43:00,746 [grafana_wtf.core                    ] INFO    : Response cache database location is /Users/amo/Library/Caches/grafana-wtf.sqlite
2024-03-31 00:43:00,768 [grafana_wtf.core                    ] CRITICAL: The request to http://localhost:3000/foo/api/health failed: Client Error 404: Not Found

The upcoming version will also convey that status correctly through the exit code of the program invocation.

$ echo $?
1
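The general pattern of turning a failure into a non-zero exit code can be sketched as follows. The names `run` and `failing_check` are hypothetical and this is not the actual grafana-wtf implementation.

```python
import logging

log = logging.getLogger("example")


def failing_check():
    """Stand-in for a request that fails (hypothetical)."""
    raise ConnectionError("Client Error 404: Not Found")


def run(check) -> int:
    """Run a check and map success or failure to a process exit code."""
    try:
        check()
    except Exception as ex:
        log.critical("The request failed: %s", ex)
        return 1  # non-zero signals failure to the calling shell
    return 0


print(run(failing_check))
```

In a real entry point, the return value would be passed to `sys.exit()`, which is what makes `echo $?` report it.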

Regarding the obscure error message AttributeError: 'str' object has no attribute 'get', we will need to come up with a fix and include it in the upcoming version of grafana-wtf. Thanks again for reporting that flaw.

With kind regards, Andreas.

amotl commented 8 months ago

That patch may improve the situation.

This will be the improved behaviour of the program when it hits a situation like the one you described.

$ grafana-wtf info --format=yaml --grafana-url=http://localhost:3000/
CRITICAL: The request to http://localhost:3000/api/health failed: Invalid response, content was: Hello, world!
$ echo $?
1

amotl commented 8 months ago

Dear Leonid,

grafana-wtf 0.19.0 has been released, including corresponding patches to improve this edge-case situation. If you are able to validate the new version in the same scenario which caused hiccups for you, we will be all ears to learn about the outcome. Thanks for your support!

With kind regards, Andreas.

amotl commented 2 months ago

Hi again. We hope this issue can be closed without any oversights. Thanks again for your contributions!

If you can still observe any problems in this regard, feel free to re-open or create a separate ticket. Thanks!