voxpupuli / puppetboard

Web frontend for PuppetDB
https://pypi.org/project/puppetboard/
Apache License 2.0
710 stars 237 forks source link

Stop using the default / completely random SECRET_KEY value #721

Closed TwizzyDizzy closed 1 year ago

TwizzyDizzy commented 1 year ago

(Original issue title: Query sometimes returns results, sometimes not)

Describe the bug

Query page sometimes shows a result, sometimes does not, even when sending the same query several times.

Javascript console doesn't show obvious errors... it is known to occur in several OS/Browser scenarios (Windows, OS X, Chromium, Firefox).

What I did notice though: in cases that do not return a result, the PuppetDB access log shows one query, while in cases that show a result, the PuppetDB access log shows two queries:

Not showing results

0.0.0.0 - - [21/Oct/2022:11:53:39 +0200] "GET /pdb/query/v4/environments HTTP/1.1" 200 273 "-" "puppetboard/4.1.1 (r/2.28.1)" 3 -

Showing results

0.0.0.0 - - [21/Oct/2022:11:53:49 +0200] "GET /pdb/query/v4/environments HTTP/1.1" 200 273 "-" "puppetboard/4.1.1 (r/2.28.1)" 3 -
0.0.0.0 - - [21/Oct/2022:11:53:49 +0200] "GET /pdb/query/v4?query=environments%7B%7D HTTP/1.1" 200 273 "-" "puppetboard/4.1.1 (r/2.28.1)" 3 -

Should be easily reproducable by simply sending the query environments{} multiple times in a row... if nobody can reproduce this, it must be an error on our side, then I'd need to dig deeper on my side... so any feedback is appreciated.

Puppetboard version

4.1.1

Environment and installation method

settings.py

      import os

      PUPPETDB_HOST = '%{facts.networking.fqdn}'
      PUPPETDB_PORT = 8081
      PUPPETDB_SSL_VERIFY = '/etc/puppetlabs/puppet/ssl/certs/ca.pem'
      PUPPETDB_KEY = '/opt/puppetboard-venv/private.pem'
      PUPPETDB_CERT = '/opt/puppetboard-venv/public.pem'
      PUPPETDB_TIMEOUT = 20
      DEFAULT_ENVIRONMENT = '*'
      SECRET_KEY = os.urandom(24)
      DEV_LISTEN_HOST = '127.0.0.1'
      DEV_LISTEN_PORT = 5000
      DEV_COFFEE_LOCATION = 'coffee'
      UNRESPONSIVE_HOURS = 24
      ENABLE_QUERY = True
      LOCALISE_TIMESTAMP = True
      LOGLEVEL = 'debug'
      NORMAL_TABLE_COUNT = 250
      LITTLE_TABLE_COUNT = 50
      TABLE_COUNT_SELECTOR = [10, 25, 50, 100, 250, 500, 1000]
      OFFLINE_MODE = True
      ENABLE_CATALOG = True
      OVERVIEW_FILTER = None
      GRAPH_TYPE = 'donut'
      FAVORITE_ENVS = ['production','test','development']
      GRAPH_FACTS = ['architecture',
                     'clientversion',
                     'domain',
[...]
                     'sudoversion',
                     'tier' ]
      INVENTORY_FACTS = [('Hostname', 'fqdn'),
                         ('IP Address', 'ipaddress'),
                         ('OS', 'operatingsystem'),
                         ('OS-Release', 'operatingsystemrelease'),
                         ('Architecture', 'hardwaremodel'),
                         ('Kernel Version', 'kernelrelease'),
                         ('Puppet Version', 'puppetversion'), ]
      REFRESH_RATE = 30
      DAILY_REPORTS_CHART_ENABLED = True
      DAILY_REPORTS_CHART_DAYS = 8
      SHOW_ERROR_AS = 'friendly'
      CODE_PREFIX_TO_REMOVE = '/etc/puppetlabs/code/environments'
gdubicki commented 1 year ago

Hi, thanks for reporting this @TwizzyDizzy!

Can you please try generating a random string once and setting it in your config under SECRET_KEY?

I think that our default os.urandom(24) is wrong as it generates a different value for each app process and probably you are running 2+ processes of the app with uWSGI, so when you make a query request from a page generated by one process and it goes to another process, the secrets mismatch..

TwizzyDizzy commented 1 year ago

That sounds entirely possible! Indeed I have two WSGI processes running.

Also read up on what that static key is actually about (CSRF prevention) and after reading it, your answer makes sense even more.

Thanks for the hint, will give it a shot and report back tomorrow.

Cheers Thomas

TwizzyDizzy commented 1 year ago

I have set the SECRET_KEY statically and now both processes use the same key, hence the error disappeared.

Indeed, I'd say that the following default is... at least misleading and should be commented upon accordingly:

Want a PR for that?

Cheers Thomas

gdubicki commented 1 year ago

Glad that it helped!

A PR would be nice, but I am not sure how to deal with this in a best way. I am afraid that a comment might go unnoticed. Perhaps assuming that the Puppetboard is usually not set up in public we should hardcode a default value? 🤔

TwizzyDizzy commented 1 year ago

I am afraid that a comment might go unnoticed.

Yep, also thought about this and you're probably right. But then again, setting the same SECRET_KEY everywhere would solve the immediate problem but would also be "insecure by default" (I know, very limited in scope, but nonetheless) because then probably nobody will adjust the default either. Catch 22 really.

You could automatically generate it, but then you would run into the problem that you would need to generate a (locally) stable key across (local) instances [1] which kinda works against the randomness requirement.

So the options are:

I guess the minimal impact really is to document it and keep the os.urandom(24) as a default and document properly, that you need to set this statically if you're running > 1 WSGI processes.

[1] imagine a scenario where you want to have a redundant Puppetboard setup with different load balancer backends on different hosts. Then all instances on ONE instance would have a stable SECRET_KEY, but another backend will have different ones leading, again, to the same issue but one abstraction level above :D

Cheers Thomas

gdubicki commented 1 year ago

Yes, I’ve had a similar thought process. :)

But I realized that the current default is also bad even with a single process of the app. It generates a problem like yours when the app restarts. And restarts may be happening often with harakiri in uWSGI or if the app is running in k8s.

Perhaps we should remove the default value and require everyone to set their own secret? The only downside is that technically it would be a backward-incompatible change, so we should uptick the app’s major version and it feels wrong with such a minor change..

gdubicki commented 1 year ago

Maybe better docs and an appropriate warning in the logs if the default is used? We could use a semi-random default, f.e. with a „default-” prefix which would trigger the warning.

TwizzyDizzy commented 1 year ago

But I realized that the current default is also bad even with a single process of the app. It generates a problem like yours when the app restarts. And restarts may be happening often with harakiri in uWSGI or if the app is running in k8s.

Indeed, didn't think of that.

Perhaps we should remove the default value and require everyone to set their own secret?

Yeah, thought about that, too, though it seemed a bit too unwelcoming to new users. But in the end it's the most sane solution, all things considered.

The only downside is that technically it would be a backward-incompatible change, so we should uptick the app’s major version and it feels wrong with such a minor change..

It is breaking. And if the rule is to major-bump on backwards-incompatible changes, that is how it needs to be, despite of any feeling (even though I share it). Sometimes a big initial error (in hindsight, I mean - putting the os.urandom there in the first place) can only be changed by a big, corrective action (major-bump, breaking change).

Maybe better docs and an appropriate warning in the logs if the default is used? We could use a semi-random default, f.e. with a „default-” prefix which would trigger the warning.

That seems like a half-baked solution in order to not to make a proper decision. Docs, yes, but adding additional code to work around a breaking change seems undecisive. After all, reading the changelogs of software you use should not be optional :)

Cheers Thomas

TwizzyDizzy commented 1 year ago

Closing. Has been concluded with the release of 5.0.0. (https://github.com/voxpupuli/puppetboard/tree/v5.0.0). The default SECRET_KEY is now empty, forcing the user to set it explicitly.

Cheers Thomas