buildbot / buildbot

Python-based continuous integration testing framework; your pull requests are more than welcome!
https://www.buildbot.net
GNU General Public License v2.0
5.28k stars 1.63k forks source link

cleanupTextFromSecrets replaces words that are not secrets #3750

Open hweemiin opened 7 years ago

hweemiin commented 7 years ago

When using randomly generated words as password, the new secrets management system in BuildBot 9 will obfuscated words in logs that are not password, for example, if the password happens to be 'echo', the 'echo' in the command-line is obfuscated as well. This can be illustration using following unittest:

class TestInterpolateSecretsIncorrectObfuscation(unittest.TestCase):
    def setUp(self):
        self.master = fakemaster.make_master()
        fakeStorageService = FakeSecretStorage()
        fakeStorageService.reconfigService(secretdict={"foo": "echo",
                                                       "other": "value"})
        self.secretsrv = SecretManager()
        self.secretsrv.services = [fakeStorageService]
        self.secretsrv.setServiceParent(self.master)
        self.build = FakeBuildWithMaster(self.master)

    @defer.inlineCallbacks
    def test_secret(self):
        command = Interpolate("echo %(secret:foo)s")
        rendered = yield self.build.render(command)
        cleantext = self.build.build_status.properties.cleanupTextFromSecrets(rendered)
        self.assertEqual(cleantext, "echo <foo>") # will fail as outcome is "<foo> <foo>"

I think using plain text search & replace for obfuscation is too naive, would be better to continue support the old obfuscating system in BuildBot 8?

tardyp commented 7 years ago

Hi @hweemiin,

Indeed, in this design, it is required that the password is random enough so that it is not equal to something that might appear in other normal clear text output. It is a common measure to avoid choosing dictionary words as passwords, so I think it is a reasonable assumption. However I admit that I didn't realize this usecase, and think it should be documented.

Compared to previous buildbot eight design, the new system has the advantage of also hiding the secrets if the secret is output by some reason by your script:

e.g a message from git like

cloning from https://user:3kljsdliu@gitserver.example.com/myrepo

will be correctly cleaned up from password, if you are using the new secrets system.

Also, the old secret system was only working for shell commands in the list form, not the one in the string form (which needs to be passed to a shell). Supporting string form would have required a newer version of worker protocol to support that which I wanted to avoid.

hweemiin commented 7 years ago

In this case isn't it still better to keep the old secret system for different needs until a better system is designed to cover all use cases? E.g. users not using random words password and want to hide the password from other sources will use the new secret system, users with needs to better control the obfuscation can continue to use the old system?

A more generic solution to this problem might be introducing 'log pre-process hook' or 'IRenderable post-process hook' mechanism so that users can defined their own obfuscation strategy?

tardyp commented 7 years ago

The old obfuscation system is still there, and used by the source steps.

You could try to see if you can integrate the secret fetching system with it. You will probably need to use your own Renderable that generates (secret, placeholder) tuples instead of one that returns the secret, and configure the build's secret removal system (implemented here https://github.com/buildbot/buildbot/blob/master/master/buildbot/process/properties.py#L221).

hweemiin commented 7 years ago

The old obfuscation system is currently broken due to: https://github.com/buildbot/buildbot/issues/3510 And https://github.com/buildbot/buildbot/pull/3519 seems to hint that it will not be fixed.

So for short-term workaround I might revert these 2 changes on my custom version of BuildBot, for long-term solution I will need more time to think about it.

tardyp commented 7 years ago

The old obfuscation system is currently broken due to: #3510

Arg! Indeed I forgot about that. I am not sure if you can easily revert that change. Let me know if it works.

I still have hard time acknowledging that it is worth to support this usecase. If this i really important for you, and you propose a good solution, I am completly open top revert #3519 and re-support old-style tuple based obfuscation.

hweemiin commented 7 years ago

Just to share another use case I found that break the cleanupTextFromSecrets, it's when the password contains characters that need to be escaped in commandline, e.g. password '2>1' will become '2^>1' in commandline log thus will not be obfuscated.

tardyp commented 7 years ago

@hweemiin I am not sure I understand your usecase what is escaping the password? is buildbot escaping it?

hweemiin commented 7 years ago

Yes, the buildbot escape special characters in the commandline argument, for example, if I have a ShellCommand with command = ['echo', 'my password is ', '2>1'], the log of the ShellCommand will be 'echo my password is 2^>1'.

tardyp commented 7 years ago

Yes, the buildbot escape special characters in the commandline argument, for example, if I have a ShellCommand with command = ['echo', 'my password is ', '2>1'], the log of the ShellCommand will be 'echo my password is 2^>1'.

oh? this is weird, because the list form is not supposed to be interpreted by a shell (except on windows). I don't know where this escape code could be.

hweemiin commented 7 years ago

Yes, I'm running BuildBot on windows with command list, the escape was done by the runprocess.win32_batch_quote() function.