Secret Token and ASCII encoding of headers

beniwohli commented 7 years ago

While there is some support for character encoding in headers, anything but ASCII/Latin1 should be avoided due to possible interoperability issues.

As we include the secret token in the Authorization header, this impacts us. We should either limit the token to the ASCII character space, or introduce an encoding ourselves, e.g. base64.

If we go with base64, we can either decide to Just Do It ™️, without any regards to backwards compatibility (which might be a concern, as this won't make it into alpha1), or use a prefix (e.g. base64: to indicate wether a token is encoded or not.

simitt commented 7 years ago

@ruflin would you say we should stay backwards compatible also for an alpha release or can we break it?

@watson what are your thoughts on this?

watson commented 7 years ago

I don't worry to much about breaking stuff while in alpha. But is there a security concern with limiting us to Latin-1?

ruflin commented 7 years ago

@simitt As long as we are in alpha we can keep breaking stuff but we should not it in the Changelog.

simitt commented 7 years ago

I don't think we should restrict a secret to Latin-1 when we can base64 it, which should not be too much effort.

simitt commented 7 years ago

@watson is it ok with you to use base64 encryption? Then I'd go and implement it server side..

watson commented 7 years ago

I'm concerned about how this makes it extremely hard to make requests with curl if you're using a token. Most characters people are able to type on a western keyboard are within the allowed character set if I'm not mistaken. How often do we think this will be an issue and is it worth sacrificing curl for this? But I'm probably pretty biased because I don't use Cyrillic characters in my passwords normally 😅

simitt commented 7 years ago

Yes I thought about this. However, you basically just have to run one simple script to output a base64.encode("Bearer myToken") before using curl. And we allow for more security.

simitt commented 7 years ago

@ruflin , @jalvz , @roncohen any input on this?

watson commented 7 years ago

I don't think the possibility for more permutations of the token should be the reason for us to change this. There's over 200 different characters in Latin-1 that can be used for the token. That should give us amble security I think. The real issue - if I understand this correctly - is the risk of people using unsupported characters by accident which might leave to weird and hard to debug errors.

jalvz commented 7 years ago

I'd just limit tokens to Latin1 without any encoding, but don't have a strong preference.

ruflin commented 7 years ago

I quite like the idea of using base64 for encoding and decoding so we don't have to apply any restrictions as far as I know. Most users would not even see the difference, only the once that use curl to interact with our server.

@watson For curl we could have an example with something like $(echo -n "token" | base64) (didn't test) in our docs for people to use.

watson commented 7 years ago

@ruflin Good idea. I just made a simple test. It depends a lot on the shell you're using but with zsh this works for me:

curl http://localhost:8200/v1/transactions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer `echo -n 'my-token' | base64`" \
  -d @docs/data/intake-api/generated/transaction/payload.json

It's important to use double quotes, else the code inside the pings will not execute and the type of shell you're running is important. So it's a lot more brittle. But maybe there's a version that works in most/all shells?

ruflin commented 7 years ago

@watson What is the reason you are concerned about curl usage? I assume this is mainly for debugging testing? If yes, perhaps we could include some debugging / testing tools /commands in the agents itself?

watson commented 7 years ago

@ruflin probably just old habit. In Opbeat we used to tell our users to curl our API to track releases (e.g. using a git post-receive hook). So I just wanted to raise the issue to be sure we thought about the curl use-case as well. If y'all are fine doing base64 with that in mind, I'm good as well.

beniwohli commented 7 years ago

what about the base64: prefix? If it's there, the server decodes it, and compares it with the configured value. If not, it assumes ASCII encoding, and compares it.

There's a lot of talk of Latin1 in this issue (probably caused by me mentioning ASCII/Latin1 in the 1st message). However, the HTTP/1.1 RFC suggests that only ASCII should be used:

Historically, HTTP has allowed field content with text in the
ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
through use of [RFC2047] encoding.  In practice, most HTTP header
field values use only a subset of the US-ASCII charset [USASCII].
Newly defined header fields SHOULD limit their field values to
US-ASCII octets.  A recipient SHOULD treat other octets in field
content (obs-text) as opaque data.

watson commented 7 years ago

Good point about US-ASCII @beniwohli. I just want to add that US-ASCII still gives us 95 printable characters to choose from (same as when cranking the 1Password password generator to the max). I.e. this should still be enough for making it secure - so this should only be a usability question IMHO 😃

beniwohli commented 7 years ago

It's less about available entropy and more about what users actually will type in. When I mash my German keyboard, I end up with a couple umlauts. That will explode in my face.

Alternatively, we could do some client side validation to ensure that the secret token is encodable as ASCII.

roncohen commented 7 years ago

I'm voting to stick with cleartext (under SSL of course) as long as we clearly document that when choosing a secret token, that it needs to be ascii/latin-1.

base64 will complicate things, as you already discussed there's a lot of value in being able to simply shoot of a curl

simitt commented 7 years ago

@beniwohli @watson if sticking to not using base64 there is no change to be implemented on apm-server side, since the agents should already take care of this. If ok with everyone I am going to close this issue then.

elastic / apm-server

Secret Token and ASCII encoding of headers #177