dolph closed this issue 6 years ago
We discovered this by attempting to run the OpenStack CI suite with Fernet. The result is a high failure rate in our gate (albeit transient; occasionally a job will pass). The only alternative solutions we've identified are either to implement a 1-second client-side wait before performing an operation that will result in a revocation event, so that the client doesn't attempt to create a new token "too fast," or to implement a similar 1-second wait on the service side before returning the HTTP response for any request that results in a revocation event (so that the client cannot proceed to create new tokens until the wait has elapsed).
The first solution is obviously objectionable to all our clients.
The second solution actually caused our CI job to time out due to the number of 1-second waits introduced to the test suite.
The lack of precision on the creation timestamp (currently 1 second) introduces the possibility for a race condition when evaluating tokens against revocation events. If you only care about TTLs when validating tokens, then this won't affect you.
For example, imagine an event occurring that might invalidate previously issued tokens, such as a password change or a logout operation, at t=0.1 seconds (note that outside the world of Fernet, we can measure time more accurately than whole seconds!). Then, the user immediately creates a new Fernet token at t=0.2 seconds. Unfortunately, that timestamp is only recorded with a precision of 1 second in the Fernet token and thus appears to have been created at t=0 (suddenly it was created before the revocation event occurred).

However, when comparing the token's rounded timestamp (t=0) to the event's timestamp (t=0.1), we're forced to use the least precise data we have, and thus we must compare the token's rounded timestamp (t=0) to a rounded timestamp of the revocation event (t=0). It thus appears that both events occurred at the exact same time (which is not true), and therefore we must mistakenly consider the token to be invalid, even though it was actually created after the revocation event occurred.

By increasing the precision of the timestamp in Fernet, you decrease the window of opportunity for this race condition to occur until it is not viable to reproduce from a client's perspective. With a one-second timestamp and application response times well under that, however, that window is currently relatively large.
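The scenario above can be sketched in a few lines of Python. This is only an illustration of the comparison, not Keystone's or any Fernet library's actual code: the `truncate` and `is_revoked` helpers are hypothetical names, times are expressed as integer microseconds to avoid floating-point noise, and the rule "a token is revoked unless it was provably issued after the event" is an assumption based on the description above.

```python
def truncate(t_us: int, precision_us: int) -> int:
    """Drop everything finer than `precision_us` (all values in microseconds)."""
    return t_us - (t_us % precision_us)


def is_revoked(issued_at_us: int, revoked_at_us: int, precision_us: int) -> bool:
    """Hypothetical check: compare both timestamps at the least precise
    resolution available, and treat a tie as revoked because the token
    cannot be shown to postdate the revocation event."""
    return truncate(issued_at_us, precision_us) <= truncate(revoked_at_us, precision_us)


REVOKED_AT_US = 100_000  # revocation event at t=0.1 s
ISSUED_AT_US = 200_000   # new token created at t=0.2 s

# Whole-second precision: both timestamps collapse to t=0, so the new
# token is wrongly treated as revoked.
print(is_revoked(ISSUED_AT_US, REVOKED_AT_US, 1_000_000))  # True

# Millisecond precision: the ordering survives and the token stays valid.
print(is_revoked(ISSUED_AT_US, REVOKED_AT_US, 1_000))  # False
```

At whole-second precision, any token created in the same second as the revocation event is affected. At millisecond precision, the token would have to be created within the same millisecond, which a client making HTTP round trips is unlikely to manage.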