Increase recovery codes entropy

Recovery codes are currently generated using Base32, so the alphabet they are drawn from has 32 characters (A to Z and 2 to 7). As each code is 10 characters long, the total number of possible codes is 32^10.

The entropy "x" is calculated as follows: x = log(32^10)/log(2) = 50. The entropy of the current codes is therefore 50 bits.
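
For reference, the same calculation can be sketched in Java. This is only an illustration: the class and method names are hypothetical and not part of java-totp, and it assumes every character is drawn uniformly and independently from the alphabet.

```java
// Hypothetical helper, not part of java-totp.
final class RecoveryCodeEntropy {

    // Entropy in bits of a code made of `length` characters, each drawn
    // uniformly and independently from `alphabetSize` possible characters:
    // log2(alphabetSize^length) = length * log2(alphabetSize).
    static double bits(int alphabetSize, int length) {
        return length * (Math.log(alphabetSize) / Math.log(2));
    }

    public static void main(String[] args) {
        // Current format: Base32 alphabet, 10 characters (about 50 bits).
        System.out.println(bits(32, 10));
    }
}
```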

This entropy is below the threshold in the NIST recommendations (below 64 bits of entropy, a rate-limiting mechanism is required): https://pages.nist.gov/800-63-3/sp800-63b.html
It is also below the ANSSI (French cybersecurity agency) recommendations (82 bits of entropy are required for a medium-strength password): https://www.ssi.gouv.fr/administration/precautions-elementaires/calculer-la-force-dun-mot-de-passe/

Because recovery codes MUST remain simple for the end user to read and enter, it is important to keep a simple character set and to avoid very long codes. Suggestion for an alternative recovery code implementation:

- use a larger set of characters that is still simple to read and enter: digits and lowercase letters from the Latin alphabet (36 possible characters in total)
- increase the code length from 10 to 16 characters
- continue to split the code into groups separated by dashes
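
A minimal sketch of what such a generator could look like follows. It is not the actual java-totp implementation or the incoming pull request: the class name, method name and constants are assumptions made for illustration, and the group size of 4 is taken from the example format used in this issue.

```java
import java.security.SecureRandom;

// Hypothetical sketch, not the java-totp implementation.
final class RecoveryCodeSketch {

    // 26 lowercase Latin letters + 10 digits = 36 characters.
    private static final char[] ALPHABET =
            "abcdefghijklmnopqrstuvwxyz0123456789".toCharArray();
    private static final int CODE_LENGTH = 16; // random characters, dashes excluded
    private static final int GROUP_SIZE = 4;   // assumed group size
    private static final SecureRandom RANDOM = new SecureRandom();

    static String generate() {
        StringBuilder code = new StringBuilder();
        for (int i = 0; i < CODE_LENGTH; i++) {
            // Insert a dash between groups, never at the start or the end.
            if (i > 0 && i % GROUP_SIZE == 0) {
                code.append('-');
            }
            // nextInt(bound) picks each index with equal probability.
            code.append(ALPHABET[RANDOM.nextInt(ALPHABET.length)]);
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate());
    }
}
```

The dashes are only a display aid and add no entropy, so a verifier could strip them (and lowercase the input) before comparing.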

With those settings the entropy would be log(36^16)/log(2) = 82.7 bits, which meets the security requirements above. The generated recovery codes would look like "phuo-1dm4-647k-8i2n". Dropbox, for example, uses this same format for its MFA.

It could be argued that 0 and 1 should be removed from the character set to prevent confusion with o and l/i when codes are read or entered. However, this would reduce the entropy to about 81 bits (log(34^16)/log(2) = 81.4), unless the length is increased to 17 or more. I think that with a proper font this is not required.
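
The trade-off between alphabet size and code length can be checked with a small sketch like the one below. The class and method names are hypothetical, and the 82-bit target is the ANSSI figure quoted earlier in this issue.

```java
// Hypothetical helper, not part of java-totp.
final class CodeLengthSketch {

    // Smallest number of characters needed to reach `targetBits` of entropy
    // when each character is drawn uniformly from `alphabetSize` characters.
    static int minLength(int alphabetSize, double targetBits) {
        double bitsPerChar = Math.log(alphabetSize) / Math.log(2);
        return (int) Math.ceil(targetBits / bitsPerChar);
    }

    public static void main(String[] args) {
        System.out.println(minLength(36, 82)); // full set: a-z and 0-9
        System.out.println(minLength(34, 82)); // same set without 0 and 1
    }
}
```

With the full 36-character set the 82-bit target is reached at 16 characters, while the 34-character set needs 17.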

(Incoming pull request)

samdjstevens / java-totp

Increase recovery codes entropy #22