barrysteyn / node-scrypt

Scrypt for Node
370 stars 88 forks source link

Scrypt Version 5 #72

Closed barrysteyn closed 9 years ago

barrysteyn commented 9 years ago

Scrypt Version 5.0

@fresheneesz @ironwallaby @BrandonZacharie

This issue is to address #67, #68, #69, #70 and #71. Its outcome is to produce a consistent API that is "JavaScripty". It will not be backward compatible (hence the major version increment). This ticket is here for us as a community to discuss and have a say regarding the future of this module. As such, I will not start developing until I feel that there has been an adaquate debate. At the very least, I will leave this up for at least a week. The outcome of this issue will be many "children" issues that I will use to alter the module.

What does everyone think of putting the new design in a wiki so we can all have input?

It will consist of the following:

Drop legacy support

The first version of the library is so old it really does not need to be supported anymore. Seeing as version 5.0 will not be backward compatible with previous versions, this is a good time to drop support for the previous version:

A global configuration object will be dropped. This will produce a more consistent and expected result when modules are cached by reducing side effects.

Callback functions consistent

All callback functions will have the following consistent signatures:

function(errObject, result)

All errors objects will be of type Error.

Throwing Exceptions

When dealing with asynchronous callbacks, this can get a bit hairy. Looking at this document, it proposes that programmer errors are thrown as exceptions in asynch (or normal sync) functions. Async is normally where the debate comes in.

Any programmer error will be thrown as an exception (even in the async case). For async, we don't want to start the infrastructure of an async call when we already know that there is an error. Rather, throw an exception immediately.

Error descriptions

Currently, it has been pointed out that the error descriptions returned are obtuse. This is because the errors are trying to mimic the original Scrypt C library. I am up for suggestions from the community for making them easier to understand.

Buffers

One crucial step towards the above goal will be for all inputs (keys, hashes, salts, outputs) to be of type Buffer. This will remove the need to configure an output's transcoding. Speaking of transcoding, lets just leave that to the Buffer in the end (but document for people to understand quickly). If a string is presented instead of a buffer as an input, it will automatically be cast as a buffer.

Params Object

The params object is an attempt for Scrypt's internal config to be altered in a human "understandable" manner. As it stands now, I think it is okay. So I am going to leave it be

Hash

The hash function will keep the same API, but as mentioned above, there will be no config for setting key and output encoding. Instead, both key and output will be of type buffer:

hash(key, scrypt_parameters, callback_function)

Verify

The verify function will be as it is now, except no config (all inputs assumed to be buffer)

KDF

The key derivation function is where most of the changes will occur. Without a global object, everything can be assumed to be of type Buffer. The KDF api is:

kdf(key, scrypt_parameters, outputLength, salt, callback_function)

Currently, if no salt is provided, then a 32 bit salt will be randomly generated. The size of this salt can be set in the global config (it defaults to 32 bits). Since we are removing this global object, I am up for suggestions regarding how this default salt size should be set. One option is to have a default size as an input parameter. This will be ignored if a valid salt object is presented. Again, suggestions are welcome.

Improve Verify function

It has been pointed out that the verify function is not consistent. #71 states that the verify object throws an exception instead of returning false. I am sure that this was due to a programmer error (as mentioned above, these exceptions are meant to be thrown). Still, it is confusing. The verify function will therefore aim not to throw any exceptions and instead, just return a boolean.

Alter Hash length

Should we give the ability to alter the length of the hash? Currently, the has is constrained to a 32 bit HMAC. Perhaps we should allow an increase in size? I don't know if this feature is a needed or wanted, so lets discuss here.

Change object "functors" to real functions

As pointed out in #68, the functions are not real functions, instead you can think about them as objects with the () operators overloaded. I will change these to proper functions so that there are not surprises

What else?

Please provide any other feedback and discussion here. If I have left anything out, then please shout out.

ghost commented 9 years ago

Yay!

Otherwise, the proposed changes look good to me so far.

barrysteyn commented 9 years ago

What about how the default salt size is handled in KDF?

ghost commented 9 years ago

The simple option is to pass in an "options" object:

kdf(key, scrypt_parameters, [options,] callback_function)

The options object is optional. "outputLength", "salt", and "saltLength" are all valid options to appear in that object, and sane defaults are taken if they are not present.

tellnes commented 9 years ago

I just want to confirm we are on the same page on throwing errors.

Programing errors should be thrown immediately, but if that is not possible, the error should be given to the callback. In other words, throwing should only happen synchronously.

Eg. this is ok:

exports.verify = function() {
  throw new TypeError('Callback is required')
}

while this is not:

exports.verify = function() {
  process.nextTick(function() {
    throw new TypeError('Callback is required')
  })
}
ghost commented 9 years ago

@tellnes +1

fresheneesz commented 9 years ago

error descriptions

I'd do this as best effort. We should make a cursory attempt to figure out what internal errors can happen, and modify their messages to be more understandable when necessary. Any ones we missed we can address if someone comes across them organically. As a general note about errors, I just want to mention that if any error message is not a constant string (like you put a piece of the user's input in the message or something) then a constant property should be added to the error. This is so that if someone needs to do different things depending on the error message, they can reliably programatically detect which error it is.

I agree with ironwallaby, errors should strongly mimic C error constant names, and other things that would make it easier for people to search for the error. We should add to those error strings to make them understandable where possible tho.

I also agree with tellnes. We shouldn't be throwing errors directly into the scheduler.

If a string is presented instead of a buffer as an input, it will automatically be cast as a buffer.

Strings will always be utf8 encoded right? If there's any possibility someone programming correctly could end up with strings in different encodings, automatically casting might not be the right thing to do. I'm fairly confident that you'd have to incorrectly decode a buffer to get a non-utf encoded string, tho - so this should be ok.

As it stands now, I think [the params object] is okay

I agree - I like that it provides a nice understandable interface to generating the standard scrypt parameters. It also gives a way for people to play around with generating the scrypt parameters to understand them better if they want to.

The key derivation function is where most of the changes will occur

KDF is essentially an internal API right? The docs currently say they're exposed for experienced users using it for other business logic. Because its intended only for experienced users, and there's no reason for someone using basic scrypt to use it, I would suggest not having any optional parameters - make everything explicit.

Alter Hash length

If the scrypt spec allows for different sized HMAC lengths, I would vote for supporting that. I'd vote that this library be as general an implementation of the spec as possible, meaning that it should allow anything not disallowed by the spec.

Additional recommendations

drop synchronous support

Synchronous stuff blocks the thread - which is the only thread your javascript interpreter gets. I think we should completely drop synchronous support, because otherwise newbs will use it not knowing that it completely destroys performance. If people want a synchronous API, the they should use something like node-fibers which allows for synchronous coding without blocking the thread.

Not only is this in line with the primary advantage node has (its superior performance in handling asynchronous actions), but it also makes the API more consistent. If we drop synchronous support, we should have every error reported via the first callback argument (no throws).

ghost commented 9 years ago

@fresheneesz I also agree that if synchronous support can be dropped, it should be dropped. It is never a good idea in Node, even if there are (rare) use cases where it is convenient.

barrysteyn commented 9 years ago

Hi guys. Believe it or not, I am the use case of someone that needs synchronous support. I think it comes in really handy when you are messing around and trying to model things. I suspect that any real production usage would not need this, but I'd like to leave it in there for people like me who are not using this on production.

barrysteyn commented 9 years ago

@fresheneesz It turns out the KDF is probably more used than the hash function. The KDF is essentially the Scrypt algorithm. Any application that needs this logic to build things upon (for example, Litecoin) needs to use it. The hash really just wraps the KDF in a HMAC. Anyone who needs to use a pure cryptographic hash function that is Scrypt will be using the KDF.

barrysteyn commented 9 years ago

@ironwallaby I have no idea why I was so against passing config objects into these functions. But it makes total sense (and it can be easily set to documented default values so that it could be optional). I would actually be inclined to go down this route. In fact, I would imagine that we could even keep the API identical to the current version. This brings up the question: Is there anything that people inherently do not like regarding the current API function signature (assume though that there is no global config and buffers are used for all input).

@fresheneesz @tellnes @ironwallaby Would be good to have your input.

fresheneesz commented 9 years ago

I am the use case of someone that needs synchronous support

I can agree that synchronous methods are convenient for messing around with things in a REPL, but you could say that about any asynchronous function. Should every function have a synchronous version cause its convenient in a REPL? I'd say that's not a sufficient reason. If you're not in a REPL, you could utilize node-fibers to turn any asynchronous function safely into a synchronous ones that don't stop the world.

Is there anything that people inherently do not like regarding the current API function signature

I'd say with the changes you mentioned, i'm mostly fine with the current API.

ghost commented 9 years ago

Regarding keeping the current API but with options arguments with sane defaults: that'd work for me, too.

barrysteyn commented 9 years ago

Hi all - please can you look at #73 and make some comments. I will start on this ticket over the weekend.

alex94cp commented 9 years ago

Hi everyone, I'm copy-pasting this from #75 so that we can benefit from more feedback:

The current behaviour of scrypt#verify(async) of triggering an error if password mismatch is not coherent with the sync version. I strongly think that scrypt#verify(async) should not invoke the callback with an error in this case.

An error should be triggered in the case that the password verification has not been able to take place, and therefore information about whether password matches or not is basically undefined. Consider, for example, what would happen if an out of memory error were triggered from scrypt (I don't know if it does, it's just an example): I'd have to check err.scrypt_err_code for each particular error code to know if password match info is usable or not; now adding an error to scrypt would make it forward-incompatible (would need a new major version) and my application would be forever tied to a particular scrypt version.

Even from a design point of view, scrypt shouldn't impose to its consumers that a password mismatch should be an error. It MAY be an error in the consumer application, and it would treat is as such like this:

scrypt.verify(pwHass, passGiven, function(err, matches) {
    if (err)
        return cb(err);
    if (!matches)
        return cb(new Error('password mismatch'));
    return cb(null, userInfoOrSomething);
});

But it may be not an error, like in my use case, where a password mismatch does not return an error, and instead triggers an onAccessDenied callback, where the user decides what to do (ie: treating it as anonymous). With the current design, I'm forced to use scrypt like this:

scrypt.verify(pwHash, passGiven, function(err, matches) {
    if (err && err.scrypt_err_code !== 11)
        return cb(err);
    return matches ? onAccessGranted(userInfo) : onAccessDenied(credsGiven);
});
barrysteyn commented 9 years ago

@alex94puchades That sounds like a good suggestion. @ironwallaby @tellnes @fresheneesz What do you guys think?

Also, while everyone is listening :), I aim to get version five out the door by June 15th.

ghost commented 9 years ago

I agree: I do not consider that a password mismatch to be an error. (An error should be an exceptional condition when something goes very wrong; password mismatches are commonplace and should be considered so!) Returning an error (for exceptional problems) and a boolean (for, when no error occurred, whether the password verifies or not) is the API I'd like to see as well.

fresheneesz commented 9 years ago

I definitely agree that the verify function should return (to the callback) a boolean of whether or not it matches. I actually thought that was already part of this next release.

barrysteyn commented 9 years ago

Yes, I already thought of that. You can definitely count on that feature being present. In fact, promises are what make node worth while for me. On May 24, 2015 9:52 PM, "Michael Phan-Ba" notifications@github.com wrote:

For synchronous support, it would be nice to return ES6 Promises (when available) if no callback is given. This would enable support for co https://github.com/tj/co and support ES7 await https://github.com/lukehoban/ecmascript-asyncawait in the future.

— Reply to this email directly or view it on GitHub https://github.com/barrysteyn/node-scrypt/issues/72#issuecomment-105125104 .

tellnes commented 9 years ago

+1 for wrong password not being an error.

barrysteyn commented 9 years ago

Hi All

I just want to give everyone an update, and also ask for some advice. First an update: I am making good progress. I hope to have this finished by 16 June, but if that does not happen, then I will only be able to work on it again in 1 July (going on vacation for two weeks).

With regards to async/sync, I am definitely going to have two different versions of the functions, the one names Sync for synchronous (this is due to popular demand). For example:

Async: params(...function {});
Sync: paramsSync(...)'

Also, promises are amazing. Seeing as this is a total rewrite, I am going to make the whole library promise aware. So params could be also be invoked like this: params(...).then(...) - you guys get the point. I have used promises quite extensively with scala, but I have not used them that much with JavaScript. So is there any type of convention that I am missing?

Now for some collective discussion. When I originally started this library ages ago, I made a mistake in the naming which has been bothering me ever since. Changing it will break lots of existing things and cause confusion. Nevertheless, I think it is time to do it. You see, the KDF function stands for key derivation function, but actually, this is not correct. What this is currently doing is returning the raw scrypt hash, and should actually be called hash. In fact, what is currently called hash is actually the key derivation function and should be renamed to KDF. So I want to swap names. There will obviously be documentation that points this out.

What do you guys think?

ghost commented 9 years ago

Fine by me. This is already a major version bump, so breaking backwards-compatibility is reasonable.

fresheneesz commented 9 years ago

@barrysteyn Looks good. Definitely use a promise/futures library rather than creating one (if that was what you might have been thinking). Make sure to use one that has a 'done' method, since that's important for not losing unexpected errors.

tellnes commented 9 years ago

I'm not (jet) a promise user and personally don't want it. But I see that it is a demand for it and is not going to argue against it. Async/await which comes in es7 is probably going to change my mind about it.

But, please don't use require("es6-promise").polyfill();. It polyfills the global environment and not just your library.

Also, please consider that you are basically adding a new dependency to your library just to create some sugar.

alex94cp commented 9 years ago

Why use typeof Promise != "undefined" and not simply Promise !== undefined? Isn't comparing strings slower?

BrandonZacharie commented 9 years ago

I agree with @tellnes but believe promises are a bit too much for this library for many of the same reasons Ryan removed them in the Node v0.2 era. Many, if not most, of us just use passwordHash and verifyHash in a straight forward way. Personally, I prefer this library remain minimalist and leave the sugar to user-land; it's easy enough for users to promise-ify things themselves or create a library that does so.

barrysteyn commented 9 years ago

Hi Guys

@BrandonZacharie Promises are not sugar, infact, without them, I would have the exact same reaction as TJ Holowaychuck did. I promise you (pun definitely intended), they make life a lot easier.

That being said, I don't intend to use a promise library, because I intend this to only work JavaScript engines that have implemented ES6. So promises would only work on IOjs until it gets merged into Node. @fresheneesz was this the reason you asked me to use a library - were you scared I was going to roll my own?

fresheneesz commented 9 years ago

@barrysteyn rolling your own would be the wrong way to go - yeah. Given that node.js doesn't yet have ES6 promises, and lots of people still use versions of node that aren't the latest, I'd disagree with requiring ES6 for this. Like Brandon said, its easy enough to promise-ify your standard node errbacks.

Jumping the gun and requiring the newest version of node/IO.js, or worse, requiring people to wait for a new version of node, will cripple adoption of this module. The agenda for this module should be a good interface for scrypt - the agenda should not be to push for the latest and greatest control-flow standard.

I'm a big supporter of futures, they're better than errbacks in every single way except for support. Once they're better in every single way, unqualified - then requiring them becomes fine.

barrysteyn commented 9 years ago

Why can't we just detect if the feature is available. That was my plan all along, I did not intend forcing a cutting edge version of scrypt on anyone.

fresheneesz commented 9 years ago

I'm fine with that, tho I don't see much of a reason for branching. What do we gain by using native promises over the es6-promise library? Negligible performance gains? Regardless, its not worth arguing about - that solution is perfectly fine as far as I'm concerned.

barrysteyn commented 9 years ago

Es6 promises is where the language must go. With that in mind, this library is intended for when all these polyfills are redundant. For my own projects, I'm using iojs and promises are the only way I do async. I expect the adoption rate to sky rocket, specially when node has promises natively which will be sooner rather than later.

Lastly, I want to promote promises and get people using iojs or pushing for node to adopt promises natively. Allowing scrypt to use promises is my way of promoting this.

I'm fine with that, tho I don't see much of a reason for branching. What do we gain by using native promises over the es6-promise library? Negligible performance gains? Regardless, its not worth arguing about - that solution is perfectly fine as far as I'm concerned.

alex94cp commented 9 years ago

@barrysteyn I'm perfectly fine with scrypt pushing on the Promises front. However, I think scrypt should be usable before ES6 comes along. Main reason being that this rewrite will be a REALLY needed interface improvement for scrypt, and we can't tie this very basic requirement with ES6, which is, in this case, purely cosmetic. I mean, how complex can code using scrypt be? One callback?

I present a possible solution to make all of us happy: given that you already have settled on providing separate sync/async functions, could you please consider returning a promise ONLY IF no callback given? I think this is the approach all promise-aware libraries should go for. The end-user should be given the freedom to choose between promises flexibility and manageability and good-old callback style. Bear in mind promises are not the only possible flow-control paradigm, scrypt shouldn't try to impose neither, even if with a laudable purpose.

barrysteyn commented 9 years ago

Yes, this was my intention from the beginning. I believe the module is then considered promise aware

alex94cp commented 9 years ago

Clarified then, I thought you were for a promise-only interface. +1

barrysteyn commented 9 years ago

I really need to do a better job explaining myself :)

barrysteyn commented 9 years ago

Also, even though the call back for scrypt is rather simple, the thing about promises (besides the fact that you can do sane error handling) is that they can be chained together. On it's own, scrypt would probably not be too enhanced by making it accept promises. It's when there is a need for it to be chained with other promises that this will be useful.

That being said, the module will be designed to gracefully degrade to normal callbacks should promises not be available (I think I need to emphasize this fact)

BrandonZacharie commented 9 years ago

If I understand correctly, we will, for example, have 4 signatures for scrypt.kdf.

static func kdf(key: String, parameters: KDFParameters) -> Promise { ... };
static func kdf(key: String, parameters: KDFParameters, options: KDFOptions) -> Promise { ... };
static func kdf(key: String, parameters: KDFParameters, callback: KDFCallback) -> Void { ... };
static func kdf(key: String, parameters: KDFParameters, options: KDFOptions, callback: KDFCallback) -> Void { ... };

This looks fine by me. I would like to add that, with ES6 down the street, I don't see a need to force the issue with offering promises to those without native support; if they really want it, they will turn it on, switch to io.js, or upgrade in the near future.

tellnes commented 9 years ago

I'll just want to add a reference to the statement / stance from io.js TC on exposing a Promises API [1].

https://github.com/nodejs/io.js/blob/dcf73cafddad745bc7833446ac9075f52dca194d/doc/tc-meetings/2014-12-10.md#statement--stance-from-tc-on-exposing-a-promises-api

I'm not opposed to adding a Promises API, but if it was my module I would have waited. My proposal is to implement it, but document it as experimental.

barrysteyn commented 9 years ago

Hi Christian

Sorry, I seem to be failing at communicating for this module (happens when you have full plate). It is my intention (for now) to label the promise as experimental

tellnes commented 9 years ago

That sounds great. I could have mentioned it, but I have a full plate myself, so I could have missed it also.

barrysteyn commented 9 years ago

Hi All

Apologies for missing the deadline. As a rule, I try to keep my personal life separate from professional/github life, but this time, I am gonna let it slip. You see, while I reside in North America, I am marrying a Spanish girl, and we are in Spain currently trying to organize the wedding for next year. If anyone thinks coding is tough, let me go on record and say that you have not tried organizing a wedding yet. I am back in North America on 1 Jul. I estimate that it will take me another 15 days after that to get version 5 out the door, so lets say mid Jul.

Thanks for putting up with this, but the good news is that I am already quite far, and I will try push to a dev branch to get some feedback (although this will only happen when I get back).

barrysteyn commented 9 years ago

@ironwallaby @tellnes @fresheneesz @mikepb @alex94puchades

Version 5 has just been pushed to a branch appropriately called v5. It has everything except promises (that will come soon). Also, tests and documentation need to be finished. But it is in a workable state and I would like feedback from you guys. Please note that KDF does what Hash did in version 4 (and vice versa).

Okay guys, lets get some feedback going :)

tellnes commented 9 years ago

Just from the readme. I see that you have named the methods with first char as uppercase. That indicates that they are constructors and should be called like this:

var verifier = new Scrypt.Verify(KDF, key, function(err, obj) {})
// Do something with `verifier`
barrysteyn commented 9 years ago

Hi @tellnes

Just pushed an update to v5 branch. I have changed it to lower case for all function calls. Thanks for keeping me legit (as you probably guessed, I do not code in JavaScript every day). If you had compile issues, this new version should work.

tellnes commented 9 years ago

I think the swapping of kdf and hash is going to create some confusion. That needs a big warning in the top of the readme with an explanation.

Also, the old passwordHash and verifyHash which now gets removed should get some documentation. Maybe something like this with the header How to use this library for storing and verifying passwords.

const scrypt = require('scrypt')
const scryptParams = scrypt.paramsSync(0.1)

function updatePassword(user, password, cb) {
  scrypt.kdf(password, scryptParams, function(err, kdf) {
    if (err) return cb(err)
    user.password = kdf
    saveUser(user, cb)
  })
}

function verifyPassword(user, password, cb) {
  scrypt.verify(user.password, password, cb)
}

Btw, I really like that I don't need to check the error argument any more for incorrect password.

barrysteyn commented 9 years ago

@tellnes

Regarding the swapping of kdf and hash, I mentioned this sometime ago. It was agreed that since this was a major version update, it could break previous versions and so it was okay (as long as it was documented). Regarding the documentation, it is hardly finished. That will happen at the very end (and I may ask for some help to move things quicker). I like your idea regarding a section on how to use things.

barrysteyn commented 9 years ago

Hi everyone

As of this morning, the module is now promise aware. If you guys could look at it and test it, I would be very grateful. The promises will work like so:

  1. If using an async function, and no callback is defined, then if promises are available, return a promise.
  2. If there is a callback present, then treat the function like a normal async with a callback, even if promises are available.
  3. If promises are not available, and the async function is called without callback, then throw an error.

The plan is to leave this version up here for another week. If everything is good, then I am going to compile documentation and release. PLEASE PLEASE _PLEASE_ give me feedback if you notice anything that you are unhappy with. If you give me feedback at the last minute or after I have released, it will take much longer to correct.

Thanks everyone

tellnes commented 9 years ago

What about starting with releasing a release candidate to npm?

barrysteyn commented 9 years ago

@tellnes Is that wise? I would think releasing to npm means its a final product? Unless there is a way to tag it as a release candidate...

tellnes commented 9 years ago

Yes, you can tag the release when you are publishing.

I propose you call the first version 5.0.0-rc1 and tag it as next.

barrysteyn commented 9 years ago

Thanks @tellnes. For this to happen, I have to at least just add a little bit to the documentation (or even better, try complete the documentation).