improved unformat(), formatNumber() and created tests

millermedeiros commented 13 years ago

So far only created tests for unformat() and formatNumber(), can be a base for other specs. Cleaned a little bit the structure of those methods as well and added a few helper methods to simplify logic.

Also moved settings to the top of the file since it is easier to find it.

wjcrowcroft commented 13 years ago

Hey, thanks for the request - I'll check these out tomorrow and reply. Oddly enough I just finished writing tests for everything in QUnit based on the underscore.js test suite, but I'll check out the jasmine ones and perhaps include them both if needed.

Cheers!

millermedeiros commented 13 years ago

the advantage I see about using Jasmine is that the fact that you create sentences that describe what the code is supposed to do usually helps me to spot edge cases (since I have to think about it) - but is more a psychological thing that a technical difference..

cheers.

wjcrowcroft commented 13 years ago

I've been over all the proposed changes, there’s a few things I wanted to discuss:

I like the idea of extracting the parameter normalisation out of the API methods to avoid code re-use.

Just one thing - in your normalizeParams method, the params object is extended (where values are missing) by the default (base) object, but the logic you're using (params[key] = params[key] || base[key]) makes it impossible to pass in zero values or empty strings. Consider:

var params = {
    thousand: "",
    precision: 0 
};
var base = {
    thousand: ",",
    precision: 2
};

params['thousand'] = params['thousand'] || base['thousand']; // "," (should be "")
params['precision'] = params['precision'] || base['precision']; // 2 (should be 0)

The logical || operator coerces the value in params to false, and favours the value from base, which is wrong (passing in 0 or and empty string is perfectly reasonable in this case... this should probably use logic like this instead:

params[key] = typeof params[key] !== "undefined" ? params[key] : base[key];

I'd also possibly prefer to call this a standard defaults method (see _.defaults() in underscore.js), eg: config = defaults(config, settings.number); because it may be usable for other things than normalising parameters.

On that note - I'd suggest we keep the precision normalising out of this method, and move it back into the API methods, to keep the idea of a defaults object-extending method that could be reused elsewhere. Feel free to discuss!

In the pull request version, formatMoney runs 20-50% slower than the current versions. I threw together a quick jsperf test for it that includes both libraries and runs the same tests on each: http://jsperf.com/accounting-js-pull-request-8-speed-test

I think the performance slowdown is due to the modification to the way formatMoney does its formatting, and also (in the array tests) the use of the map helper method which carries the overhead of a function call on each formatting... but I may be wrong. Any ideas there as to whether it's the regex?

Regexes are pretty slow when it comes to capturing parts of a string. The old method was ugly, but potentially faster (but I'm very happy to be proved wrong, so please discuss).

I don’t know if we need the helper methods isType, isArray and isObject - I agree that typeof isn't reliable (I've actually been using the toType function, from the article you linked to, in some other projects), but I think the use case in this library isn't strong enough to warrant including these.

typeof array === "object" does seem a bit dumb - but until it creates more problems than it solves, I think it works well enough for accounting.js

I'd also question whether accounting.js truly needs its own map implementation, at this stage. I can definitely see it being useful in the future, but for now, it seems to me the for loops are fine for recursive array formatting and also seem like they would run faster (not calling an extra function on every iteration)

I like the moving of the settings to the top of the library, and I also like any method of cleaning up the way parameters are handled (using a config object)

Let's discuss how to bring some of these changes in - if you could make commits that fix up any of the issues above, that would be awesome.

Thanks again for all the submissions, happy to talk through any of the above.

wjcrowcroft commented 13 years ago

Hey, just an update - I've merged your commits into the dev branch and made a bunch of edits.

Included in the dev branch:

moved settings object to top of library
included the param normalization helper method, but called it defaults for use later in other contexts, and modelled it on underscore's _.defaults so it doesn't override empty values (added jasmine test for this, too)
kept the config object to store parameters (renamed it opts)
kept the precision normalisation helper method, but it's called form the API methods now, not from the defaults helper method
jasmine test suite (added alongside qunit tests, which include speed tests)

For now, I've taken out the following (happy to discuss adding them back in, but wanted to do it on a case-by-case basis with testing):

variable type-checking helper methods (sticking to typeof until it becomes a problem)
map helper function (sticking to for loops until it becomes a problem, for speed and size)
changes to formatNumber formatting method (the regex at line162) - still open to changing it, needs more testing first though.

Thanks for all the work on this - whole thing feels better already. Like I said, I'm happy to discuss including the things I took out, just want to do things step-by-step so I can keep a handle on how the code is growing and developing.

If you have time, please test out the dev branch and see if you can find any errors or omissions, then I'll merge it to the master branch hopefully tomorrow.

Cheers!

millermedeiros commented 13 years ago

yes, you are right. there area many things that I've done that decreases performance (ordered by importance):

RegExp w/ capturing groups is "slow".
normalizeParams() is slower than simple handling the params inside each function (create a new object, if/else inside a loop, property lookup to use values)
isType() needs 2 function calls (shouldn't be a huge concern tho).
map() (shouldn't be a bottleneck, specially in modern browsers)

I tend to favor clean code over performance most of the times, later if I find that there is a need to improve the performance it is easier to refactor and spot the bottlenecks if each concern is abstracted.

I would probably keep map() since it reduces code duplication (it could be also used inside formatTable() which I didn't touched) and abstracts away the loop logic and also keep the isType() checks since they are cleaner to read, but you are right about removing the RegExp. I also liked the defaults() and keeping the precision check outside of it.

Not sure when I will have more free time to contribute with the project but I will try to help whenever possible, maybe I can do something during lunch time today.

Cheers.

wjcrowcroft commented 13 years ago

Hey, thanks for the reply and explanations - I will probably bring the map() function back in in the next minor release, just going to do some tests on it. You're right about formatting tables - could save a lot of code with it.

Re: performance - yeah, I'm not really too concerned - 50000 ops per sec is still blindingly fast, if you think about it - but I'm definitely interested in keeping the library size down so bytes is a consideration, and I'm happy to explore more RegExp options if/when they become more suitable for the functionality.

That's a great article, by the way!

Thanks for your help so far, and yeah if you have any time to check in on the state of the library and see if there's anything you can help out with, I'd definitely appreciate it!

Joss

millermedeiros commented 13 years ago

I did some performance tests on Chrome and Firefox 6 and created 2 branches to test isType() and map():

map() doesn't affect performance that much on those browsers (less than 1%). I've also changed the way isArray() works to use native Array.isArray() if available.

isObject() on chrome is way slower than a simple typeof check (~12%), on Firefox 6 not that much.

PS: I liked the perf tests together with the unit tests, easier to refactor and still make sure everything is working.

wjcrowcroft commented 13 years ago

Thanks for branching these, just taken a look over them.

I put this inline but worth mentioning here - underscore.js has pretty solid methods for type checking:

  // Is a given value an array?
  // Delegates to ECMA5's native Array.isArray
  _.isArray = nativeIsArray || function(obj) {
    return toString.call(obj) === '[object Array]';
  };

  // Is a given variable an object?
  _.isObject = function(obj) {
    return obj === Object(obj);
  };

Especially isObject is pretty smart and simple, preferable to checking against "[object Object]" I think.

For isArray, underscore saves references to all the ES5 properties it would like to use if available:


  // All **ECMAScript 5** native function implementations that we hope to use
  // are declared here.
  var
    nativeForEach      = ArrayProto.forEach,
    nativeMap          = ArrayProto.map,
    nativeReduce       = ArrayProto.reduce,
    nativeReduceRight  = ArrayProto.reduceRight,
    nativeFilter       = ArrayProto.filter,
    nativeEvery        = ArrayProto.every,
    nativeSome         = ArrayProto.some,
    nativeIndexOf      = ArrayProto.indexOf,
    nativeLastIndexOf  = ArrayProto.lastIndexOf,
    nativeIsArray      = Array.isArray,
    nativeKeys         = Object.keys,
    nativeBind         = FuncProto.bind;

Good to hear map doesn't affect the performance, as it definitely keeps things cleaner. If you create a pull request for each branch I can probably auto-merge it and we can discuss more on there if needed.

Cheers!

millermedeiros commented 13 years ago

added a comment on the commit about why I favor toString.call() (like many other people also do)

the isArray() I'm already doing the same as underscore by coincidence (since I didn't checked underscore source code to implement anything)

will consider this thread closed and discuss on other pull requests.

cheers.

openexchangerates / accounting.js

improved unformat(), formatNumber() and created tests #8