Closed ajax333221 closed 12 years ago
Although you're passing the sorted flag for an array that's not sorted, there's clearly a problem, as even _.uniq(arr.sort(), true) produces the wrong result.
so does the following:
> var arr=[];
arr[2]=null;
arr[3]=null;
arr[8]=2;
arr[10]=2;
arr[12]="2";
arr[11]=5;
arr[14]=5;
arr[16]=8;
arr[19]=8;
arr[26]=undefined;
arr[29]=undefined;
arr[33]="hi";
> _.uniq(arr.sort(), true)
["2", 5, 8, "hi", null]
this seems to be because it's using the wrong comparison operator on line 377
@davidarkemp: According to section 15.4.4.11 of the ES 5.1 spec, the sorting behavior for sparse arrays is implementation-dependent, so it's quite possible (and permissible) that arr.sort()
will produce inconsistent results. Furthermore, IE <= 8 treats undefined
elements in arrays as elisions; as such, 26 in arr == false && 29 in arr == false
. It's as if those values were never initialized: try executing _.uniq(arr)
in IE <= 8 without passing the isSorted
flag, and you'll see that the result is [null, 2, 5, 8, "hi"]
; it should be [null, 2, 5, 8, undefined, "hi"]
.
@ajax333221: null
is not included in the results because the sorted _.uniq
algorithm does not expect to be invoked on sparse arrays. Let's walk through the first five steps of what will occur after _.uniq(arr, true)
is called:
memo
array (via _.reduce
) to store the previously-enumerated values, and an empty result
array to hold the unique values.0 == i
is false, as arr
is sparse and does not contain a value at i == 0
. Continue the reduction.arr
is initialized at i == 2
(hence, 0 == i
is false).isSorted === true
, so the comparison _.last(memo) != el
is performed. You'd expect the result of this comparison to be true
; it's actually false
. memo
is empty, so _.last(memo) == undefined
, but arr[2] == null
. undefined == null
, so undefined != null
is false.Result(4)
is false
, neither memo
nor result
are updated.This is why result
will be [2, 5, 8, undefined, "hi"]
in all browsers except IE <= 8. The latter is subject to the sparse array bug that I mentioned above, so the (egregiously incorrect) result will be [2, 5, 8, "hi"]
.
The solution, unfortunately, is to avoid sparse arrays and arrays containing undefined
values entirely...even if you don't pass the isSorted
flag (_.uniq(arr)
), you'll still receive inconsistent results in IE.
I have created my own delDups, and it is notably faster than the underscore uniq()
http://jsperf.com/undersc-uniq
And best of all, it works like I expect...
I think they should consider rebuilding their function.
I am a newbie at JS, I have less than a year using it. So I am sure my function can be improved even more:
function delDups(arr,bol){
var i=0,ien=arr.length,prv,rtn=[];
if(ien>0){
if(ien<3||bol){
for(;i<ien;i++){
if(i in arr){
if(arr[i]!==prv||rtn.length==0){
rtn.push(arr[i]);
prv=arr[i];
}
}
}
}else{
for(;i<ien;i++){
if(i in arr&&!inArray(arr[i],rtn,true)){
rtn.push(arr[i]);
}
}
}
}
return rtn;
}
The older function I had was this:
function delDups(arr,bol){
var i=0,ien=arr.length,rtn=[];
if(ien>0){
if(ien<3||bol){
for(ien--;i<ien;i++){
if(arr[i]!==arr[i+1]){
rtn.push(arr[i]);
}
}
rtn.push(arr[i]);
}else{
rtn.push(arr[i]);
for(i=1;i<ien;i++){
if(!inArray(arr[i],rtn,true)){
rtn.push(arr[i]);
}
}
}
}
return rtn;
}
The second is like 1/3 faster, but it will fail on sparse arrays.
So you guys are free to use my functions, I am donating them to underscore because I love underscore
PS: if(!inArray(arr[i],rtn,true)) is exactly the same as rtn.indexOf(arr[i])==-1
Okay, I've submitted a pull request to correct the bug. I suspect that one of the reasons that _.uniq()
is so slow is because it supports an optional iterator
function, and effectively must keep track of two lists: one with the values modified by the iterator, and the other with the results from the original.
Incidentally, you can make your delDups
function even more concise, if you want:
function delDups(arr, bol) {
var i = 0, len = arr.length >>> 0, prv, rtn = [];
if (len) {
for (; i < len; i++) {
if (i in arr && ((len < 3 || bol ? arr[i] !== prv || !rtn.length : !inArray(arr[i], rtn, true))) {
rtn.push(prv = arr[i]);
}
}
}
return rtn;
}
@kitcambridge it looks very neat.
@ajax333221 Whoops! Sorry about that; nice catch!
@kitcambridge The appropriate response is:
@jdalton Hangs head in shame.
Whoops -- wrong ticket number.
Merged the fix, though, earlier.
Problem?: The first 'null' value from the beginning was removed
Notes: This also happens if you start with 'undefined' values (and probably more)