pbeshai / tidy

Tidy up your data with JavaScript, inspired by dplyr and the tidyverse
https://pbeshai.github.io/tidy
MIT License

New summarizer: nWhere? #60

Closed (mhkeller closed this issue 2 years ago)

mhkeller commented 2 years ago

Would you be interested in a PR adding a new summarizer that I've found helpful? It's the same as n, but it takes a condition and counts only the items that meet it. So if you want to know how many elements in each group of your groupBy have a certain value, you can get that.

Simple JS implementation (I would convert it to TypeScript):

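// Returns a summarizer that counts the items for which `conditional` is truthy.
export default function nWhere(conditional) {
  return function nWhereFn(list) {
    // filter builds a temporary array; only its length is kept
    return list.filter(conditional).length;
  };
}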

Usage

import { tidy, summarize, n } from '@tidyjs/tidy';
// nWhere is the proposed helper defined above

const data = [
  { str: 'foo', value: 3 },
  { str: 'foo', value: 1 },
  { str: 'bar', value: 3 },
  { str: 'bar', value: 1 },
  { str: 'bar', value: 7 },
];

tidy(data, summarize({
  foos: nWhere(d => d.str === 'foo'),
  bars: nWhere(d => d.str === 'bar'),
  count: n(),
}));
// output:
// [{ foos: 2, bars: 3, count: 5 }]
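
Since nWhere returns a plain function of a list, the per-group case mentioned above can be tried without tidy at all. The following is only a rough sketch: the Map-based bucketing is a hypothetical stand-in for groupBy, and the names rows, groups, atLeastThree, and perGroup are made up for illustration.

```javascript
// nWhere as proposed above: count the items that pass `conditional`.
function nWhere(conditional) {
  return function nWhereFn(list) {
    return list.filter(conditional).length;
  };
}

const rows = [
  { str: 'foo', value: 3 },
  { str: 'foo', value: 1 },
  { str: 'bar', value: 3 },
  { str: 'bar', value: 1 },
  { str: 'bar', value: 7 },
];

// Hypothetical stand-in for groupBy: bucket the rows by `str` in a Map.
const groups = new Map();
for (const row of rows) {
  if (!groups.has(row.str)) groups.set(row.str, []);
  groups.get(row.str).push(row);
}

// Apply the same summarizer to each group's items.
const atLeastThree = nWhere(d => d.value >= 3);
const perGroup = Array.from(groups, ([str, items]) => ({
  str,
  atLeastThree: atLeastThree(items),
}));

console.log(perGroup);
// [ { str: 'foo', atLeastThree: 1 }, { str: 'bar', atLeastThree: 2 } ]
```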

pbeshai commented 2 years ago

I've made this helper myself too, but called it countIf. I think nWhere is nicer, since it doesn't conflict with the summarizeIf style of things. My implementations avoid creating another array by using sum with a predicate, which is more SQL-style in a way. I'd be happy to add these as nWhere and sumWhere if you're up for it!


// sum is tidy's built-in summarizer
import { sum } from '@tidyjs/tidy'

export function countIf<T extends object>(predicate: (d: T) => boolean) {
  return (items: T[]) => sum((d: T) => (predicate(d) ? 1 : 0))(items)
}

export function sumIf<T extends object>(
  predicate: (d: T) => boolean,
  key: keyof T | ((d: T) => number)
) {
  const keyFn =
    typeof key === 'function' ? key : (d: T) => d[key] as unknown as number

  return (items: T[]) => sum((d: T) => (predicate(d) ? keyFn(d) : 0))(items)
}
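
To see the sum-plus-predicate idea run on its own, here is a minimal JavaScript sketch of the two helpers above. The local sum function is a hypothetical stand-in that only handles accessor functions, not tidy's actual implementation, and the sample rows are reused from the earlier usage example.

```javascript
// Hypothetical stand-in for tidy's sum summarizer (accessor functions only).
function sum(accessor) {
  return (items) => {
    let total = 0;
    for (const d of items) total += accessor(d);
    return total;
  };
}

// Count matches by summing 1 for each item that passes the predicate.
function countIf(predicate) {
  return (items) => sum((d) => (predicate(d) ? 1 : 0))(items);
}

// Sum a key (or accessor) over only the items that pass the predicate.
function sumIf(predicate, key) {
  const keyFn = typeof key === 'function' ? key : (d) => d[key];
  return (items) => sum((d) => (predicate(d) ? keyFn(d) : 0))(items);
}

const rows = [
  { str: 'foo', value: 3 },
  { str: 'foo', value: 1 },
  { str: 'bar', value: 3 },
  { str: 'bar', value: 1 },
  { str: 'bar', value: 7 },
];
const isBar = (d) => d.str === 'bar';

console.log(countIf(isBar)(rows)); // 3
console.log(sumIf(isBar, 'value')(rows)); // 11
console.log(sumIf(isBar, (d) => d.value * 2)(rows)); // 22
```

Both helpers walk the list once and never allocate the intermediate array that list.filter(conditional).length does, while producing the same count.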

mhkeller commented 2 years ago

Sounds good – go for it!

pbeshai commented 2 years ago

I ended up implementing this as a predicate option on sum and n instead, which matches how a couple of other places in the library work. It's available in v2.5.0.

import { tidy, summarize, n } from '@tidyjs/tidy';

const data = [
  { str: 'foo', value: 3 },
  { str: 'foo', value: 1 },
  { str: 'bar', value: 3 },
  { str: 'bar', value: 1 },
  { str: 'bar', value: 7 },
];

tidy(data, summarize({
  count: n(),
  countFoo: n({ predicate: d => d.str === 'foo' })
}));
// output:
// [{ count: 5, countFoo: 2 }]
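
The example above only shows the option on n. As a rough illustration of how an options-based predicate could work for both summarizers, here is a self-contained sketch. nSketch and sumSketch are hypothetical stand-ins, not tidy's source, and the sum(key, { predicate }) call shape is an assumption inferred from the comment rather than something shown in this thread.

```javascript
// Hypothetical stand-in for n: counts all items, or only those passing
// options.predicate when one is given.
function nSketch(options = {}) {
  const { predicate } = options;
  return (items) => {
    if (!predicate) return items.length;
    let count = 0;
    for (const d of items) if (predicate(d)) count += 1;
    return count;
  };
}

// Hypothetical stand-in for sum: same idea, adding up a key or accessor.
function sumSketch(key, options = {}) {
  const { predicate } = options;
  const keyFn = typeof key === 'function' ? key : (d) => d[key];
  return (items) => {
    let total = 0;
    for (const d of items) {
      if (!predicate || predicate(d)) total += keyFn(d);
    }
    return total;
  };
}

const rows = [
  { str: 'foo', value: 3 },
  { str: 'foo', value: 1 },
  { str: 'bar', value: 3 },
  { str: 'bar', value: 1 },
  { str: 'bar', value: 7 },
];
const isFoo = (d) => d.str === 'foo';

console.log(nSketch()(rows)); // 5
console.log(nSketch({ predicate: isFoo })(rows)); // 2
console.log(sumSketch('value')(rows)); // 15
console.log(sumSketch('value', { predicate: isFoo })(rows)); // 4
```

Putting the condition in an options object keeps the unconditional calls unchanged, so no separate nWhere or sumWhere names are needed.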