Comparative benchmarks - GitHub Issues

aleclarson commented 3 months ago

Although performance isn't the only way we're competing with Lodash, it'd be great to have perf comparisons with Lodash and other similar libraries wherever we cover the same use cases.

We don't need to compare ourselves with FP libraries, since we're not actually competing with them. We don't need to compare with Underscore, since it's legacy at this point. Any compared libraries should have 1K+ stars on GitHub (maybe more).

MarlonPassos-git commented 3 months ago

Are you thinking of comparing specific functions (for example, a section in the clone function's documentation that compares it with the clone functions from other libs), or of something more general? Also, would this live in a .md file, inside the README, or on a documentation page?

aleclarson commented 3 months ago

@MarlonPassos-git The idea you're describing might deserve its own discussion, but I am referring to performance benchmarks. That means adding bench calls to our benchmarks for competing libraries with lookalike functions.

crishoj commented 3 months ago

@aleclarson Is it something like this you have in mind?

(Apologies for the unflattering example 😅)

https://gist.github.com/crishoj/a6396844f88e212e911893b49b5c54de

aleclarson commented 3 months ago

Hey @crishoj, thinking about it now, I would prefer all of the comparative benchmarks to be kept in one module, rather than in the function-specific benchmark files (which are intended to detect perf regressions, so they will run on any PR that modifies that particular function).

Also, I wonder if we shouldn't leave Radash out of the comparisons, since it's unmaintained. 🤔

MarlonPassos-git commented 3 months ago

For me, it really doesn't make sense to keep benchmarks for libs that are not being maintained (Radash or Underscore). The ones that come to mind are:

ramda
lodash
immutable-js (have some similar functions)

crishoj commented 3 months ago

Perhaps the bench helper could have an option to choose whether to run comparative implementations or only radashi.
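A minimal sketch of what such an option could look like. The `selectLibraries` helper and its `comparative`/`primary` options are hypothetical names made up for illustration; nothing like this exists in Radashi today.

```typescript
// Hypothetical helper: decide which libraries a benchmark run should cover.
interface SelectOptions {
  /** When false, only the primary library is benchmarked. */
  comparative: boolean
  /** Name of the library that is always included. */
  primary: string
}

function selectLibraries<T>(
  libs: Record<string, T>,
  options: SelectOptions,
): Record<string, T> {
  // Comparative run: keep every registered library.
  if (options.comparative) {
    return { ...libs }
  }
  // Regression-only run: keep just the primary library.
  const selected: Record<string, T> = {}
  for (const [name, lib] of Object.entries(libs)) {
    if (name === options.primary) {
      selected[name] = lib
    }
  }
  return selected
}
```

A bench helper could call this with something like `{ comparative: process.env.COMPARE === '1', primary: 'radashi' }`, where `COMPARE` is an assumed environment variable name, and then loop over the result when registering benches.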

Perf tracking in CI would be great. Something along the lines of https://github.com/benchmark-action/github-action-benchmark

aleclarson commented 3 months ago

> ramda
> lodash
> immutable-js (have some similar functions)

The design philosophies of Ramda and Immutable are too different to warrant a comparison, I think. Might be worth comparing to es-toolkit though.

> Perhaps the bench helper could have an option to choose whether to run comparative implementations or only radashi.

That might require more effort than it's worth. 🤔 Also, if we did put comparative benchmarks in the same file as normal benchmarks, we couldn't assume that lodash et al are installed, because the template repository I'm working on doesn't have them as dependencies, which means copying comparative benchmarks into “your own Radashi” would be troublesome. (Note: I'll be writing a post about the template repository soon)

aleclarson commented 3 months ago

> Perf tracking in CI would be great. Something along the lines of benchmark-action/github-action-benchmark

We have a GitHub workflow (see here) using Codspeed's Vitest plugin, but it's currently disabled until I have time to fix the formatting issues. It seems when Codspeed comments on a PR, its performance report doesn't include the describe("sum", …) description, making it hard to read. I'll be experimenting with this sometime next week, I think.

aleclarson commented 1 month ago

Quick update:

We now have benchmark reports for PRs (see here for an example). It's a hand-rolled solution using Vitest benchmarking and GitHub actions (the script used is here).

And here's my proposal for comparative benchmarks:

Add a comparisons folder, containing a package.json (with only lodash for now) and a single comparisons.bench.ts file, containing all comparable functionality.
My reason to keep all comparison benchmarks in one file is that benchmarks are not a lot of code, so there's nothing(?) to gain from spreading them out in this case. It'll provide a nice overview of the overlap between Radashi and Lodash (and eventually, other similar libraries).
Another reason to keep them in one file: We're not limited to strict one-to-one comparisons. For example, we could compare how to accomplish a complex task with both libraries, even if that involves multiple functions.
Update the pnpm bench command to use a custom shell script that checks whether comparisons/node_modules exists (those dependencies are not installed by default when running pnpm install from the project root). If they are installed, include the comparative benchmarks by passing them into vitest bench along with benchmarks/**/*.
The comparisons folder can have Markdown files in it as well, comparing Radashi functions to other alternatives, like @MarlonPassos-git suggested in this comment.
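The decision the proposed bench script has to make can be sketched as a small pure function. This is only an illustration written in TypeScript rather than shell: `buildBenchArgs` is a hypothetical name, the existence check is injected so the logic can be exercised without touching the file system, and how Vitest treats the resulting arguments depends on the project's own configuration.

```typescript
// Paths taken from the proposal above.
const DEFAULT_BENCHMARKS = 'benchmarks/**/*'
const COMPARISONS_DIR = 'comparisons'

// Hypothetical helper: build the argument list for `vitest bench`.
// `exists` is injected (for example `fs.existsSync` in a real script).
function buildBenchArgs(exists: (path: string) => boolean): string[] {
  const args = ['bench', DEFAULT_BENCHMARKS]
  // Only add the comparative benchmarks when their dependencies
  // have been installed inside the comparisons folder.
  if (exists(`${COMPARISONS_DIR}/node_modules`)) {
    args.push(`${COMPARISONS_DIR}/comparisons.bench.ts`)
  }
  return args
}
```

Keeping the check in one place means contributors who never install the comparison dependencies still get the regular benchmarks from the same command.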

If anyone wants to tackle this, let me know or just assign yourself to this issue. Preferably, leave some time for feedback from others in the community, in case anyone has objections or ideas for improvement.

MarlonPassos-git commented 1 month ago

OK, I can work on that. Before I start, I'd like to clarify what the standard would be.

Should every benchmark we currently have for Radashi get a comparative version? For example, clamp has only one bench, so it would be something like this:

import { bench, describe } from 'vitest'
import * as lodash from 'lodash'
import * as radashi from 'radashi'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash }
]

describe.each(comparativeLibs)('function clamp in the library: $name', ({ lib }) => {
  // The calls are synchronous, so the bench callback does not need to be async.
  bench('with no arguments', () => {
    lib.clamp(100, 0, 10)
    lib.clamp(0, 10, 100)
    lib.clamp(5, 0, 10)
  })
})

But the max function has two benches, so following the same logic it would look like this:

import { bench, describe } from 'vitest'
import * as lodash from 'lodash'
import * as radashi from 'radashi'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash }
]

describe.each(comparativeLibs)('function max in the library: $name', ({lib}) => {
  bench('with list of numbers', () => {
    const list = [5, 5, 10, 2]
    lib.max(list)
  })

  bench('with list of objects', () => {
    const list = [
      { game: 'a', score: 100 },
      { game: 'b', score: 200 },
      { game: 'c', score: 300 },
      { game: 'd', score: 400 },
      { game: 'e', score: 500 },
    ]
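    // Note: lodash's max takes no iteratee (the second argument is ignored);
    // its counterpart for this scenario is maxBy.
    lib.max(list, x => x.score)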
  })
})

or instead of repeating the describe we can create multiple benches for each lib. This way the output shows which lib is faster:

import { bench, describe } from 'vitest'
import * as lodash from 'lodash'
import * as radashi from 'radashi'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash }
]

describe("clamp", () => {
    for (const {name, lib} of comparativeLibs) {
        bench(`${name}: with no arguments`, () => {
          lib.clamp(100, 0, 10)
          lib.clamp(0, 10, 100)
          lib.clamp(5, 0, 10)
        })
    }
})

import { bench, describe } from 'vitest'
import * as lodash from 'lodash'
import * as radashi from 'radashi'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash }
]

describe("max", () => {
    for (const {name, lib} of comparativeLibs) {
        bench(`${name}: with list of numbers`, () => {
            const list = [5, 5, 10, 2]
            lib.max(list)
        })

        bench(`${name}: with list of objects`, () => {
            const list = [
                { game: 'a', score: 100 },
                { game: 'b', score: 200 },
                { game: 'c', score: 300 },
                { game: 'd', score: 400 },
                { game: 'e', score: 500 },
            ]
            // Note: lodash needs maxBy here; its max ignores the getter.
            lib.max(list, x => x.score)
        })
    }
})

aleclarson commented 1 month ago

@MarlonPassos-git I think we'll want one describe per scenario, so the "BENCH Summary" report is only comparing identical scenarios.

~I'm not sure about doing the for loop, since Lodash (and eventually other libraries) may have differing APIs and I'd rather use them directly instead of needing a compatibility shim. Of course, that means more manual work, which sucks, but it doesn't seem avoidable?~

^ Nevermind on all that. I think a basic == comparison with lodash should suffice for cases where the APIs are different. See the dash example below.

So, to be clear, I'm thinking something like this:

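import { bench, describe } from 'vitest'
import * as lodash from 'lodash'
import * as radashi from 'radashi'
import { isObject } from 'radashi'

const libs = {radashi, lodash} as const

type Library = typeof libs[keyof typeof libs]
type Benchmark = (_: Library) => void

// `keyof typeof radashi` because radashi is a namespace import (a value, not a type);
// Partial because not every Radashi function will have a comparison.
const benchmarks: Partial<Record<keyof typeof radashi, Benchmark | Record<string, Benchmark>>> = {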
  dash: _ => {
    const input = 'TestString123 with_MIXED_CASES, special!@#$%^&*()Characters, and numbers456'
    if (_ == lodash) {
      _.kebabCase(input)
    } else {
      _.dash(input)
    }
  },
  max: {
    'with numbers': _ => {
      const list = [5, 5, 10, 2]
      _.max(list)
    },
    'with objects': _ => {
      const list = [
        { game: 'a', score: 100 },
        { game: 'b', score: 200 },
        { game: 'c', score: 300 },
        { game: 'd', score: 400 },
        { game: 'e', score: 500 },
      ]
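      // Note: lodash would need maxBy here (its max ignores the getter),
      // so this scenario needs the same == check used for dash.
      _.max(list, x => x.score)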
    }
  },
}

for (const [funcName, run] of Object.entries(benchmarks)) {
  describe(funcName, () => {
    if (isObject(run)) {
      const tests = Object.entries(run)
      for (const [testName, run] of tests) {
        for (const [libName, lib] of Object.entries(libs)) {
          bench(`${libName}: ${testName}`, () => run(lib))
        }
      }
    } else {
      for (const [libName, lib] of Object.entries(libs)) {
        bench(libName, () => run(lib))
      }
    }
  })
}

Also, I think we could hoist the test values with basic labels like listOfNumbers or weirdString so they can be reused between scenarios.
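A possible way to combine hoisted test values with one describe per scenario, sketched without Vitest so it runs on its own. `planSuites`, `Suite`, and the stand-in libraries `libA`/`libB` are hypothetical names for illustration; only `listOfNumbers` follows the naming suggested above.

```typescript
// Hoisted test value, reusable across scenarios.
const listOfNumbers = [5, 5, 10, 2]

type Run<L> = (lib: L) => void
type Benchmarks<L> = Record<string, Run<L> | Record<string, Run<L>>>

interface Suite {
  /** Becomes the describe() name, one per scenario. */
  name: string
  /** One bench() per library, all running the same scenario. */
  cases: { label: string; run: () => void }[]
}

// Flatten the benchmark map so each scenario gets its own suite.
function planSuites<L>(libs: Record<string, L>, benchmarks: Benchmarks<L>): Suite[] {
  const suites: Suite[] = []
  for (const [funcName, entry] of Object.entries(benchmarks)) {
    // A bare function is a single scenario named after the function itself.
    const scenarios: [string, Run<L>][] =
      typeof entry === 'function'
        ? [[funcName, entry]]
        : Object.entries(entry).map(
            ([label, run]): [string, Run<L>] => [`${funcName}: ${label}`, run],
          )
    for (const [name, run] of scenarios) {
      suites.push({
        name,
        cases: Object.entries(libs).map(([libName, lib]) => ({
          label: libName,
          run: () => run(lib),
        })),
      })
    }
  }
  return suites
}

// Stand-in libraries so the sketch runs without radashi or lodash installed.
type MaxLib = { max: (list: number[]) => number | undefined }
const exampleLibs: Record<string, MaxLib> = {
  libA: { max: list => Math.max(...list) },
  libB: { max: list => [...list].sort((a, b) => b - a)[0] },
}

const suites = planSuites(exampleLibs, {
  max: {
    'with numbers': _ => {
      _.max(listOfNumbers)
    },
  },
})
```

In a real bench file, each suite would presumably map to `describe(suite.name, ...)` with one `bench(label, run)` call per case, so a summary for a given describe only ranks libraries against each other on the same scenario.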

aleclarson commented 1 month ago

@MarlonPassos-git You'll probably find this useful: https://gist.github.com/aleclarson/a7198339c0a68991cb6c94cf9d60fa29. It's the Lodash comparison data I've collected so far.

radashi-org / radashi

Comparative benchmarks #130