Open aleclarson opened 4 months ago
Are you thinking about comparing specific functions (for example, the clone function documentation having a section that compares it with clone functions from other libs), or about something more general? Also, would this be a .md file, part of the README, or a documentation page?
@MarlonPassos-git The idea you're describing might deserve its own discussion, but I am referring to performance benchmarks. That means adding bench calls to our benchmarks for competing libraries with lookalike functions.
@aleclarson Is it something like this you have in mind?
(Apologies for the unflattering example 😅)
https://gist.github.com/crishoj/a6396844f88e212e911893b49b5c54de
Hey @crishoj, thinking about it now, I would prefer all of the comparative benchmarks to be kept in one module, rather than in the function-specific benchmark files (which are intended to detect perf regressions, so they will run on any PR that modifies that particular function).
Also, I wonder if we shouldn't leave Radash out of the comparisons, since it's unmaintained. 🤔
For me, it really doesn't make sense to keep benchmarks for libs that are not being maintained (Radash or Underscore). The ones that come to mind for comparison are:
ramda
lodash
immutable-js (has some similar functions)
Perhaps the bench helper could have an option to choose whether to run comparative implementations or only radashi.
Perf tracking in CI would be great. Something along the lines of https://github.com/benchmark-action/github-action-benchmark
The design philosophies of Ramda and Immutable are too different to warrant a comparison, I think. Might be worth comparing to es-toolkit though.
Perhaps the bench helper could have an option to choose whether to run comparative implementations or only radashi.
That might require more effort than it's worth. 🤔 Also, if we did put comparative benchmarks in the same file as normal benchmarks, we'd have to not assume that lodash et al are installed, because the template repository I'm working on doesn't have them as dependencies, which means copying comparative benchmarks into “your own Radashi” would be troublesome. (Note: I'll be writing a post about the template repository soon)
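If comparative benchmarks ever did need to live next to code that cannot assume lodash is installed, one option would be a guarded dynamic import. The following is only a minimal sketch of that idea; tryImport is a hypothetical helper name, not something that exists in Radashi or Vitest.

```typescript
// Hypothetical helper: resolve to the module if it is installed, or to null
// if importing it fails (for example, because the package is missing).
async function tryImport<T = unknown>(name: string): Promise<T | null> {
  try {
    return (await import(name)) as T
  } catch {
    return null
  }
}

// Assumed usage: skip the comparison when the optional library is absent.
async function loadOptionalLodash(): Promise<unknown | null> {
  return tryImport('lodash')
}
```

A benchmark file could then register the lodash benches only when loadOptionalLodash() resolves to a module, at the cost of making the file asynchronous.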
Perf tracking in CI would be great. Something along the lines of benchmark-action/github-action-benchmark
We have a GitHub workflow (see here) using Codspeed's Vitest plugin, but it's currently disabled until I have time to fix the formatting issues. It seems when Codspeed comments on a PR, its performance report doesn't include the describe("sum", …) description, making it hard to read. I'll be experimenting with this sometime next week, I think.
Quick update:
And here's my proposal for comparative benchmarks:
Add a comparisons folder, containing a package.json (with only lodash for now) and a single comparisons.bench.ts file, containing all comparable functionality.
Update the pnpm bench command to use a custom shell script that checks if comparisons/node_modules exists (its dependencies are not installed by default when running pnpm install from the project root), and if the dependencies are installed, include the comparative benchmarks by passing them into vitest bench along with benchmarks/**/*.
The comparisons folder can have Markdown files in it as well, comparing Radashi functions to other alternatives, like @MarlonPassos-git suggested in this comment.
If anyone wants to tackle this, let me know or just assign yourself to this issue. Preferably, leave some time for feedback from others in the community, in case anyone has objections or ideas for improvement.
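The check described in the second item could be sketched roughly as follows. This is written as a small Node script in TypeScript instead of a shell script, purely for illustration; benchArgs is a hypothetical name, and the paths are copied from the proposal above, not from the repository.

```typescript
import { existsSync } from 'node:fs'
import { join } from 'node:path'

// Build the arguments for `vitest bench`. The comparative benchmarks are only
// added when comparisons/node_modules exists, i.e. when their dependencies
// were installed separately. `exists` is injectable so the decision can be
// checked without touching the file system.
function benchArgs(
  root: string,
  exists: (path: string) => boolean = existsSync,
): string[] {
  // 'benchmarks/**/*' is the pattern as written in the proposal.
  const args = ['bench', 'benchmarks/**/*']
  if (exists(join(root, 'comparisons', 'node_modules'))) {
    args.push('comparisons/comparisons.bench.ts')
  }
  return args
}

// Assumed wiring, not taken from the repository:
//   spawnSync('pnpm', ['exec', 'vitest', ...benchArgs(process.cwd())], { stdio: 'inherit' })
```

How Vitest interprets these positional arguments (as filters or as globs) would need to be verified before relying on this.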
OK, I can work on that. Before I start, I would like to clarify what the standard would be.
Should every benchmark we currently have in radashi get a comparative version? For example, clamp only has one bench, so it would be something like this:
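import * as lodash from 'lodash'
import * as radashi from 'radashi'
import { bench, describe } from 'vitest'

// Libraries under comparison; both expose clamp(number, lower, upper).
const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash },
]

// One describe block per library, so each library is reported separately.
describe.each(comparativeLibs)('function clamp in the library: $name', ({ lib }) => {
  bench('with no arguments', () => {
    lib.clamp(100, 0, 10)
    lib.clamp(0, 10, 100)
    lib.clamp(5, 0, 10)
  })
})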
But the function max has two benches, so following the same logic it would be something like this:
import * as lodash from 'lodash'
import * as radashi from 'radashi'
import { bench, describe } from 'vitest'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash },
]

describe.each(comparativeLibs)('function max in the library: $name', ({ lib }) => {
  bench('with list of numbers', () => {
    const list = [5, 5, 10, 2]
    lib.max(list)
  })
  bench('with list of objects', () => {
    const list = [
      { game: 'a', score: 100 },
      { game: 'b', score: 200 },
      { game: 'c', score: 300 },
      { game: 'd', score: 400 },
      { game: 'e', score: 500 },
    ]
    // Note: lodash's max ignores a second argument; its getter form is maxBy.
    lib.max(list, x => x.score)
  })
})
Or, instead of repeating the describe, we can create one bench per lib inside a single describe. This way the output shows which lib is faster:
import * as lodash from 'lodash'
import * as radashi from 'radashi'
import { bench, describe } from 'vitest'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash },
]

// A single describe holds one bench per library, so they are compared directly.
describe('clamp', () => {
  for (const { name, lib } of comparativeLibs) {
    bench(`${name}: with no arguments`, () => {
      lib.clamp(100, 0, 10)
      lib.clamp(0, 10, 100)
      lib.clamp(5, 0, 10)
    })
  }
})
// The same approach applied to max, which has two benches per library.
import * as lodash from 'lodash'
import * as radashi from 'radashi'
import { bench, describe } from 'vitest'

const comparativeLibs = [
  { name: 'radashi', lib: radashi },
  { name: 'lodash', lib: lodash },
]

describe('max', () => {
  for (const { name, lib } of comparativeLibs) {
    bench(`${name}: with list of numbers`, () => {
      const list = [5, 5, 10, 2]
      lib.max(list)
    })
    bench(`${name}: with list of objects`, () => {
      const list = [
        { game: 'a', score: 100 },
        { game: 'b', score: 200 },
        { game: 'c', score: 300 },
        { game: 'd', score: 400 },
        { game: 'e', score: 500 },
      ]
      // Note: lodash's max ignores a second argument; its getter form is maxBy.
      lib.max(list, x => x.score)
    })
  }
})
@MarlonPassos-git I think we'll want one describe per scenario, so the "BENCH Summary" report is only comparing identical scenarios.
~I'm not sure about doing the for loop, since Lodash (and eventually other libraries) may have differing APIs and I'd rather use them directly instead of needing a compatibility shim. Of course, that means more manual work, which sucks, but it doesn't seem avoidable?~
^ Never mind on all that. I think a basic == comparison with lodash should suffice for cases where the APIs are different. See the dash example below.
So, to be clear, I'm thinking something like this:
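import * as lodash from 'lodash'
import * as radashi from 'radashi'
import { isObject } from 'radashi'
import { bench, describe } from 'vitest'

const libs = { radashi, lodash } as const
type Library = (typeof libs)[keyof typeof libs]
type Benchmark = (_: Library) => void

// Keyed by Radashi function name. A value is either a single benchmark or a
// map of scenario name to benchmark. Partial, since not every function is covered.
const benchmarks: Partial<
  Record<keyof typeof radashi, Benchmark | Record<string, Benchmark>>
> = {
  dash: _ => {
    const input = 'TestString123 with_MIXED_CASES, special!@#$%^&*()Characters, and numbers456'
    // The APIs differ here, so branch on the library being benchmarked.
    if (_ === lodash) {
      _.kebabCase(input)
    } else {
      _.dash(input)
    }
  },
  max: {
    'with numbers': _ => {
      const list = [5, 5, 10, 2]
      _.max(list)
    },
    'with objects': _ => {
      const list = [
        { game: 'a', score: 100 },
        { game: 'b', score: 200 },
        { game: 'c', score: 300 },
        { game: 'd', score: 400 },
        { game: 'e', score: 500 },
      ]
      // Note: lodash's max ignores a second argument; its getter form is maxBy.
      _.max(list, x => x.score)
    },
  },
}

for (const [funcName, run] of Object.entries(benchmarks)) {
  describe(funcName, () => {
    if (isObject(run)) {
      // Several named scenarios: register one bench per scenario and library.
      for (const [testName, runTest] of Object.entries(run)) {
        for (const [libName, lib] of Object.entries(libs)) {
          bench(`${libName}: ${testName}`, () => runTest(lib))
        }
      }
    } else {
      // A single scenario: register one bench per library.
      for (const [libName, lib] of Object.entries(libs)) {
        bench(libName, () => run(lib))
      }
    }
  })
}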
Also, I think we could hoist the test values with basic labels like listOfNumbers or weirdString so they can be reused between scenarios.
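To make the two ideas above concrete (hoisted test values, and one describe per scenario), here is a minimal sketch. The libs object uses two tiny stand-in implementations instead of radashi and lodash so that the snippet is self-contained; planSuites, fixtures, and the stand-in names are all hypothetical.

```typescript
// Hoisted inputs with basic labels, shared by every scenario that needs them.
const fixtures = {
  listOfNumbers: [5, 5, 10, 2],
}

// Stand-ins for the real libraries (assumption: illustration only).
type Lib = {
  max(list: number[]): number | null
  min(list: number[]): number | null
}
const libs: Record<string, Lib> = {
  libA: {
    max: list => (list.length ? Math.max(...list) : null),
    min: list => (list.length ? Math.min(...list) : null),
  },
  libB: {
    max: list => (list.length ? list.reduce((a, b) => (b > a ? b : a)) : null),
    min: list => (list.length ? list.reduce((a, b) => (b < a ? b : a)) : null),
  },
}

type Scenario = (lib: Lib) => unknown
const scenarios: Record<string, Record<string, Scenario>> = {
  max: { 'with numbers': lib => lib.max(fixtures.listOfNumbers) },
  min: { 'with numbers': lib => lib.min(fixtures.listOfNumbers) },
}

interface Suite {
  name: string
  benches: { name: string; run: () => unknown }[]
}

// Flatten the scenarios into one suite per scenario, each holding one bench
// per library, so a summary only ever compares identical scenarios.
function planSuites(): Suite[] {
  const suites: Suite[] = []
  for (const [funcName, tests] of Object.entries(scenarios)) {
    for (const [testName, scenario] of Object.entries(tests)) {
      suites.push({
        name: `${funcName}: ${testName}`,
        benches: Object.entries(libs).map(([libName, lib]) => ({
          name: libName,
          run: () => scenario(lib),
        })),
      })
    }
  }
  return suites
}

// With Vitest, each planned suite would map onto one describe block:
//   for (const suite of planSuites()) {
//     describe(suite.name, () => {
//       for (const b of suite.benches) bench(b.name, b.run)
//     })
//   }
```

Keeping the planning step separate from the Vitest calls is just one possible design; it makes the grouping easy to inspect without running any benchmarks.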
@MarlonPassos-git You'll probably find this useful: https://gist.github.com/aleclarson/a7198339c0a68991cb6c94cf9d60fa29. It's the Lodash comparison data I've collected so far.
Although performance isn't the only way we're competing with Lodash, it'd be great to have perf comparisons with Lodash and other similar libraries wherever we cover the same use cases.
We don't need to compare ourselves with FP libraries, since we're not actually competing with them. We don't need to compare with Underscore, since it's legacy at this point. Any compared libraries should have 1K+ stars on GitHub (maybe more).