denoland / std

The Deno Standard Library
https://jsr.io/@std
MIT License
3.24k stars 622 forks source link

cleanup semver #3948

Closed timreichen closed 5 months ago

timreichen commented 11 months ago

Background

There are several inconsistencies in the semver mod:

File names and exported functions

The semver mod is not well structured. The file names and functions are not consistent. range_format.ts and comparator_format.ts uses [*_verb].ts and *Format(), but parse_range.ts and parse_comparator.ts uses [verb_*].ts and parse*(). Another inconsistency is: parse_comparator.ts and parse_range.ts but is_semver_comparator.ts and is_semver_range.ts. The _semver in the name seems obsolete.

eq, neq, gt, gte, lt, lte

These functuons are wrapper functions for compare() compare the result to 0 with different comparators: return compare(s0, s1) [===|!==|>|>=|<|<=] 0. These seem too trivial.

rcompare

This function calls compare() with switched arguments. This seems too trivial.

Proposal

File names and exported functions

Rename [*_verb].ts to [verb_*].ts and *Format() to format*() for all filenames and functions and deprecate old functions and remove _semver from file names and function:

eq, neq, gt, gte, lt, lte

@iuioiua @kt3k WDYT?

iuioiua commented 11 months ago
  • range_format.ts => format_range.ts
  • rangeFormat() => formatRange()
  • comparator_format.ts => format_comparator.ts
  • comparatorFormat() => formatComparator()

I mostly agree. But I think these functions should probably have a stringify prefix instead. The strings returned in other functions with "format" are usually configurable. WDYT?

  • is_semver_comparator.ts => is_comparator.ts
  • isSemVerComparator() => isComparator()
  • is_semver_range.ts => is_range.ts
  • isSemVerRange() => isRange()

I agree.

eq, neq, gt, gte, lt, lte

deprecate all these functions

I weakly agree. However, doing so would require the dev to learn these comparison concepts. Happy to hear other opinions.

rcompare

deprecate this function

I agree.

timreichen commented 11 months ago

I mostly agree. But I think these functions should probably have a stringify prefix instead. The strings returned in other functions with "format" are usually configurable. WDYT?

I agree but the current format() implementation in format.ts is kinda configurable with where on can pass FormatStyle. Maybe this should also be renamed to stringify and instead of passing a style arg, just stringify what is passed? format(semver, "primary") => stringify({major: 1, minor: 0, patch: 0}) format(semver, "build") => stringify({ build: ["x", "y", "z" }) etc.

eq, neq, gt, gte, lt, lte

deprecate all these functions

I weakly agree. However, doing so would require the dev to learn these comparison concepts. Happy to hear other opinions.

I agree to gather more opinions. For most users this might already be known concept as in localeCompare or Intl.Collator compare.

timreichen commented 11 months ago

@iuioiua I have a question about rangeMin, rangeMax, comparatorMin, comparatorMax: Their name also don't follow the naming convention but is also not descriptive on what they actually do. I think we should also rename them. Do you have an idea to what? Maybe something like minForRange?

timreichen commented 11 months ago

outside as a function name seems weird. Normally in js it is positively formulated: includes, contains etc. I think we should rename that to rangeIncludes and reverse the output boolean.

timreichen commented 11 months ago

eq, neq, gt, gte, lt, lte

deprecate all these functions

I weakly agree. However, doing so would require the dev to learn these comparison concepts. Happy to hear other opinions.

I think this would be not too hard to achieve because we can write some

/**
  * @deprecated (...)
  * use ```ts
  compare(...) > 0
  ``` instead.
  */

or similar into the deprecation note.

timreichen commented 11 months ago

testRange() is kinda a misnomer. I think that should be called inRange or something similar. Opinions?

iuioiua commented 11 months ago

@iuioiua I have a question about rangeMin, rangeMax, comparatorMin, comparatorMax: Their name also don't follow the naming convention but is also not descriptive on what they actually do. I think we should also rename them. Do you have an idea to what? Maybe something like minForRange?

These sound fine how they are to me.

outside as a function name seems weird. Normally in js it is positively formulated: includes, contains etc. I think we should rename that to rangeIncludes and reverse the output boolean.

I agree.

testRange() is kinda a misnomer. I think that should be called inRange or something similar. Opinions?

I agree.

eq, neq, gt, gte, lt, lte

deprecate all these functions

What if we had a SemVer class that implemented a well-thought-out valueOf() method such that we can do these comparison operators directly instead of using functions? Perhaps, an idea worth exploring.

Before:

import { gt, parse } from "https://deno.land/std/semver/mod.ts"

const s0 = parse(v0);
const s1 = parse(v1);
gt(s0, s1);

After:

import { SemVer } from "https://deno.land/std/semver/mod.ts"

const s0 = new SemVer(v0);
const s1 = new SemVer(v1);
s0 > s1;
timreichen commented 11 months ago

eq, neq, gt, gte, lt, lte

deprecate all these functions

What if we had a SemVer class that implemented a well-thought-out valueOf() method such that we can do these comparison operators directly instead of using functions? Perhaps, an idea worth exploring.

Before:

import { gt, parse } from "https://deno.land/std/semver/mod.ts"

const s0 = parse(v0);
const s1 = parse(v1);
gt(s0, s1);

After:

import { SemVer } from "https://deno.land/std/semver/mod.ts"

const s0 = new SemVer(v0);
const s1 = new SemVer(v1);
s0 > s1;

I am not for a SemVer class approach. This is how the implementation initially was and we moved away because a semver can be represented by a native object and we got overlaps and special case handlings for class instances as well as class instantiation overhead for the simplest things.

timreichen commented 11 months ago

@iuioiua I think it would not be a big deal to change from

eq(s0, s1)
neq(s0, s1)
gt(s0, s1)
gte(s0, s1)
lt(s0, s1)
lte(s0, s1)

to

compare(s0, s1) === 0
compare(s0, s1) !== 0
compare(s0, s1) > 0
compare(s0, s1) >= 0
compare(s0, s1) < 0
compare(s0, s1) <= 0

with enough time to switch.

How should we go about it?

iuioiua commented 11 months ago

@iuioiua I think it would not be a big deal to change from

eq(s0, s1)
neq(s0, s1)
gt(s0, s1)
gte(s0, s1)
lt(s0, s1)
lte(s0, s1)

to

compare(s0, s1) === 0
compare(s0, s1) !== 0
compare(s0, s1) > 0
compare(s0, s1) >= 0
compare(s0, s1) < 0
compare(s0, s1) <= 0

with enough time to switch.

How should we go about it?

I feel two ways about this. On one hand, I like a smaller, simpler API. It'd reduce engineering costs. Also, this makes comparisons more "organic" and closer to comparing numbers, which I like. However, you can argue that this might be a slightly poorer DX. I'd be keen to hear other's opinions.

timreichen commented 11 months ago

I feel two ways about this. On one hand, I like a smaller, simpler API. It'd reduce engineering costs. Also, this makes comparisons more "organic" and closer to comparing numbers, which I like. However, you can argue that this might be a slightly poorer DX. I'd be keen to hear other's opinions.

I posted this issue on discord for more opinions.

In addition, removing them would also solve the issue of their current names, since std functions should not use abbreviations and acronyms if possible. In std we have multiple other mods who use equal, not eq (assertEquals, assertAlmostEquals for example). Same with neq.

marvinhagemeister commented 11 months ago

I like the generic compare function because it would allow us to use it for Array.prototype.sort() as is.

iuioiua commented 11 months ago

@marvinhagemeister, do you think compare() is good enough on its own to justify removing eq() and others?

I'm leaning more in favour of this change now. I do still understand those that want to use eq(), etc.

marvinhagemeister commented 11 months ago

I'd add the compare function and base the other's internally on that. I think there isn't a reason enough to do a breaking API change.

timreichen commented 11 months ago

@marvinhagemeister that is exactly what we have now: We have compare and based the other functions on that. The question is more about if these functions should exist at all because the are one-liner abstractions and sorta bloat the api. It would not be a breaking change, if anything we would deprecate them first with a deprecation note on how to replace them with compare().

jsejcksn commented 11 months ago

On the dedicated comparator functions:

eq, neq, gt, gte, lt, lte

These functuons are wrapper functions for compare() compare the result to 0 with different comparators: return compare(s0, s1) [===|!==|>|>=|<|<=] 0. These seem too trivial. … deprecate all these functions

I agree with some of the things that were already posted:

🔗 you can argue that this might be a slightly poorer DX

🔗 doing so would require the dev to learn these comparison concepts.

Yet I also agree that the functions might not belong as top-level exports.

I think if they are removed, that examples should be added to the compare function's documentation for each existing comparison case:

 /**
  * Compare two semantic version objects.
  *
  * Returns `0` if `v1 === v2`, or `1` if `v1` is greater, or `-1` if `v2` is
  * greater.
  *
  * Sorts in ascending order if passed to `Array.sort()`.
+ *
+ * The number returned by `compare` can then be compared to `0` in order to
+ * determine some common cases:
+ * - `compare(a, b) === 0` a equal to b
+ * - `compare(a, b) !== 0` a not equal to b
+ * - `compare(a, b) < 0`   a less than b
+ * - `compare(a, b) <= 0`  a less than or equal to b
+ * - `compare(a, b) > 0`   a greater than b
+ * - `compare(a, b) >= 0`  a greater than or equal to b
  */
 export function compare(
   s0: SemVer,
   s1: SemVer,
 ): 1 | 0 | -1 {

Aside: After looking at the documentation, I also think that using the convention of a and b as parameter names will lead to less confusion than numbered parameter names (e.g. v1/v2 or s0/s1) because of the already-existing context of numbers and comparison.

If we decide to keep them, I wonder if they could become methods on the compare function instead of top-level exports. For example:

export const compare:
  & ((a: SemVer, b: SemVer) => 1 | 0 | -1)
  & Record<
    | "eq" // or "equal", etc.
    | "gt" // or "greaterThan", etc.
    | "gte" // or "greaterThanOrEqual", etc.
    | "lt" // or "lessThan", etc.
    | "lte" // or "lessThanOrEqual", etc.
    | "neq", // or "notEqual", etc.
    (a: SemVer, b: SemVer) => boolean
  > = …
const a = { major: 2 } as SemVer;
const b = { major: 1 } as SemVer;
assert(compare(a, b) === 1);
assert(compare.eq(a, b) === false);
assert(compare.gt(a, b) === true);
assert(compare.gte(a, b) === true);
assert(compare.lt(a, b) === false);
assert(compare.lte(a, b) === false);
assert(compare.neq(a, b) === true);

See this TypeScript Playground example for a working demonstration.

timreichen commented 11 months ago

While I like the look of that approach, I am not sure if it is good practice to extend a function like that. It kinda circumvents the std single export per file approach. If we want to keep them, I think it would be better to rename them to their full name to align to the rest of std and web apis instead.

iuioiua commented 11 months ago

I'm 50/50 on keeping eq() and friends. Though, if we do, I agree that we should rename them to their full names.

timreichen commented 11 months ago

I had another look at the mod. There are also ltr, gtr, testRange and outside which all can be condensed into one compareRange function. I am still leaning towards deprecating eq and friends. If not, we run into the problem that we also should implement lter, gter, eqr, neqr, etc. for completeness which will bloat the mod even more.

iuioiua commented 11 months ago

Hm... Maybe. I prefer a smaller API. It'd be easier to manage and make high-quality. Low-value abstractions are not attractive to me. Happy to hear other opinions.

timreichen commented 11 months ago

@iuioiua How about we deprecate them and reintroduce them with proper names at a later point if there are good arguments to have them? I could imagine, that nobody would miss them if they are gone and we provide good examples with compare...

timreichen commented 11 months ago

Another inconsistency I found: consider

export const OPERATORS = [
  "",
  "=",
  "==",
  "===",
  "!==",
  "!=",
  ">",
  ">=",
  "<",
  "<=",
] as const;

Should we support "", "==", "===", "!=" and "!==" as comparators? the comparator regexp doesn't recognize them:

const COMPARATOR = "(?:<|>)?=?";

this means ==1.2.3 is not a valid comparator for parseComparator() but is in testComparator() aka cmp().

@iuioiua WDYT?

iuioiua commented 11 months ago

@iuioiua How about we deprecate them and reintroduce them with proper names at a later point if there are good arguments to have them? I could imagine, that nobody would miss them if they are gone and we provide good examples with compare...

I'm cool with that. It'd be another step away from node-semver's API, but IMO, that's a good thing. Yoshiya's yet to comment on the idea, so no guarantee it'll go through.

Another inconsistency I found: consider

export const OPERATORS = [
  "",
  "=",
  "==",
  "===",
  "!==",
  "!=",
  ">",
  ">=",
  "<",
  "<=",
] as const;

Should we support "", "==", "===", "!=" and "!==" as comparators? the comparator regexp doesn't recognize them:

const COMPARATOR = "(?:<|>)?=?";

this means ==1.2.3 is not a valid comparator for parseComparator() but is in testComparator() aka cmp().

@iuioiua WDYT?

Interesting. I think these operators are included for "completeness" (see cmp() description here). IMO, these added operators just add confusion. I like the idea of getting rid of them.

It's funny. The more we dive into the semver API, the more API bloat we find.

timreichen commented 11 months ago

Note: != is an shorthand for < and >: !=1.0.0 = <1.0.0 && >1.0.0

This would also make it easier to define, what exactly a Comparator, a Range and a RangeSet is:


interface SemVer {
  major: number;
  minor: number;
  patch: number;
  prerelease?: (string | number)[];
  build?: string[];
}

/**
 * a SemVer with an operator
 * @example "<1.2.3" => { operator: "<", major: 1, minor: 2, patch: 3 }
 */
interface Comparator extends SemVer {
  operator: "<" | "<=" | ">" | ">=";
}

/**
 * A Range is either one or two Comparators, describing a minimum and/or maximum version.
 * @example "<2.0.0" => [ { operator: "<", major: 2, minor: 0, patch: 0 } ]
 * @example ">=1.0.0 <2.0.0" => [ { operator: ">=", major: 1, minor: 0, patch: 0 }, { operator: "<", major: 2, minor: 0, patch: 0 } ]
 */
type Range = [Comparator] | [Comparator, Comparator];

/**
 * RangeSet are multiple ranges combined with OR.
 * @example ">=1.0.0 <2.0.0 || >3.0.0" => [ [ { operator: ">=", major: 1, minor: 0, patch: 0 }, { operator: "<", major: 2, minor: 0, patch: 0 } ], [ { operator: ">", major: 3, minor: 0, patch: 0 } ], ]
 */
type RangeSet = Range[];

RangeSet is what is called as Range in the current implementation. See https://github.com/denoland/deno_std/issues/3967

timreichen commented 11 months ago

Another thing: compareBuild() is public, parsePrerelease() is private in _shared.ts, as is parseBuild(). Is there any reason why compareBuild() is public?

Edit: compareBuild() seems to be a misnomer. It doesn't compare build alone, but SemVers with buildmetadata. I think this ought to be a option in compare() rather than a function.

iuioiua commented 11 months ago

Another thing: compareBuild() is public, parsePrerelease() is private in _shared.ts, as is parseBuild(). Is there any reason why compareBuild() is public?

Edit: compareBuild() seems to be a misnomer. It doesn't compare build alone, but SemVers with buildmetadata. I think this ought to be a option in compare() rather than a function.

Happy to have a look at a PR that does that.

timreichen commented 11 months ago

Happy to have a look at a PR that does that.

https://github.com/denoland/deno_std/pull/4065

kt3k commented 11 months ago

compare performs standard comparison. compareBuild performs non standard comparison, including build metadata. I think it's clearer to have a separate API for non standard one.

timreichen commented 11 months ago

compare performs standard comparison. compareBuild performs non standard comparison, including build metadata. I think it's clearer to have a separate API for non standard one.

Imo the name compareBuild implies comparing two build properties exclusively, not the whole semver with build. Else it would be called compareWithBuild. compare and compareBuild code is almost identical. Don't you think we can solve that with an option? We even can make the function more general and allow comparisons, excluding prerelease and build:

compare(s0, s1) // compares including prerelease
compare(s0, s1, { matcher: "prerelease" }) // same as default
compare(s0, s1, { matcher: "build" }) // compare including prerelease and build
compare(s0, s1, { matcher: "version" }) // compare excluding prerelease and build
kt3k commented 11 months ago

I don't like matcher: "version" idea either. Semver spec only defines one ordering and matcher: "version" is non standard interpretation of semver order.

Semver spec also says the below:

Build metadata MUST be ignored when determining version precedence. Thus two versions that differ only in the build metadata, have the same precedence. https://semver.org/#spec-item-10

I now feel that we should just deprecate compareBuild because that is something explicitly discouraged by the spec.

timreichen commented 11 months ago

I don't like matcher: "version" idea either. Semver spec only defines one ordering and matcher: "version" is non standard interpretation of semver order.

Semver spec also says the below:

Build metadata MUST be ignored when determining version precedence. Thus two versions that differ only in the build metadata, have the same precedence. https://semver.org/#spec-item-10

I now feel that we should just deprecate compareBuild because that is something explicitly discouraged by the spec.

I think that would be a good solution. WDYT @iuioiua ?

iuioiua commented 11 months ago

I agree. However, let's keep in mind that some devs may use compareBuild(). I doubt there are, though.

timreichen commented 11 months ago

A weird thing I encountered in outside.ts: The description says: Returns true if the version is outside the bounds of the range in either the high or low direction.

So imagine the following:

This returns false although the version is outside the range but in between:

So I wonder if that is a bug and why one would not just do lt(version, rangeMin(range)) or gt(version, rangeMax(range)). That seems clearer to me. So the whole function could be reduced to

 switch (hilo) {
    case ">":
      return gt(version, rangeMax(range));
    case "<":
      return lt(version, rangeMin(range));
    default:
      return gt(version, rangeMax(range)) || lt(version, rangeMin(range));
  }

which seems a weird function to have.

iuioiua commented 11 months ago

@timreichen, how does the node-semver package handle this use case?

timreichen commented 11 months ago

@timreichen, how does the node-semver package handle this use case?

import * as semver from "npm:semver";
console.log(semver.outside("2.5.0", ">= 1.0.0 < 2.0.0 || >=3.0.0 < 4.0.0", "<")); // false
console.log(semver.outside("2.5.0", ">= 1.0.0 < 2.0.0 || >=3.0.0 < 4.0.0", ">")); // false

This was directly ported, so it works as the current implementation. This seems to be a bug in my view, because 2.5.0 is not part of the range.

iuioiua commented 11 months ago

Yeah, that looks like a bug to me too. Even if npm:semver behaves in the same way.

timreichen commented 10 months ago

I looked once more over semver. reverseSort() seems to be for a very specialist use case. Do we want to keep it or remove that from the mod?

iuioiua commented 10 months ago

I looked once more over semver. reverseSort() seems to be for a very specialist use case. Do we want to keep it or remove that from the mod?

I'm for the removal. I saw it in my recent cleanups and thought the same thing.

Leokuma commented 9 months ago

All the functions take SemVer objects.

For improved ergonomics, could some (most) of them take string? Eg greaterThan(string, string).

timreichen commented 9 months ago

All the functions take SemVer objects.

For improved ergonomics, could some (most) of them take string? Eg greaterThan(string, string).

I am against this approach. these functions use semver objects internally, not strings. Let's say you use multiple function calls and pass a string each time: It would need to get parsed with every function call. One also might have already a semver object that can be passed, which you would need to format first so it then would be parsed again inside the function.

Leokuma commented 9 months ago

Let's say you use multiple function calls and pass a string each time: It would need to get parsed with every function call.

I understand that. But are there many use cases where end-users will be manipulating and passing SemVer objects around? I thought 95% of the time users would just call greaterThan() or testRange() and that's it. Maybe library authors have more complex needs?

I was wondering if internally we could use only compare(SemVer, SemVer) (to prevent unnecessary reparsing, as you mention), and expose greaterThan(string, string) and friends to users. Not sure if the signature difference would be a problem.

timreichen commented 9 months ago

Let's say you use multiple function calls and pass a string each time: It would need to get parsed with every function call.

I understand that. But are there many use cases where end-users will be manipulating and passing SemVer objects around? I thought 95% of the time users would just call greaterThan() or testRange() and that's it. Maybe library authors have more complex needs?

I think each function should do only one task, the one that is in described in the name. In that sense, adding parsing as a task to each function seems really superfluous.

I was wondering if internally we could use only compare(SemVer, SemVer) (to prevent unnecessary reparsing, as you mention), and expose greaterThan(string, string) and friends to users. Not sure if the signature difference would be a problem.

Imo it is trivial to have the user do greaterThan(parse(string), parse(string)).

jsejcksn commented 9 months ago

I fully agree with @timreichen and at the same time I think it's our responsibility to document common usage cases for inexperienced users: essentially answers to questions like "Why would I use this?", "How do I x?", etc.

Leokuma commented 9 months ago

I agree with your points from a technical perspective. But in practice, taking into account how this specific module will be used in the real world, I'm having difficulty to think of a use case where the user will have a SemVer object before having the string.

Currently I'm doing this:

greaterOrEqual(parse(Deno.version.deno), parse('1.40.5'))

It could be like this:

greaterOrEqual(Deno.version.deno, '1.40.5')

In the second case I only need to import greaterOrEqual; in the first case I have to import greaterOrEqual and parse, even though I'm using only 1 actual feature - I don't really need to parse a SemVer as I'm not manipulating it and I'm not calling any other function with the same SemVer object. The second case is arguably more readable too.

If we can't think of a use case where the user calls greaterOrEqual() without "inlining" parse() like greaterOrEqual(parse(string), parse(string)), then I think it's worth considering to have the function receive string.

The fetch API for example takes the URL as string. If users were forced to do new URL(), that would not be ideal ergonomics because most of the time the user has the URL as string.

Anyway, feel free to disagree. I'm happy just to share my points. I trust that you guys will take the best decision 👍.

timreichen commented 9 months ago

I agree with your points from a technical perspective. But in practice, taking into account how this specific module will be used in the real world, I'm having difficulty to think of a use case where the user will have a SemVer object before having the string.

Anyway, feel free to disagree. I'm happy just to share my points. I trust that you guys will take the best decision 👍.

I also see your point having convenience for simple use cases, though at the expense of a bit more abstraction and performance. std implementations should be as low to the core algorithms as possible, so your proposal seems to be a very good idea as a third party mod that is based on std imo.

timreichen commented 9 months ago

@iuioiua I wonder if it would make sense to move all comparator functions inside their counterpart range files, e.g. comparatorMin() -> range_min.ts, comparatorMax() -> range_max.ts etc. Seems like all these functions are only used in one file each and are covered by range tests.

iuioiua commented 9 months ago

@iuioiua I wonder if it would make sense to move all comparator functions inside their counterpart range files, e.g. comparatorMin() -> range_min.ts, comparatorMax() -> range_max.ts etc. Seems like all these functions are only used in one file each and are covered by range tests.

Yep. That makes sense to do.

iuioiua commented 5 months ago

It seems that everything here is complete. Thank you for all your input and diligence, @timreichen.