Open Kache opened 1 year ago
Any PR to resolve this should also include a test case which we expect to fail with the data from this issue, so we can guard against regressions.
I think the fix for this is actually pretty simple.
Mechanisms like sort -V
and ls
usage is already banned in the repo.
The solution I would propose is moving sort_versions
from
into lib/utils.bash
.
And then using sort_versions
in lib/command-list.bash
here:
like so:
-versions=$(list_installed_versions "$plugin_name")
+versions=$(list_installed_versions "$plugin_name" | sort_versions)
I am AFK at the moment so cannot PR this myself. Whoever gets to it, the PR must include test cases for what we expect to succeed and what we expect to fail.
Whatever solution we implement here, won't order the more complex versions from plugins like Java, but the current solution is just a glob pattern, so I guess this wouldn't technically be worse 🤔
Java plugin versions:
...
sapmachine-musl-19.0.1-beta
semeru-jre-openj9-8u302-b08_openj9-0.27.0
...
trava-11.0.15+1
zulu-6.2.0.9
...
I would alternatively suggest moving away from sort_versions()
in its current incarnation, and simplifying it as much as possible by utilizing the built-in features of tools. See my PR here for asdf-plugin-template, and issue comment here for a discussion on why I think this is advantageous.
In sort_versions()
specifically, there are three piped commands, and that function is being piped into from another command. The following were done on a base-model Macbook Air M1, with zsh-5.8.1.
# First downloaded tagged refs to avoid network bias
❯ git ls-remote --tags --refs "https://github.com/python/cpython" > python_versions.txt
# And again, but using git ls-remote's built-in sort
❯ git -c 'versionsort.suffix=a' -c 'versionsort.suffix=b' -c 'versionsort.suffix=r' -c 'versionsort.suffix=p' -c 'versionsort.suffix=-' -c 'versionsort.suffix=_' ls-remote --exit-code --tags --ref --sort="version:refname" "https://github.com/python/cpython" > python_sorted_versions.txt
# Gather times for both, dumping stdout to /dev/null, and running 100 times for each
❯ for i in {0..99}; do (time cat python_versions.txt | sed 'h; s/[+-]/./g; s/.p\([[:digit:]]\)/.z\1/; s/$/.z/; G; s/\n/ /' | LC_ALL=C sort -t. -k 1,1 -k 2,2n -k 3,3n -k 4,4n -k 5,5n | awk '{print $2}' >/dev/null) 2>> python_sort.txt; done
❯ for i in {0..99}; do (time cat python_sorted_versions.txt | awk -F'[/v]' '$NF ~ /^[0-9]+.*/ { print $NF }' >/dev/null) 2>> python_ls_remote.txt; done
# The result looks like this
❯ head python_sort.txt
cat python_versions.txt 0.00s user 0.00s system 71% cpu 0.001 total
sed 'h; s/[+-]/./g; s/.p\([[:digit:]]\)/.z\1/; s/$/.z/; G; s/\n/ /' 0.00s user 0.00s system 76% cpu 0.002 total
LC_ALL=C sort -t. -k 1,1 -k 2,2n -k 3,3n -k 4,4n -k 5,5n 0.00s user 0.00s system 70% cpu 0.002 total
awk '{print $2}' > /dev/null 0.00s user 0.00s system 64% cpu 0.006 total
...
❯ head python_ls_remote.txt
cat python_sorted_versions.txt 0.00s user 0.00s system 68% cpu 0.003 total
awk -F'[/v]' '$NF ~ /^[0-9]+.*/ { print $NF }' > /dev/null 0.00s user 0.00s system 74% cpu 0.01
...
# So four lines and two lines, respectively, need to have their 2nd-to-last column summed (total time AKA wall time)
❯ awk '{ s += $(NF-1) } NR % 4 == 0 { print s; s = 0 }' python_sort.txt >> python_total_sort.txt
❯ awk '{ s += $(NF-1) } NR % 2 == 0 { print s; s = 0 }' python_ls_remote.txt >> python_total_ls_remote.txt
# Finally, average them both and print
❯ awk '{ s += $(NF-1) } END { print s / NR }' python_total_sort.txt
0.01012
❯ awk '{ s += $(NF-1) } END { print s / NR }' python_total_ls_remote.txt
0.00618
The latter is ~40% faster. While these times are small, this kind of heavily piped code exists all over asdf, and they add up. Additionally, this is on a fairly zippy machine. On a much slower Debian server with spinning disks, I get this:
❯ awk '{ s += $(NF-1) } END { print s / NR }' python_total_sort.txt
0.01842
❯ awk '{ s += $(NF-1) } END { print s / NR }' python_total_ls_remote.txt
0.00851
That is over twice as fast as before, with the first taking nearly 20 milliseconds.
As far as the actual output as it relates to this question, here is a sample showing that it correctly sorts and differentiates between alpha, beta, release candidate, and final versions:
❯ awk -F'[/v]' '$NF ~ /^[0-9]+.*/ { print $NF }' <<< $(cat python_sorted_versions.txt) | grep -A14 3.9.16
3.9.16
3.10.0a1
3.10.0a2
3.10.0a3
3.10.0a4
3.10.0a5
3.10.0a6
3.10.0a7
3.10.0b1
3.10.0b2
3.10.0b3
3.10.0b4
3.10.0rc1
3.10.0rc2
3.10.0
For the list-all
command we intentionally left version ordering in the output up to each plugin's bin/list-all
callback. No sorting is done in asdf core because not all versions can be sorted by semantic version. For example, Erlang versions (which changed versioning schemes a while back), and versions of Java (which mix version and JVM type).
I'm not certain what we should do in this case with the list-all
command. One solution would be to just sort them in asdf core as best we can (certainly no worse than what we have now). Or we could add a version sorting callback to plugins, and then pass the list of installed versions for the plugin to sort and return. Obviously this would be more hassle than what we have now, and perhaps not worth the hassle.
To recap:
asdf list
doesn't currently do any sorting, ordering depends on Bash globbingasdf list all <tool>
is deferred to <tool>/bin/list-all
commandsort_versions
as suggested in this Issue by me is currently only used in the asdf update
command. We probably don't want to increase it's usage.After mulling this over for a few days I was going to suggest a optional (mandatory? 🤔) plugin bin/sort_versions
.
@stephanGarland asdf list
doesn't currently hit the network, so we can't really use the ls-remote
built-in --sort
for this specific issue. The current asdf-plugin-template
repo relies on some sort-versions
function, which we could replace as suggested by @stephanGarland (I am looking into that PR!).
I see these as our options:
asdf list
unsorted and close this issue as a no-fix
👎 asdf list
through a local sorting method (sort_versions
) 🤷 bin/sort_versions
and do nothing as current. Then update asdf-plugin-template
to start adoption (with default implementation as ls-remote --sort suggested by @stephanGarland) 👍 👍 👍 My preference would be 3.
Ah, gotcha. I'm still trying to work through the various parts of asdf and see where various functions are being called / used. I'm fine with any of options 2-4.
As an aside, lest it be buried in an old issue, I made this comment, with a proposed patch (no PR). It speeds up asdf list
and asdf current
by about 30% and 20%, respectively. It also causes a test to fail (test/where_command.bats
), but I think it's a false positive based on other manual testing.
As an aside, lest it be buried in an old issue, I made this comment, with a proposed patch (no PR). It speeds up
asdf list
andasdf current
by about 30% and 20%, respectively. It also causes a test to fail (test/where_command.bats
), but I think it's a false positive based on other manual testing.
I did see that post. Haven't run the patch myself yet. We're happy to receive patches for our broken tests! :D We're also happy to receive performance improvements, though, as you can see from my comment in that thread (immediately above yours), I do want to get performance testing into our CI pipeline to start capturing, tracking and publishing our perf data so it can be replicated and improved. So, happy to hear your thoughts on that in the other thread. Thanks for helping out!
From your list @jthegedus , my preference would be 3. If we add support for that to asdf core plugins that need it could implement and start using it immediately after the next release of asdf.
Great! A consensus on 3.
add opt-in plugin callback bin/sort_versions and do nothing as current. Then update asdf-plugin-template to start adoption (with default implementation as ls-remote --sort suggested by @stephanGarland) 👍 👍 👍
I will introduce this after I have completed #1445 plugin docs refactoring.
Is your feature request related to a problem? Please describe
asdf list
,asdf list python
, etc output version numbers that are lexicographically sorted, e.g.:But it'd be nicer to see them version-sorted (example below)
Describe the proposed solution
Would like to see this:
Describe similar
asdf
features and why they are not sufficientn/a
Describe other workarounds you've considered
piping to
sort -V
kinda works, but the*
messes it up