tldr-pages / tldr

📚 Collaborative cheatsheets for console commands
https://tldr.sh

Automated validation of command options per platform #1953

Open gingerbeardman opened 6 years ago

gingerbeardman commented 6 years ago

I'm using tldr less and less because I'm on macOS, and most times I use it I find out the hard way that the arguments are different on macOS (and, I imagine, on the other supported platforms).

In the past I've filed an update, but that was a very painful process: https://github.com/tldr-pages/tldr/issues/1764

I just hit it again for find

So I'm recommending that you do an (automated?) audit of the common pages against all the supported platforms. Without this sort of audit, there is too much friction and misinformation in using tldr.

agnivade commented 6 years ago

Thanks for raising the issue. It has been discussed previously. Please look at my comment here: https://github.com/tldr-pages/tldr/issues/1423#issuecomment-312543639.

It would be ideal to have it working for all commands. But I think that's just practically impossible. And even getting it to work for a handful of commands is going to be more work than just manually verifying all the commands, IMO.

If someone is willing to take it up, that would be most welcome though. :smile:

gingerbeardman commented 6 years ago

I was thinking more of simple validation against the platform man pages. That is after all how I check and confirm the correct options when I encounter this issue.

Anyway, regardless of how they are verified, it remains that they do need to be.

agnivade commented 6 years ago

Is there an API or something through which man page options can be retrieved in a consumable format?

gingerbeardman commented 6 years ago

Well you can do it easily on the command line?

man find | egrep -o '^\s{5}-.+?\s' | grep -v ',' | sort | uniq

this took me a couple of minutes to arrive at.

man to display the information
egrep to show matches only from lines that start with an argument/switch/option
grep to remove some lists of options, aka bad matches
sort to sort alphabetically
uniq to remove duplicates

results:

     -Bmin 
     -Bnewer 
     -Btime 
     -E 
     -H 
     -L 
     -P 
     -X 
     -acl 
     -amin 
     -anewer 
     -atime 
     -cmin 
     -cnewer 
     -ctime 
     -d 
     -depth 
     -empty 
     -exec 
     -execdir 
     -f 
     -false 
     -flags 
     -fstype 
     -gid 
     -group 
     -ilname 
     -iname 
     -inum 
     -ipath 
     -iregex 
     -iwholename 
     -links 
     -lname 
     -ls 
     -maxdepth 
     -mindepth 
     -mmin 
     -mnewer 
     -mount 
     -mtime 
     -name 
     -newer 
     -newerXY 
     -not 
     -ok 
     -okdir 
     -path 
     -perm 
     -print 
     -prune 
     -regex 
     -s 
     -samefile 
     -size 
     -true 
     -type 
     -uid 
     -user 
     -wholename 
     -x 
     -xattr 
     -xattrname

gingerbeardman commented 6 years ago

Another thought would be to run each command with --help, -H, -h, --? or similar.

find -H

results:

usage: find [-H | -L | -P] [-EXdsx] [-f path] path ... [expression]
       find [-H | -L | -P] [-EXdsx] -f path [path ...] [expression]
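
The usage text above packs several single-letter flags into one bracket group (`[-EXdsx]`), so a checker would have to expand such groups before comparing anything. A minimal sketch, assuming every single-dash token in a synopsis is a cluster of one-letter flags (true for the find usage shown here, not in general); `expand_usage_flags` is a hypothetical helper name, not part of tldr:

```shell
# Hypothetical helper: read a usage synopsis on stdin and print each
# single-letter flag on its own line.
# Assumption: a token such as -EXdsx is a cluster of one-letter flags,
# which would be wrong for long single-dash options like -name.
expand_usage_flags() {
  awk '{
    gsub(/\[|\]|\|/, " ")               # treat brackets and pipes as separators
    for (i = 1; i <= NF; i++)
      if ($i ~ /^-[A-Za-z0-9]+$/)       # a dash followed only by flag letters
        for (j = 2; j <= length($i); j++)
          print "-" substr($i, j, 1)    # emit one flag per letter
  }' | LC_ALL=C sort -u
}
```

Fed the two usage lines above, this yields nine flags (-E, -H, -L, -P, -X, -d, -f, -s, -x), which is only what the synopsis itself mentions.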

agnivade commented 6 years ago

A bit hackish for my taste. But certainly better than nothing.

Anyone welcome to take a stab at this.

waldyrious commented 6 years ago

Using --help will probably produce lots of false negatives since that option usually presents only a summary of the available options. Parsing the manpage is more likely to result in a complete list. I did some experiments locally, and found the following combination to produce good results:

man find | grep -Poi ' -[a-z-]+' | tr -d ' ' | sort | uniq

As @agnivade says, it would be nicer if we could leverage some cleaner (machine-readable) input, rather than text-processing free-form content, but if it works well, the benefits would be definitely worth it.

Right now we have a single test configuration, but it's certainly possible to set up the Travis build to run on multiple OSes and test the command options depending on the folder of the command in question.
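
As a rough illustration of what such a per-platform check could look like, here is a minimal sketch. It assumes that example commands in a tldr page are the lines starting with a backtick, and that a list of valid options for the platform (one per line) has already been extracted by some other step. The helper names `page_options` and `unknown_options` are hypothetical, and options written as `--name=value` are not handled:

```shell
# Hypothetical helpers; the file names passed in are up to the caller.

# Print the options used in a tldr page's example lines (read from stdin).
# Assumption: example commands are the lines that start with a backtick.
page_options() {
  awk '/^`/ {
    gsub(/`/, " ")                                # drop the wrapping backticks
    for (i = 1; i <= NF; i++)
      if ($i ~ /^--?[A-Za-z][A-Za-z0-9_-]*$/)     # -x, -name, --long-option
        print $i
  }' | LC_ALL=C sort -u
}

# Print options that appear in the page ($1) but not in the list of known
# options ($2, one per line, e.g. extracted from that platform's man page).
unknown_options() {
  page_options < "$1" | grep -Fxv -f "$2"
}
```

A CI job on each OS could run `unknown_options` for every page in the matching platform folder and flag the page whenever it prints anything.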

gingerbeardman commented 6 years ago

case in point, your modified command does not work on macOS because grep is different (I used egrep so it would work for everybody).

let me think on it some more, as well as your query in my old PR/issue.

waldyrious commented 6 years ago

Good point. I swapped to grep because your command didn't work at first on my machine! 😆

How about man find | egrep -oi ' -[a-z-]+' | tr -d ' ' | sort | uniq? (Confirmed working on my Linux machine.)

gingerbeardman commented 6 years ago

it was probably the {5} leading spaces (specific to find and macOS, no doubt)

man find | egrep -oi ' -[a-z-]+' | tr -d ' ' | sort | uniq

gives me this incomplete list

--
-L
-depth
-exec
-mindepth
-name
-newer
-newerct
-o
-or
-print
-prune
-s
-type
-user

this modified version captures them all plus a couple of false positives

man find | egrep -oi " -[^,.'|.]+?\s" | tr -d ' ' | sort | uniq

results

-
-B*
-Bmin
-Bnewer
-Btime
-E
-H
-L
-P
-P]
-X
-acl
-amin
-and
-anewer
-atime
-cmin
-cnewer
-ctime
-d
-delete
-depth
-empty
-exec
-execdir
-f
-false
-flags
-follow
-fstype
-gid
-group
-ilname
-iname
-inum
-ipath
-iregex
-iwholename
-links
-lname
-ls
-maxdepth
-mindepth
-mmin
-mnewer
-mount
-mtime
-name
-newer
-newerXY
-newermm
-not
-ok
-okdir
-or
-path
-perm
-print
-print0
-prune
-regex
-s
-samefile
-size
-true
-type
-uid
-user
-wholename
-x
-xattr
-xattrname
-xdev
--
-L
-depth
-exec
-mindepth
-name
-newer
-newerct
-or
-prune
-type
-user

gingerbeardman commented 6 years ago

man find | egrep -oi " {2,}-[^,'|.]+?\s" | tr -d ' ' | sort | uniq

this grabs all bar two types on macOS:

  1. argument on its own directly followed by a new line
  2. argument with underscores in name

sadly i've no more time right now to improve it

-Bmin
-Bnewer
-Btime
-E
-H
-L
-P
-X
-acl
-amin
-anewer
-atime
-cmin
-cnewer
-ctime
-d
-depth
-empty
-exec
-execdir
-f
-false
-flags
-fstype
-gid
-group
-ilname
-iname
-inum
-ipath
-iregex
-iwholename
-links
-lname
-ls
-maxdepth
-mindepth
-mmin
-mnewer
-mount
-mtime
-name
-newer
-newerXY
-not
-ok
-okdir
-path
-perm
-print
-prune
-regex
-s
-samefile
-size
-true
-type
-uid
-user
-wholename
-x
-xattr
-xattrname
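
One possible way to cover the two missed cases (an option alone on its line, and underscores in the name) is to avoid `\s` and `+?`, which POSIX extended regular expressions do not define and which may therefore behave differently between grep implementations. A minimal sketch using sed with POSIX character classes; `extract_leading_options` is a hypothetical helper name, and it only looks at options that begin a line after indentation:

```shell
# Hypothetical helper: read rendered man-page text on stdin and print the
# options that start a line (after indentation), one per line, deduplicated.
extract_leading_options() {
  # Keep only lines whose first non-blank text is an option, and print just
  # the option name (letters, digits, underscores, hyphens). Nothing has to
  # follow the name, so an option alone on its line is captured too.
  sed -n 's/^[[:space:]]\{1,\}\(-[[:alnum:]_-]\{1,\}\).*$/\1/p' |
    LC_ALL=C sort -u
}

# Possible usage (col -b is added in case the rendered page contains
# backspace overstrikes):
#   man find | col -b | extract_leading_options
```

Because the indentation width is not fixed, this does not depend on the five leading spaces of the macOS find page, at the cost of also matching any wrapped prose line that happens to start with a dash.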

waldyrious commented 6 years ago

> it was probably the {5} leading spaces (specific to find and macOS, no doubt)

Ah, that makes sense. Using \s+ rather than \s{5} (in the first command line you proposed) now does produce some output, but it also captures content that comes after the actual option names. (This is irrelevant since we've evolved the command since then; I'm just mentioning it for the record.)

> sadly i've no more time right now to improve it

That's ok (this is all volunteer work after all). We already made some progress here, and anyone who decides to tackle this later on has a head start to base their work on. Cheers! 👍

sbrl commented 6 years ago

Sounds like everyone's having a bunch of issues with cross-compatibility. Here's my solution that uses awk for all the text processing, such that it (should) work on all systems that have awk:

man $command_name | awk 'BEGIN { RS=" " } /^\[?-{1,2}/ { gsub(/([\)\[\]]|\s+|[\.,;'"'"']\s*$)/, "", $0); if(length($0) > 0) { print $0; } }' | sort | uniq

I could have implemented the sort | uniq bit in awk too, but I was lazy :P
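
For reference, the uniq part can be folded into awk with an associative array. The following is a minimal sketch of that idea, not the command above: it is a simplified variant that uses explicit character ranges instead of `\s` and interval expressions (which not every awk supports), and `list_option_tokens` is a hypothetical name. Output is in order of first appearance, so a separate sort is still needed for an alphabetical list:

```shell
# Hypothetical variant: print each option-like token once, in order of first
# appearance, using only awk (input is rendered man-page text on stdin).
list_option_tokens() {
  awk '{
    for (i = 1; i <= NF; i++) {
      tok = $i
      sub(/^\[+/, "", tok)                        # drop a leading "[" as in [-H
      if (match(tok, /^--?[A-Za-z0-9][A-Za-z0-9_-]*/)) {
        opt = substr(tok, RSTART, RLENGTH)        # cut trailing punctuation
        if (!seen[opt]++) print opt               # first occurrence only
      }
    }
  }'
}

# Possible usage:
#   man "$command_name" | col -b | list_option_tokens | sort
```

Like the other text-scraping approaches in this thread, it will still pick up false positives such as dash-prefixed words in prose.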