finalfusion / finalfrontier

Context-sensitive word embeddings with subwords. In Rust.
https://finalfusion.github.io/finalfrontier

Dealing with different set of command-line options #173

Closed danieldk closed 2 years ago

danieldk commented 3 years ago

I have implemented support for training floret embeddings, but the command line gets a bit unwieldy. Floret is quite a bit different from what we have so far.

I see two ways forward:

  1. We add the necessary options and validations to ensure that no incompatible set of options is used.
  2. We add another level of subcommands, with only the relevant set of options, e.g.: finalfrontier skipgram floret, finalfrontier skipgram fasttext, finalfrontier skipgram buckets, finalfrontier skipgram explicit and the same for deps.

For (2), I am not sure if this is the best partitioning.
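To get a feeling for how large the partitioning in (2) becomes, here is a minimal Rust sketch that enumerates every nested subcommand path. It only uses the subcommand names mentioned above; the `Model` and `Vocab` enums and the `command_paths` function are hypothetical names made up for illustration, not types from finalfrontier.

```rust
// Hypothetical model of option (2): one subcommand level for the model
// type and a second level for the vocabulary type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Model {
    SkipGram,
    Deps,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Vocab {
    Floret,
    FastText,
    Buckets,
    Explicit,
}

impl Model {
    const ALL: [Model; 2] = [Model::SkipGram, Model::Deps];

    // Subcommand name as written in the proposal.
    fn name(self) -> &'static str {
        match self {
            Model::SkipGram => "skipgram",
            Model::Deps => "deps",
        }
    }
}

impl Vocab {
    const ALL: [Vocab; 4] = [
        Vocab::Floret,
        Vocab::FastText,
        Vocab::Buckets,
        Vocab::Explicit,
    ];

    // Subcommand name as written in the proposal.
    fn name(self) -> &'static str {
        match self {
            Vocab::Floret => "floret",
            Vocab::FastText => "fasttext",
            Vocab::Buckets => "buckets",
            Vocab::Explicit => "explicit",
        }
    }
}

// Every model/vocab combination becomes its own nested subcommand path.
fn command_paths() -> Vec<String> {
    let mut paths = Vec::new();
    for model in Model::ALL {
        for vocab in Vocab::ALL {
            paths.push(format!("finalfrontier {} {}", model.name(), vocab.name()));
        }
    }
    paths
}
```

With two model types and four vocabulary types this yields eight paths, each of which would get its own `--help` output containing only the relevant options.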

danieldk commented 3 years ago

I will make a proof of concept of (2) without the floret support to get a feeling for how well that works. I will probably also use structopt; I think it might simplify things.

sebpuetz commented 2 years ago

Phew, I feel like the nesting on (2) could become a bit unwieldy. On the other hand (1) is even worse from a UX perspective since the available information through --help is much much noisier.

So out of these two, I'd probably prefer (2). I'm not too keen on either solution, but I can't come up with anything other than perhaps moving back to separate binaries - which would make discoverability even worse. So (2) is probably the best choice?

danieldk commented 2 years ago

> Phew, I feel like the nesting on (2) could become a bit unwieldy. On the other hand (1) is even worse from a UX perspective since the available information through --help is much much noisier.

I have a somewhat working prototype. The syntax with multiple subcommands is somewhat interesting, e.g. you'd get something like:

finalfrontier --dims 200 corpus.txt embeddings.fifu deps --use-root explicit --minn 4

So, the structure is something like:

finalfrontier [common options] [model type options] [vocab type options]
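The layering can be illustrated with a small std-only Rust sketch that splits an argument list at the subcommand names. This is not how the prototype works (that relies on clap); `Layers`, `split_layers`, and the two name tables are hypothetical and only use the subcommand names mentioned in this thread.

```rust
/// Arguments split into the three layers of the command line.
#[derive(Debug, PartialEq)]
struct Layers {
    /// Common options and positional arguments before any subcommand.
    common: Vec<String>,
    /// Model type subcommand and its options, if present.
    model: Option<(String, Vec<String>)>,
    /// Vocab type subcommand and its options, if present.
    vocab: Option<(String, Vec<String>)>,
}

// Assumed subcommand names, taken from the discussion above.
const MODEL_TYPES: [&str; 2] = ["skipgram", "deps"];
const VOCAB_TYPES: [&str; 4] = ["floret", "fasttext", "buckets", "explicit"];

/// Assign each argument to the innermost layer opened so far.
/// `args` excludes the program name.
fn split_layers(args: &[&str]) -> Layers {
    let mut layers = Layers {
        common: Vec::new(),
        model: None,
        vocab: None,
    };

    for &arg in args {
        if layers.model.is_some() && layers.vocab.is_none() && VOCAB_TYPES.contains(&arg) {
            // A vocab subcommand is only recognized after a model subcommand.
            layers.vocab = Some((arg.to_string(), Vec::new()));
        } else if layers.model.is_none() && MODEL_TYPES.contains(&arg) {
            layers.model = Some((arg.to_string(), Vec::new()));
        } else if let Some((_, opts)) = layers.vocab.as_mut() {
            opts.push(arg.to_string());
        } else if let Some((_, opts)) = layers.model.as_mut() {
            opts.push(arg.to_string());
        } else {
            layers.common.push(arg.to_string());
        }
    }

    layers
}
```

For the example invocation, `--dims 200 corpus.txt embeddings.fifu` ends up in the common layer, `--use-root` under `deps`, and `--minn 4` under `explicit`. A naive splitter like this would misclassify an option value that happens to equal a subcommand name, which is one reason to leave the real parsing to an argument-parsing library.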

It is not very unixy, but the really nice thing is that the help gets much less crowded. E.g.

finalfrontier --help

gives help for the common options, whereas

finalfrontier deps --help

gives the options for dependency embeddings, etc. I guess the primary thing I really dislike is that the corpus and embedding filenames cannot be at the end. Well, they can, but that really complicates the data structures, whereas with this setup we can nicely generate all the options from clap 3's structopt-like support.

danieldk commented 2 years ago

Fixed by #174.