Keeping track of argument position

nordzilla commented 3 years ago

I asked on reddit about the possibility of enumerating the arguments that are parsed by lexopt.

As @blyxxyz pointed out, this is possible to do with the current implementation (0.1.0), but the approach is not recommended.

@blyxxyz also suggested that a potential option moving forward could be to add a method to the parser that the user could call at any time to retrieve the position of most-recently parsed argument.

Parser::position(&self) -> u64

I think this is great idea as a minimally-invasive, non-breaking change to the existing implementation that also keeps the API very minimalist and correct.

I believe that the position returned for the first-parsed argument would have to be 1, since the zero-index argument as a whole is technically the binary name. If someone were to call position() before calling next(), then 0 is the only logical choice for this scenario.

I would expect the functionality to be something like this:

let mut p = parse("-a -b -c value");
assert_eq!(p.position(), 0);

assert_eq!(p.next()?.unwrap(), Short('a'));
assert_eq!(p.position(), 1);

assert_eq!(p.next()?.unwrap(), Short('b'));
assert_eq!(p.position(), 2);

assert_eq!(p.next()?.unwrap(), Short('c'));
assert_eq!(p.position(), 3);

assert_eq!(p.value()?, "value");
assert_eq!(p.position(), 4);

let mut p = parse("-ab -cvalue");
assert_eq!(p.position(), 0);

assert_eq!(p.next()?.unwrap(), Short('a'));
assert_eq!(p.position(), 1);

assert_eq!(p.next()?.unwrap(), Short('b'));
assert_eq!(p.position(), 1);

assert_eq!(p.next()?.unwrap(), Short('c'));
assert_eq!(p.position(), 2);

assert_eq!(p.value()?, "value");
assert_eq!(p.position(), 2);

blyxxyz commented 3 years ago

Agreed on starting with 1, I came to the same conclusion when writing the gist.

Another issue is the return type.

I went with u64 as something that's definitely large enough.
But on Linux even an i32 is guaranteed to fit.
usize would be consistent with APIs like Iterator::position and enumerate and nth.
- I feel like usize is for indexing arrays and not much else. Do you think it would be common to use position()'s return value for indexing/slicing?
- Overflowing a usize would only be possible for an "artificial" argument list generated on the fly by an iterator, std::env::args_os returns values that already exist in memory.

For now I'm tending toward usize again.

And the behavior on overflow: it could panic or saturate or wrap or only panic once you actually call Parser::position or just do the default that depends on if you're in release mode. But that might be overthinking things.

nordzilla commented 3 years ago

usize would be consistent with APIs like Iterator::position and enumerate and nth.

I think that consistency with how Rust users expect items to be enumerated is compelling and definitely puts usize at the top of my list of choices for the type.

Do you think it would be common to use position()'s return value for indexing/slicing?

I can see legitimate use cases for indexing on these values, such as an error reporter that shows you the original input and highlights the exact argument that caused an issue (much like a compiler). Whether or not indexing would be common is a bit harder to answer.

Overflowing a usize would only be possible for an "artificial" argument list generated on the fly by an iterator, std::env::args_os returns values that already exist in memory.

I don't think that you are overthinking things. The overflow is technically still possible. Some handling of it would be the most correct thing to do.

I think panicking on the next call to position() after overflow would be acceptable here. No real user program is ever going to have this many arguments (famous last words), so gracefully returning a recoverable Error value in this scenario would be designing for a use case that will never happen in practice. But wrapping is the wrong behavior no matter how you look at it.

blyxxyz commented 3 years ago

Thank you for thinking with me. I've implemented this in 9ec959ff6f51f6bb7382be613f9f99867c528838.

I tentatively plan to do a release early next week.

blyxxyz / lexopt

Keeping track of argument position #2