Open coriolinus opened 3 years ago
This is by design.
Since the format is specified as #[display("{instruction} {value}")]
, ContainsInstruction::from_str()
works as follows.
Instruction::from_str()
.u64::from_str()
.Because of this behavior, it will not work properly if the string of fields contains characters that separate the fields.
If there is only one field that may contain a space, it is possible to separate them by the last space by adding #[from_str(regex = ".*")]
as follows.
#[derive(parse_display::FromStr)]
#[display("{instruction} {value}")]
struct ContainsInstruction {
#[from_str(regex = ".*")]
instruction: Instruction,
value: u64,
}
For more complex cases, I think it is better to use a function that converts the beginning of the string to a value, such as a funtion that return IResult in nom crate, instead of a function that converts the entire string to a value, such as FromStr::from_str()
.
Instead of specifying a field that may contain spaces, you can also specify a field that is guaranteed not to contain spaces.
#[derive(parse_display::FromStr)]
#[display("{instruction} {value}")]
struct ContainsInstruction {
instruction: Instruction,
#[from_str(regex = r"[^\s]+")]
value: u64,
}
Thank you for the quick response. I'm not familiar with the internals of this library: would it be practical to re-implement ContainsInstruction::parse
with the following algorithm?
Instruction::parse
"turn off"
or the literal toggle
, returning the rest of the string via an internal APIu64::parse
In other words, construct a and use internally a parser which converts the beginning of a string and returns the unused parts, but present the whole-string parsing interface to the end user.
Obviously that's a pretty big feature request, but it would solve this issue. AFAICT it should be possible for regex-based sub-parsers to work properly under this design as well.
In the meantime, would you be willing to add documentation that this is a known limitation of the library?
Certainly, I think the above algorithm would be able to parse the string more accurately.
If I implement it, I would like to use a standard trait rather than the internal API to be able to use this algorithm even if struct contains fields that do not use parse-display
. (e.g. std::u8
)
a parser which converts the beginning of a string and returns the unused parts
Do you have any idea of existing standard trait that can be used for such purposes?
Adding the following features is also an option.
#[from_str(regex = auto)]
to instruction
field will have the same effect as if you wrote #[from_str(regex = "turn off|toggle")]
.Once Rust's specialization feature is stabilized, it may be able to omit #[from_str(regex = auto)]
.
This method has the following pros and cons.
Pros
Cons
Do you have any idea of existing standard trait that can be used for such purposes?
I don't know of any, unfortunately. AFAIK the general form would be something like
// returns `(parse result, unused portion of input)`
fn parse<'a, T: FromStr>(s: &'a str) -> (Result<T, <T as FromStr>::Err>, &'a str);
[edit] Note that in this function, the FromStr
implementation has to greedily capture as many characters as it needs from the start of the string. Very likely, a different trait would suit better there.
Adding
#[from_str(regex = auto)]
to instruction field
This is an interesting idea! Does this feature currently exist? I didn't see it documented.
It seems to me that it would be even more useful if this annotation could be applied to enum Instruction
instead of the field instruction
, so that we can decorate the declaration, not every usage. This feature, assuming it causes the test cases above to pass, plus appropriate mentions in the documentation, would be sufficient in my mind to close out this issue.
Adding #[from_str(regex = auto)] to instruction field
This is an interesting idea! Does this feature currently exist? I didn't see it documented.
This feature is not yet implemented.
It seems to me that it would be even more useful if this annotation could be applied to enum Instruction instead of the field instruction, so that we can decorate the declaration, not every usage.
I think so too, but I can't think of a way to allow omit #[from_str(regex = auto)]
.
This is the same for the algorithm described in https://github.com/frozenlib/parse-display/issues/9#issuecomment-740795611.
Since proc-macro cannot determine whether the field type implements the trait or not, you need to annotate whether use FromStr
or the different trait described in https://github.com/frozenlib/parse-display/issues/9#issuecomment-741748867.