Closed bjeanes closed 1 year ago
Hi @bjeanes and, first of all, thanks for raising an issue!
This behaviour comes from the default behaviour of str::split
:
fn main() {
let s = "1-2-3---4";
for ss in s.split('-') {
println!("Split: {}", ss);
}
}
that leads to
Split: 1
Split: 2
Split: 3
Split:
Split:
Split: 4
To prevent this from happening, additional work must be done on top of that to ensure there are no elements which are, again, the same as the separator as in this case. As this is the default behaviour for a string split in Rust currently, I think that adding such checks to prevent this from happening to StringOrVecToVec
would cause a misunderstanding in other cases as this might not be expected, perhaps. However, I also see a point in actually doing so. I think we could add another Pattern
variant which could also support removing all the separators from appearing when (before) the converter function is invoked so that it would pass in your case. Would this work for you?
I think we could add another
Pattern
variant which could also support removing all the separators from appearing when (before) the converter function is invoked so that it would pass in your case. Would this work for you?
I think there are a few ways to do it: splitting on a regex, pattern to pre-process the string before splitting, or (prob my favourite) accept a fn to use in filter_map
with a default of |x| Some(x)
.
In the end, I worked around this by splitting as string then post-processing myself: https://github.com/bjeanes/modbus-mqtt/blob/83482329b55b3e8d03575a23f62e87c4e47cc2cc/sungrow-winets/src/lib.rs#L520-L549. Fortunately, I could take advantage of the fact that I know there will always be an even number of items and I need to consume them in pairs anyway, so chunks_exact
naturally drops the extra item.
But I do think this is worth resolving for future users, it's just not as urgent.
I do think the filter_map
route is pretty elegant and versatile.
I just realised using filter_map
instead of map in https://github.com/vityafx/serde-aux/blob/f064e78b81fac9e6abdc331604d23a81a8f7c901/src/field_attributes.rs#L1094 would avoid an extra field/parameter. But it would be a breaking change unless some fancy generics jumping could be done to accept either and do the right thing based on whether the fn returns Result<T, E>
or Option<Result<T, E>>
.
I've pushed some changes to address this issue in a simple and backwards-compatible manner: https://github.com/vityafx/serde-aux/commit/33ddb1e747800b749ee51e75fbe1e3eada90ba75
Hi @vityafx that definitely worked, though I wouldn't call it backwards compatible, as you've addded an argument to new()
, which is a public fn:
It seems you've already released it, but it would make sense to have new()
still only take 2 arguments, defaulting the skip_empty
to false
and requiring users like myself to call new(..).skip_empty()
.
Either way, thank you for the fix 🙏🏻
Hi @vityafx that definitely worked, though I wouldn't call it backwards compatible, as you've addded an argument to
new()
, which is a public fn:It seems you've already released it, but it would make sense to have
new()
still only take 2 arguments, defaulting theskip_empty
tofalse
and requiring users like myself to callnew(..).skip_empty()
.Either way, thank you for the fix 🙏🏻
Yes, that's my fault, I forgot about the change in "new". I have already bumped the major version as well, so yes, it is not backwards compatible indeed. I will think about the interface as you suggested, thanks!
@vityafx otherwise, if you're going to make a major version jump, consider just changing the filter
to return Option<Result<T, E>>
, though I do think there is probably a way to do that in a backwards compatible way. I might have a go at prototyping it actually... 🤔
I have to parse a string of hex-encoded, space-separated bytes in a string but the source API includes a trailing space:
"a1 b2 c3 d4 "
.If using a parser to turn the items into something other than a string type, such as an int, the parsing will fail due to the empty item.
Here is a test which demonstrates the problem:
Is there a way to resolve this or is this a bug with trailing separators?