Closed nimishmistry closed 6 years ago
Hi, thanks for reporting. This is not intentional. We try to be lenient about whitespace and about combining multiple value/abbreviation combinations with different delimiters, but this is way beyond those use cases :-) I'm sure it's a matter of tweaking the regex a little bit. Would you care to try to fix it? A pull request would be most welcome.
https://github.com/angularsen/UnitsNet/blob/master/UnitsNet/CustomCode/QuantityParser.cs https://github.com/angularsen/UnitsNet/blob/master/UnitsNet.Tests/CustomCode/ParseTests.cs
Thanks for the offer, will definitely attempt to fix this.
Awesome!
I took a look at this and found the problem is where the QuantityParser is trying to capture trailing, invalid data. It is only checking for alpha characters and, since comma is not an alpha character, it does not capture it. https://github.com/angularsen/UnitsNet/blob/45c747db693d1f2e8a38651b7a37f695d7450d95/UnitsNet/CustomCode/QuantityParser.cs#L71
Trying to make this capture more characters (like comma) could be problematic with the current design, where the string can have multiple sets of valid units (such as "1m and 200mm"). I think it would simplify things greatly, and be more in keeping with the framework parse methods, if we did not allow multiple sets of units. For example, double.TryParse("1+2") returns false, and double.Parse("1+2") throws an exception.
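To make the framework comparison concrete, here is a minimal sketch using only the standard .NET double.TryParse and double.Parse methods. The class and method names (FrameworkParseDemo, TryParseDouble, ParseThrows) are made up for illustration and are not part of UnitsNet.

```csharp
using System;
using System.Globalization;

public static class FrameworkParseDemo
{
    // Succeeds only when the whole string is a single number;
    // an expression such as "1+2" is rejected rather than evaluated.
    public static bool TryParseDouble(string text, out double value) =>
        double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);

    // Returns true when double.Parse rejects the input with a FormatException.
    public static bool ParseThrows(string text)
    {
        try
        {
            double.Parse(text, CultureInfo.InvariantCulture);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }
}
```

With this sketch, FrameworkParseDemo.TryParseDouble("1+2", out _) reports failure and FrameworkParseDemo.ParseThrows("1+2") reports that an exception was thrown. That is the behavior the proposal would mirror for strings with more than one value/unit pair.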
Proposal: Fail when parsing a string with multiple sets of units. The expectation would be that consumer code pre-parses such input, since more complex maths than addition could be required and is better handled by the consumer.
Thoughts?
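As a rough illustration of the difference between the lenient behavior and the proposed strict one, the sketch below uses a deliberately simplified, hypothetical pattern. It is not the regex from QuantityParser, and StrictQuantityDemo and its members are invented names. It only shows why an unanchored match accepts "2m,,,1" while a match anchored to the whole input rejects it.

```csharp
using System.Text.RegularExpressions;

public static class StrictQuantityDemo
{
    // Hypothetical, simplified pattern: a number followed by an alphabetic
    // unit abbreviation. This is NOT the pattern used by UnitsNet.
    private const string ValueAndUnit = @"[-+]?[0-9]+(\.[0-9]+)?\s*[a-zA-Z]+";

    // Lenient: succeeds if a value/unit pair appears anywhere in the input,
    // so trailing data such as ",,,1" is silently ignored.
    public static bool LenientIsMatch(string text) =>
        Regex.IsMatch(text, ValueAndUnit);

    // Strict: the whole input must be exactly one value/unit pair,
    // apart from surrounding whitespace.
    public static bool StrictIsMatch(string text) =>
        Regex.IsMatch(text, @"^\s*" + ValueAndUnit + @"\s*$");
}
```

Under these assumptions, LenientIsMatch accepts both "2m,,,1" and "1m and 200mm", while StrictIsMatch rejects both and still accepts "2 m". A caller who wants to add several quantities would split the input and parse each part separately.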
I agree it may be unintuitive that we support parsing strings like 1 kg 500 g. If I recall, the reasoning was originally that since we needed to support parsing 1' 5" for feet/inches, maybe it was better to have consistent parsing logic instead of special cases. In hindsight, I don't know of any other cases besides feet/inches where combining value/unit pairs like that actually makes much sense. If no one can come up with other use cases for this, then yes, I think we should consider removing it in favor of only special-case parsing feet/inches, to spare everyone's sanity. It would be a breaking change and would have to be added to our #180 wishlist for a major version bump, but that is always something we can do. The list is significant already.
Also, I don't understand why we allow "," separators between value/unit pairs:
@"(and)?,?", // allow "and" & "," separators between quantities
Whitespace or adding the word "and" in between, sure, but when is it natural to parse a string like 1',1" or 1kg,500g? It seems contrived to me, but maybe I've just forgotten the reason it was added in the first place. And it should not have been valid to have empty value/unit entries, such as in your example with multiple commas in a row. There are simply a lot of odd things going on here, and I think we are trying to be way too lenient on the input, which is ultimately just going to cause problems when people expect it to handle all sorts of trash input.
Related PRs: https://github.com/angularsen/UnitsNet/pull/64 https://github.com/angularsen/UnitsNet/pull/81 https://github.com/angularsen/UnitsNet/pull/254
I have not yet read through these, so my recollection is still poor on design choices.
@maherkassim was involved in much of this. Do you have any comments or recollection of why we did these things?
@angularsen I believe that the main use case was feet & inches (eg. 1' 5"), but that we decided to keep the regex broad to avoid needing to add other special cases in the future (all of the relevant discussion can be seen on #81).
To clarify, I'm ok with having the regex be more restrictive and handling feet & inches as a special case. The only other special case I'm aware of is stone & pounds, but I'm not sure how often that's used (especially within the context of UnitsNet).
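For what a feet & inches special case might look like, here is a minimal sketch. It is only an assumption about one possible shape, not the approach taken in UnitsNet: FeetInchesDemo, TryParseFeetInches and the pattern are hypothetical, the result is a plain double in feet rather than a Length, and stone & pounds is not covered.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class FeetInchesDemo
{
    // Hypothetical pattern for inputs like 1' 5" or 1'5": a feet value,
    // an apostrophe, an inches value, then a double quote. Nothing else
    // (commas, "and", extra pairs) is accepted.
    private static readonly Regex FeetInches = new Regex(
        @"^\s*(?<feet>[0-9]+(\.[0-9]+)?)\s*'\s*(?<inches>[0-9]+(\.[0-9]+)?)\s*""\s*$",
        RegexOptions.CultureInvariant);

    // Tries to parse a feet/inches string into a total number of feet.
    public static bool TryParseFeetInches(string text, out double totalFeet)
    {
        totalFeet = 0;
        Match match = FeetInches.Match(text ?? string.Empty);
        if (!match.Success) return false;

        double feet = double.Parse(match.Groups["feet"].Value, CultureInfo.InvariantCulture);
        double inches = double.Parse(match.Groups["inches"].Value, CultureInfo.InvariantCulture);
        totalFeet = feet + inches / 12.0; // 12 inches per foot
        return true;
    }
}
```

In this sketch 1' 5" is accepted, while 1',1" and 1kg 500g are rejected, so everything outside the special case would go through a strict single-pair parse.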
Also, maybe @gojanpaolo can shed more light on any related changes in #254?
Thanks for the heads-up about stone and pounds, @maherkassim.
@nimishmistry Anything I can help to move forward with a PR on this?
As noted above, this will be a breaking change and can only be merged as part of v4. It will likely take some time in prerelease to get all the other changes in #180 included as well, but this is a good start at any rate. I just pushed a new v4 branch that we will create pull requests towards for breaking changes like this and those in #180.
This issue will be resolved in the v4 release, see #487.
Length.TryParse("2m,,,1", out length) is parsed and the length is set to "2m". Is this intended? I'm trying to use TryParse for validation.