Closed: tlitetrasci closed this issue 3 years ago
Looks like we're missing some validation checks here, thanks for reporting this. I expect we can roll this into a bugfix release.
Thanks for confirming that this is unexpected behavior. I'm looking forward to seeing the bugfix.
Hey @systemcatch , @jadchaar ,
I would like to help out with this one. Do you have any details on how you imagine the validation check?
You can probably carry out the check similarly to how we validate errors in the parse() function in parser.py:
You can check the parsed values and raise an appropriate ParserError. If you think there is a more appropriate place to put the validations, feel free to make that choice :)!
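A minimal sketch of what such a check on the parsed values might look like, assuming the parser collects its results in a dict. The `validate_parts` helper, the `"hour"` and `"am_pm"` keys, and the stand-in `ParserError` class are all hypothetical and only there to make the example run on its own; they are not taken from parser.py.

```python
class ParserError(ValueError):
    """Stand-in error type so this sketch runs without the library."""


def validate_parts(parts):
    """Raise ParserError if the parsed hour conflicts with an AM marker.

    `parts` is a hypothetical dict of parsed values,
    e.g. {"hour": 14, "am_pm": "am"}.
    """
    hour = parts.get("hour")
    am_pm = parts.get("am_pm")
    # An hour above 12 can only be an afternoon time, so it contradicts "am".
    if hour is not None and am_pm == "am" and hour > 12:
        raise ParserError(f"hour {hour} conflicts with the AM marker")
    return parts
```

Calling `validate_parts({"hour": 14, "am_pm": "am"})` raises the error, while `{"hour": 9, "am_pm": "am"}` is returned unchanged.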
Hey again, could you check my attempt for the first issue (if token == "hh" then hour must be between 00 and 12):
https://github.com/ALee008/arrow/blob/3c09bb87f253271900906e75d8927a3c8620ce16/arrow/parser.py#L333
For the second issue I am not sure about the location. I have also added a question: https://github.com/ALee008/arrow/blob/3c09bb87f253271900906e75d8927a3c8620ce16/arrow/parser.py#L576
I am not even sure if this attempt matches your expectations at all :-).
Hi, has there been any update on this issue? I see @ALee008 has made some changes in a fork.
Hi @ALee008 and @tlitetrasci. Apologies on behalf of the entire Arrow team for not getting back to you on this sooner (some professional and personal events have popped up for all of us). I'll be taking a look at @ALee008's fork and reviewing it over the next 48 hours.
No worries, thanks for the update!
@anishnya No worries, I have been busy myself recently. I'll be awaiting your update.
Hey @ALee008. I've taken a closer look at the issues you've outlined. The token check appears to be correct. For the location, I think we'd prefer it to be in parse_token rather than in parse. Even though parse_token is only called by parse, I think it makes more sense to have the validation functionality within parse_token. In terms of the error message generated, I wouldn't worry about including the fmt or datetime_string strings. A simple message such as "hour token value must be between 0 and 12 inclusive for this (insert token name here)" should be good. Let me know if you need any further guidance and thank you for your patience and contribution to Arrow :).
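Roughly, the token-level check could be sketched like this. It is only an illustration of the idea, not the actual parser code: `parse_hour_token`, `TWELVE_HOUR_TOKENS`, and the stand-in `ParserError` class are hypothetical names, and the sketch assumes "h" should be treated the same way as "hh".

```python
class ParserError(ValueError):
    """Stand-in error type so this sketch runs without the library."""


# Assumption: both 12-hour tokens get the same range check.
TWELVE_HOUR_TOKENS = {"hh", "h"}


def parse_hour_token(token, value):
    """Convert the matched text for an hour token to int.

    For 12-hour tokens the value must lie between 0 and 12 inclusive.
    """
    hour = int(value)
    if token in TWELVE_HOUR_TOKENS and not 0 <= hour <= 12:
        raise ParserError(
            f"Hour token value must be between 0 and 12 inclusive for token {token!r}."
        )
    return hour
```

With this, `parse_hour_token("hh", "14")` raises, while `parse_hour_token("HH", "14")` still returns 14 because the 24-hour token is not range-checked here.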
@ALee008, that looks good to me. Once you get the tests added, feel free to make the PR. Let us know if you need any further guidance, or have any additional questions.
Hey @anishnya,
I removed the first ParserError because it would raise a ParserError if hour is not between 0 and 12, which is not desired.
I also added a test called test_parse_am in test_parser.py.
I pushed the latest code and added my adjustments. When running tox I got the following error:
Any ideas?
That does seem odd, especially since you aren't modifying the shift method. @jadchaar, I noticed the test is skipped if the dateutil version is below 2.7.1. That should be the only reason this fails, right, especially if shift isn't being modified? @ALee008, do you mind checking what version of dateutil you have installed?
Hey @anishnya, I've got Version 2.8.1 installed
@ALee008, you're working out of the master branch of your forked repo, correct? I'll pull it down and see what's happening.
Hey @anishnya, yes, that's what I did first (following the explanation in "How to update a forked Git repo", because I haven't done it before).
I pulled down the branch but can't reproduce that error. Can you update your branch to match arrow master to see if that helps?
Actually @ALee008 how about you submit a PR and we can see if it fails on our CI infrastructure.
Hey @systemcatch, I updated my fork and added my changes. Test is still failing. My origin should be up-to-date now. As recommended I will submit a PR.
Issue Description
Arrow does not seem to perform validation on timestamps for unusual formats where the information conflicts.
The following code snippet runs just fine:
First of all, since hh is documented to go to a maximum value of 12, I expect arrow to raise an error because the value is 14. Secondly, I expect arrow to notice that 14:00:00 (2 PM) conflicts with the string "AM".
Is this behaviour intentional? Or is it a bug?
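For comparison, the standard library's datetime.strptime rejects this kind of input, because its 12-hour directive %I only accepts 01 through 12. The sketch below uses an illustrative input string, not the snippet from this report, and the `parses_ok` helper is a made-up name for the example.

```python
from datetime import datetime


def parses_ok(value, fmt):
    """Return True if datetime.strptime accepts the value, False on ValueError."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


# %I is the 12-hour clock directive, so an hour of 14 does not match it.
print(parses_ok("14:00:00 AM", "%I:%M:%S %p"))
# The same hour is accepted with the 24-hour directive %H.
print(parses_ok("14:00:00", "%H:%M:%S"))
```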
System Info