Closed jadchaar closed 5 years ago
I still get warnings when I pass a parser string to .get() Also here is a full example for disabling warnings if anyone needs: Using arrow 0.14.3
import arrow
import warnings
from arrow.factory import ArrowParseWarning
astring = "2019-07-29T13:58:44.460381Z"
# Show warning still?
arrow.get(astring,"YYYY-MM-DD[T]HH:mm:ss.S[Z]")
# Filter out the warnings
warnings.simplefilter("ignore", ArrowParseWarning)
# No warnings anymore
arrow.get(astring,"YYYY-MM-DD[T]HH:mm:ss.S[Z]")
Hey @sako0938, we made the conscious decision to include the warning on the arrow.get()
method with and without a format string since significant changes are coming to both.
@systemcatch we could maybe better target the format string warnings to the timestamp token X
?
@jadchaar @sako0938 that's going to be difficult since many of the changes apply with or without a format string, I think it's easier to just give a general warning despite it being more annoying.
For example in current arrow;
>>> arrow.get("garbage2017everywhere", "YYYY")
<Arrow [2017-01-01T00:00:00+00:00]>
>>> arrow.get("garbage2017everywhere")
<Arrow [2017-01-01T00:00:00+00:00]>
arrow 0.15.0 will raise errors for both these cases.
Sounds great. I mostly wanted to show how to remove the warnings for others since the issue didn't provide all the context needed at the time.
Hi, I'm a first-time user of arrow with latest version installed (0.14.4). Still can't understand from previous answers that why the example below still has warning. Is something still missing?
arrow.get(astring,"YYYY-MM-DD[T]HH:mm:ss.S[Z]")
/Users/whuang/Envs/p2/lib/python2.7/site-packages/arrow/factory.py:249: ArrowParseWarning: The .get() parsing method with a format string will parse more strictly in version 0.15.0.See https://github.com/crsmithdev/arrow/issues/612 for more details.
Hi, I'm a first-time user of arrow with latest version installed (0.14.4). Still can't understand from previous answers that why the example below still has warning. Is something still missing?
arrow.get(astring,"YYYY-MM-DD[T]HH:mm:ss.S[Z]") /Users/whuang/Envs/p2/lib/python2.7/site-packages/arrow/factory.py:249: ArrowParseWarning: The .get() parsing method with a format string will parse more strictly in version 0.15.0.See #612 for more details.
Hey @davix, on version 0.14.3+ there will always be a warning generated when arrow.get() does str parsing (with and without a format passed). To disable the warning please use @sako0938 or our solutions.
on version 0.14.3+ there will always be a warning generated when arrow.get() does str parsing (with and without a format passed)
From my understanding, a warning means that a user is doing it incorrectly or at least imperfectly. Why is there warning even if a format is provided, I'm curious? or is there a more proper command other that .get()?
From my understanding, a warning means that a user is doing it incorrectly or at least imperfectly. Why is there warning even if a format is provided, I'm curious? or is there a more proper command other that .get()?
Warnings can be also be used to notify users of upcoming changes to the package. The changes we've made to the .get() method can apply both with and without a format provided. In 0.15.0 these improvements will take effect and the warnings will disappear.
The flexible .get()
was in my opinion one of the best features of arrow, as it allowed for parsing of unknown or user generated time strings without any extra logic.
I wonder if it's possible you could keep the old functionality of .get()
in a new separate function? Maybe like .fuzzy_get()
or something.
@joinemm that functionality will still be there, indeed in version 0.15.0 more formats will parse in .get() without any other logic being needed than do currently.
I started seeing the message today; I'm with @joinemm - .get is one of the main reasons for me to use arrow. I'm very surprised by this move.
EDIT can at least rfc3339 be included in the default string parsing list without showing that warning, please?
@systemcatch really? But what about
arrow.get() will no longer parse natural language strings
Timestamp strings are no longer supported in the arrow.get() method
I feel like these two were very important
Timestamp strings are no longer supported in the arrow.get() method
That is the main function why I'm using arrow; parsing rfc3339 and ISO-8601 without needing to deal with TZ mockery in Python.
I'd settle with a .get_iso8601 method that doesn't need any formatting input, but deals with any variations of ISO-8601.
Timestamp strings are no longer supported in the arrow.get() method
Apologies, this statement needs to be tweaked.
We are dropping support for usage like: arrow.get("1565358758")
because we are adding support for the basic format in ISO 8601. This would make timestamp strings and the basic format ambiguous.
You can still use get
as follows: arrow.get(1565358758)
and if you need to parse a timestamp string, arrow.get("1565358758", "X")
.
Let us know if you all still have concerns and feedback about this. I have tweaked the original message to reflect my comments above.
@jadchaar Oh ok, looks like it was just badly worded then. So something like arrow.get("2002-10-02T10:00:00-05:00")
will still work as it used to?
@joinemm indeed :). We are not trying to break existing workflows, we are just trying to address core parsing issues that have given arrow
a bad name over the years.
arrow.get() will no longer parse natural language strings
As for this, the functionality will still remain intact if you provide a format string like this: arrow.get("Meet me at 2016-05-16T04:05:06.789120 at the stadium", "YYYY-MM-DDTHH:mm:ss.S" )
.
This will no longer work though: arrow.get("Meet me at 2016-05-16T04:05:06.789120 at the stadium")
. We made this decision because we wanted get()
without a format string to simply be for parsing date and time strings by themselves so that we can prevent any unintentional parsing issues (e.g. accepting blah2016
).
can at least rfc3339 be included in the default string parsing list without showing that warning, please?
@petermolnar the warning is there so people can see the full list of changes in this issue, thanks to that warning you and @joinemm showed us that the change log was poorly worded which should hopefully mean less confusion for others in the future.
The warning will be gone in 0.15.0 which should be released around the end of August.
The current version of the manual on readthedocs.io says:
Some ISO-8601 compliant strings are recognized and parsed without a format string:
arrow.get('2013-09-30T15:34:00.000-07:00') <Arrow [2013-09-30T15:34:00-07:00]>
However, that code now generates a warning. That forces an ugly choice. Do I put in the warning suppression code, which just adds clutter to my code and adds risk that some new future warning will be unintentionally suppressed, or do I specify the format string for iso8601, which clutters up my code, seems error prone, and also seems like domain specific knowledge that should be built in to Arrow.
I have instead decided to just forbid the use of versions 0.14. by adding the following to my setup.py file: `install_requires = ['arrow!=0.14.']` That eliminates the warning message without requiring me to add clutter to my code.
Hopefully things are better in 0.15.*.
This issue has come up a number of times in different date/time libraries; I last ran into it in MomentJS. The progression is surprisingly consistent:
The proposed solutions are usually:
Arrow.strict_mode()
Arrow.get(..., strict=True)
Discussion of the solutions devolves into camps wanting default strict versus default liberals. As maintainers are in the default strict camp, neither solution is implemented, the library is too hard to use, and starts a declining adoption rate until maintainers are no longer supported by their companies to maintain the package.
Is there any way to derail this path?
The current version of the manual on readthedocs.io says:
Some ISO-8601 compliant strings are recognized and parsed without a format string:
arrow.get('2013-09-30T15:34:00.000-07:00') <Arrow [2013-09-30T15:34:00-07:00]>
However, that code now generates a warning. That forces an ugly choice. Do I put in the warning suppression code, which just adds clutter to my code and adds risk that some new future warning will be unintentionally suppressed, or do I specify the format string for iso8601, which clutters up my code, seems error prone, and also seems like domain specific knowledge that should be built in to Arrow.
Hello @KenKundert that string does not need a format to be passed now or in the future, the warning will occur with or without a format string. This is so users are aware of the changes coming in 0.15.0 which affect both cases. ArrowParseWarning is specific and won't be used after 0.15.0.
@merriam
4. There are users that want to use the library in production and find a number of times where liberal parsing is too liberal, not raising errors. These users are now maintainers or supporters.
Indeed, take a look at the examples below. They're clearly wrong and introduce errors that are hard to find and debug. I assume you agree these should be fixed?
arrow.get('Jun-2019', ['MMM-YY', 'MMM-YYYY'])
<Arrow [2020-06-01T00:00:00+00:00]>
arrow.get('blabla102015').isoformat()
'1020-01-01T00:00:00+00:00'
arrow.get('13/4/2045')
<Arrow [2045-01-01T00:00:00+00:00]>
5. A change is proposed to make strict parsing the default. This is introduced as a breaking change, as the proponents do not plan to ever use liberal parsing. It is, curiously, always introduced in a way that extra warnings to be generated.
This change does not make strict parsing the default, it simply corrects obvious errors. Warnings are clearly needed to inform users about these changes, unfortunately due to the parser design it is difficult to provide specific warnings, hence the need for a more general one.
In [1]: import arrow
In [2]: arrow.get("2018", "YYYY")
/Users/ca/.local/share/virtualenvs/api-uUs12mY4/lib/python3.6/site-packages/arrow/factory.py:249: ArrowParseWarning: The .get() parsing method with a format string will parse more strictly in version 0.15.0.See https://github.com/crsmithdev/arrow/issues/612 for more details.
ArrowParseWarning,
Out[2]: <Arrow [2018-01-01T00:00:00+00:00]>
I understand the impetus to warn users about the upcoming change, but this feels very heavy-handed. A plain, simple example will yield the warning. Removing the warning requires 3 lines of code... granted it's not a lot, but that's a manual step everyone will have to take.
What about issuing a warning only when the code will need to change - this will give a chance to developers to prepare for the breaking change. Not sure how difficult this is considering the parser design.
Also, it would be nice for https://arrow.readthedocs.io/en/latest/ to include a migration guide.
@merriam
They're clearly wrong and introduce errors that are hard to find and debug. I assume you agree these should be fixed?
No. Really. I don't.
You can ask the library:
Aside from the global switch or parameter route, maybe one could slowly deprecate arrow.get()
with a DeprecationWarning, and replace it with arrow.quick_get()
, arrow.strict_get()
, and arrow.loose_get()
? It has the advantages of breaking everyone's code at a known future time and making the choice of fix an explicit, rather than implicit one. The guide could mention a work around hack like arrow.get = arrow.strict_get
for the lazy.
Taking away the messy parsing without replacement just causes chaos and makes code ugly. Keep code beautiful.
@charlax we do apologies for the heavy-handed warnings, but we decided to make the error very upfront due to the complexity of the parser design and to get a conversation going around the changes. We very much appreciate the feedback you all are providing, and we may have not gotten it if it had not been for the obvious warnings :). That said, we are looking to release 0.15.0
in the near future, so the warnings should be gone soon.
@merriam thanks for the feedback, but like Chris said above, we are simply trying to correct obvious errors that could confuse users. Just so that we are on the same page, what do you see as Arrow parsing being too strict? Can you give us a few examples? That will help us determine whether or not our changes are too strict (and whether or not a strict
flag is warranted)
Also, something to note is that @systemcatch and I have been working hard to address the feedback that users have given on Arrow parsing over the years (issues that have gone unaddressed for a long time). We are not trying to change the things that make Arrow great, but rather we are trying to fix obvious bugs. We are volunteers and have absolutely no corporate- or company-backing, we are doing this for the community :).
@jadchaar Thank you for your polite note. It sounds like pulling together a good set of test cases should fall on me. Would you be adverse to a pull request that added optional parameters, e.g., get(..., guess=True)
, as the the opposite of strict
?
Hey @merriam, thanks for the offer. I was actually thinking about an opposite flag to strict
as well. We could call it either guess
, loose
, or fuzzy
.
A contribution would be great, but keep in mind that a PR would have to be made against our Version 0.15.0
branch rather than master
since the changes have not been merged in yet (awaiting the merge of https://github.com/crsmithdev/arrow/pull/639 into the 0.15.0 branch). If we decide to add a flag to loosen the parsing standards, I think it will simply need to prevent the word boundary from being added to the regex here: https://github.com/crsmithdev/arrow/blob/Version-0.15.0/arrow/parser.py#L277.
Before you make the contribution, I think it would be useful for you to post a few of the test cases you have in mind (to warrant the flag) here so we can discuss :).
Thanks again for the feedback!
Hello. Can you avoid setting
warnings.simplefilter("always", ArrowParseWarning)
in your library, this clearly mess up warning set up. You can keep the Warning but let the users handle the filtering themself without having to import arrow.factory first.
Also, can you provide the example the format with optional parts (ex able to parse '2019-08-30T11:09:57+00:00' and '2019-08-30T10:57:25.931Z')
thanks
Hi @gsemet, on reflection we were probably over zealous with the warnings for these changes. But with python pre 3.7 not showing deprecation warnings by default we really wanted to make sure they were seen.
Anyway 0.15.0 will be released soon and the warnings will be gone.
A handful of days back, I made the mistake, it seems, of letting pip update Arrow. There were a handful of depreciation warnings we caught and remedied in testing, and then we rolled a major update to production. For some reason I'm not particularly interested in, we did not catch this one.
In the last hour, there have been 181,486 "arrow" warnings (this depreciation warning) logged. This is problematic.
I wanted a quick, temporary fix for this issue (pushing code to production is not quick) and tried setting the env variable to suppress this: PYTHONWARNINGS=ignore
. It had no effect.
Has anybody here tried swallowing these warnings with just an env variable? Any success?
Hey @salixrosa, we apologize for the issues you are facing. Can you try export PYTHONWARNINGS="ignore"
?
@jadchaar
We're working in a containerized environment where export
doesn't really make sense, as I understand things. We're setting the env variable the same way we're setting others that are used as early as the command that starts the python server; a line of PYTHONWARNINGS=ignore
should set the variable for all relevant processes. I also tried PYTHONWARNINGS=ignore::ArrowParseWarning
, which similarly had no effect. I was hoping the answer would simply be some other string for this variable :-/
Thanks, though!
maybe you can add a globla environment variable that would skip the warnings.simplefilter("always", ArrowParseWarning)
line in your library, so that you won't mess with stdout.
I planning to switch to an earlier version of arrow or maybe switch to another livrary, some of my tools really need a clean stdout.
Please release 0.15.0 so the warnings are actionable. Currently there's no path I can take to get rid of the warnings except of forcefully suppressing them. There's nothing I can fix or upgrade to and it's polluting my app's output.
I don't know how long to wait until you release version 0.15, but this warning message create a huge problem in ElasticStack (Logs) and required much more space to operate. Please release version 0.15!
Just revert to the previous version. There is the possibility to monkey patch arrow in order to ditch this ugly call to warning, but I would prefer if the kind author to release an intermediate version of 0.14 without this line :)
Hey all, thanks for the feedback. Chris and I have decided to release an intermediate version (0.14.7) like @gsemet suggested. Our goal was to facilitate a conversion about the 0.15.0 changes, and I think we have done just that. Thanks for the feedback and please keep it coming!
We are planning on releasing 0.15.0 in the next week or so.
Thanks for your work on this, @jadchaar!
Thank you for great work, @jadchaar
One the one hand it's great you are so explicit with warning users that the API is going to break next release, on the other hand next time it would be nice to have a direct clean action path for making the warnings go away (like releasing the broken-API 0.15.0 version at the same time as introducing the warning version 0.14.x).
Now I have to deal with the minor inconvenience of having to explicitly ignore these warnings in our test suite (all other warnings are seen as errors by our test runner), and then remove that again when we move to 0.15.0 in a few weeks. It's not the end of the world, but it's annoying.
At least we know the API breaks in 0.15 :) Thanks for the intermediate releases :)
Hey all, v0.15.0 has been released! We have made a huge effort to keep things as compatible as possible with previous versions, but this release may potentially break existing code. Here is the final change log:
arrow.get("1998-045")
, arrow.get("1998-45", "YYYY-DDD")
, arrow.get("1998-045", "YYYY-DDDD")
.YYYYMMDDTHHmmssZ
).humanize
week granularity translations for French, Russian and Swiss German locales.str
are no longer supported without a format string in the arrow.get()
method. This change was made to support the ISO 8601 basic format and to address bugs such as #447.# will NOT work in v0.15.0
arrow.get("1565358758")
arrow.get("1565358758.123413")
# will work in v0.15.0
arrow.get("1565358758", "X")
arrow.get("1565358758.123413", "X")
arrow.get(1565358758)
arrow.get(1565358758.123413)
ParserError
is raised.X
) will now match float timestamps of type str
: arrow.get(“1565358758.123415”, “X”)
.X
) will now only match on strings that strictly contain integers and floats, preventing incorrect matches.arrow.get()
returning an incorrect Arrow
object from a partial parsing match have been eliminated. The following issue have been addressed: #91, #196, #396, #434, #447, #456, #519, #538, #560.This release has been months in the making and @systemcatch and I have put in a ton of work into it. Please do not hesitate to reach out with any feedback or concerns. Thanks!
Hi,
First off, thank you for maintaining arrow, and thank you for addressing some long standing bugs.
Second, fallout from several recent arrow fixes have broken us downstream. This is understandable. However, I'm wondering if you would consider switching to semver. Namely:
We've found that major version bumps are a great way of informing downstream users that something has changed that may merit more investigation before deploying. This would allow us to, in this example, pin to arrow<2
, and pick up fixes without automatically picking up breaking changes.
Thanks again!
Hi @escapewindow, @systemcatch and I actually discussed this recently. We decided that we would like to release 1.0.0 once we find that arrow
has most basic features that you'd expect from a date and time library implemented. We are planning on switching to a full semantic versioning scheme once we add daylight savings time improvements (planned for 0.16.0). We wanted to get arrow
into a solid feature state before transitioning to semantic versioning. It is definitely on our minds as well :).
Great release, thanks!
Closing this issue because the v0.15.0 release seems to have gone well. Feel free to open a new issue if new problems arise. Thanks again everyone for contributing to the conversation and helping to make Arrow even better!
@jadchaar thousands of people/projects are already using the library. By not following standard versioning practices, you're causing headaches for other humans. Please do a major version bump for breaking changes.
@jmidyet Agreed and we're working on it.
While I agree with you, I must admit semver says something like « every version in 0.x can break or not, up to the lib ». Lot of packages are 0.x because their maintainer just does not want to deal with backward-forward compatibility. Yes human error happens, breaking change can sneak between two minor change, but semver has a purpose to minimize this. It is a pity some maintainers do not follow it, it makes updating very risky so we end up freezing lib to an old version, decreasing security.
Semver is not that hard. Use PBR to automatize version bump or any other similar lib. And a bit of self discipline.
By the way, pendulum is a very sexy lib, I am not saying it is better but it looks like it follows semver.
@gsemet @jmidyet we appreciate your input and apologize if the lack of semver has caused issues. I am a huge fan of semver myself, but Arrow has been in flux the past few years as maintainers have come and gone (I just join the project back in May), but we are heavily discussing a move to semver.
@systemcatch and I discussed the potential move recently, but we wanted to keep a 1.0.0 release until after we have complete DST support implemented. That is one of the big lacking features of Arrow that we want to tackle before officially saying this is a non-beta product. 0.15.0 was focused on refactoring and fixing the parsing behavior, and now we are moving focus to DST. If anyone would like to help us implement DST and get the ball rolling on a 1.0.0 release, we'd be happy to help and look over PRs!
Also, 1.0.0 should probably be Python 3+.
@jadchaar probably worth keeping support for 2.7 for 1.0.0
In the upcoming version 0.15.0 of
arrow
, we will be making a lot of changes to the behavior ofarrow.get()
to address a number of reported parsing bugs. We have outlined the changes below:Fixes
arrow.get()
returning an incorrect arrow object from a partial parsing match have been eliminated.For example,
These will raise a ParserError in 0.15.0.
arrow.get()
method without a format string:arrow.get("1565358758")
. This change was made to support the ISO 8601 basic format and to address bugs such as https://github.com/crsmithdev/arrow/issues/447.The following will still work as expected:
X
) will now matches float timestamps:arrow.get("1565358758.123415", "X")
.X
) will now only match on strings that strictly contain integers and floats, preventing incorrect matches.New Features
YYYYMMDDThhmmssZ
)"1998-045"
)Issues addressed
Development progress
You can view the progress of these changes here: https://github.com/crsmithdev/arrow/tree/Version-0.15.0.
Disable Warnings
To get rid of the ArrowParseWarning messages in 0.14.3 onwards, do the following: