Open idpaterson opened 8 years ago
Merging #199 into v3.0 will decrease coverage by
-50.26%
. The diff coverage is70%
.
@@ Coverage Diff @@
## v3.0 #199 +/- ##
===========================================
- Coverage 78.04% 27.79% -50.26%
===========================================
Files 14 14
Lines 1567 1576 +9
Branches 288 291 +3
===========================================
- Hits 1223 438 -785
- Misses 252 1120 +868
+ Partials 92 18 -74
Impacted Files | Coverage Δ | |
---|---|---|
parsedatetime/context.py | 56% <70%> (-25.82%) |
:x: |
parsedatetime/pdt_locales/icu.py | 10.58% <0%> (-72.95%) |
:x: |
parsedatetime/init.py | 18.26% <0%> (-56.7%) |
:x: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 5b4fd8e...4283265. Read the comment docs.
Do you want me to review the changes as you make them or wait until you have the majority of them made?
That's up to you, I'm going to work on the cases that require additional support in the pytest setup first then port the remaining normal cases.
How does this look for an nlp test?
long_phrases:
sourceTime: 2013-08-01 21:25:00
cases:
- target: !nlpTarget
- phrase: "At 8PM on August 5th"
target: 2013-08-05 20:00:00
context: !pdtContext month | day | hour
- phrase: "next Friday at 9PM"
target: 2013-08-09 21:00:00
context: !pdtContext day | hour
- phrase: "in 5 minutes"
target: 2013-08-01 21:30:00
context: !pdtContext minute
- phrase: "next week"
target: 2013-08-08 09:00:00
context: !pdtContext week
phrases:
- >
I'm so excited!! At 8PM on August 5th i'm going to fly to
Florida. Then next Friday at 9PM i'm going to Dog n Bone!
And in 5 minutes I'm going to eat some food! Talk to you
next week.
@pdtFixture('multiple_dates.yml')
def test_long_phrases(cal, phrase, sourceTime, nlpTarget):
assert cal.nlp(phrase, sourceTime) == nlpTarget
The target normalization adds the startIndex
and endIndex
for each phrase automatically based on substrings, so unless startIndex
and endIndex
are specified in the data file we just have to avoid having a string appear twice in the text.
cal = <parsedatetime.Calendar object at 0x109962790>
phrase = "I'm so excited!! At 8PM on August 5th i'm going to fly to Florida. Then next Friday at 9PM i'm going to Dog n Bone! And in 5 minutes I'm going to eat some food! Talk to you next week."
sourceTime = datetime.datetime(2013, 8, 1, 21, 25)
nlpTarget = ((datetime.datetime(2013, 8, 5, 20, 0), pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY), 17, 37, 'At 8P...in 5 minutes'), (datetime.datetime(2013, 8, 8, 9, 0), pdtContext(accuracy=pdtContext.ACU_WEEK), 176, 185, 'next week'))
@pdtFixture('multiple_dates.yml')
def test_long_phrases(cal, phrase, sourceTime, nlpTarget):
> assert cal.nlp(phrase, sourceTime) == nlpTarget
E assert ((datetime.da... 'next week')) == ((datetime.dat... 'next week'))
E At index 0 diff: (datetime.datetime(2013, 8, 5, 20, 0), pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY | pdtContext.ACU_HOUR), 17, 37, 'At 8PM on August 5th') != (datetime.datetime(2013, 8, 5, 20, 0), pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY), 17, 37, 'At 8PM on August 5th')
E Full diff:
E ((datetime.datetime(2013, 8, 5, 20, 0),
E - pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY | pdtContext.ACU_HOUR),
E ? ----------------------
E + pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY),
E 17,
E 37,
E 'At 8PM on August 5th'),
E Detailed information truncated (15 more lines), use "-vv" to show
cal = <parsedatetime.Calendar object at 0x1109b3790>
phrase = "I'm so excited!! At 8PM on August 5th i'm going to fly to Florida. Then next Friday at 9PM i'm going to Dog n Bone! And in 5 minutes I'm going to eat some food! Talk to you next week."
sourceTime = datetime.datetime(2013, 8, 1, 21, 25)
nlpTarget = ((datetime.datetime(2013, 8, 5, 20, 0), pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY | pdtContext.ACU...riday at 9PM'), (datetime.datetime(2013, 8, 8, 9, 0), pdtContext(accuracy=pdtContext.ACU_WEEK), 176, 185, 'next week'))
@pdtFixture('multiple_dates.yml')
def test_long_phrases(cal, phrase, sourceTime, nlpTarget):
> assert cal.nlp(phrase, sourceTime) == nlpTarget
E assert ((datetime.da... 'next week')) == ((datetime.dat... 'next week'))
E At index 2 diff: (datetime.datetime(2013, 8, 1, 21, 30), pdtContext(accuracy=pdtContext.ACU_MIN), 122, 134, 'in 5 minutes') != (datetime.datetime(2013, 8, 8, 9, 0), pdtContext(accuracy=pdtContext.ACU_WEEK), 176, 185, 'next week')
E Left contains more items, first extra item: (datetime.datetime(2013, 8, 8, 9, 0), pdtContext(accuracy=pdtContext.ACU_WEEK), 176, 185, 'next week')
E Full diff:
E ((datetime.datetime(2013, 8, 5, 20, 0),
E pdtContext(accuracy=pdtContext.ACU_MONTH | pdtContext.ACU_DAY | pdtContext.ACU_HOUR),
E 17,
E 37,
E 'At 8PM on August 5th'),
E (datetime.datetime(2013, 8, 9, 21, 0),
E pdtContext(accuracy=pdtContext.ACU_DAY | pdtContext.ACU_HOUR),
E 73,
E 91,
E 'next Friday at 9PM'),
E - (datetime.datetime(2013, 8, 1, 21, 30),
E - pdtContext(accuracy=pdtContext.ACU_MIN),
E - 122,
E - 134,
E - 'in 5 minutes'),
E (datetime.datetime(2013, 8, 8, 9, 0),
E pdtContext(accuracy=pdtContext.ACU_WEEK),
E 176,
E 185,
E 'next week'))
I like it - it reads as easily as the others and doesn't require any repeated items. The "avoid having a string appear twice" constraint seems reasonable for a test suite
Yeah and it's only a constraint of convenience. There will need to be a test that includes multiples of the same string because that is an important thing to test, but that test will just need to specify startIndex and endIndex in the YAML.
nlp tests are now implemented. Some of the existing nlp tests were specific to nlp (e.g. multiple phrases in a string) and some were generic date and times.
A new tests.data.nlpTarget
class wraps data from the test and supports equality testing with nlp responses. It provides a consistent representation for tests with both normal and nlp targets (i.e. target: 2016-01-01 00:00:00
vs target: !nlpTarget ...
) and allows some flexibility for the python to modify the test data. For example, the following test updates the sourcePhrase after wrapping the original phrase in quotes:
@pytest.mark.parametrize('prefix,suffix', (('"', '"'), ("'", "'"), ('(', ')')))
@pdtFixture('simple_datetimes.yml', ['times', 'invalid_times', 'dates',
'invalid_dates'])
def test_simple_datetimes_wrapped(cal, phrase, sourceTime, nlpTarget, prefix,
suffix):
sourcePhrase = u'%s%s%s' % (prefix, phrase, suffix)
nlpTarget.sourcePhrase = sourcePhrase
assert cal.nlp(sourcePhrase, sourceTime) == nlpTarget
This allows the nlpTarget
to calculate the proper start and end index for the phrase.
The test data can also specify startIndex
but the documentation warns that this should only be done when required due to the string appearing more than once:
- target: !nlpTarget
- phrase: "today"
context: !pdtContext day
startIndex: 5
target: 2013-08-01 09:00:00
- phrase: "today"
context: !pdtContext day
startIndex: 26
target: 2013-08-01 09:00:00
phrases:
- "Yep, today was as good as today could be"
If everything specified a startIndex it would be more of a pain to write a phrase modifying test like test_simple_datetimes_wrapped
above since the startIndex would not be calculated and therefore would not match when the phrase is modified.
I ran into one case that failed (not in the original nlp tests, I pulled in some parse test cases). Since this pull request is strictly not going to fix anything in code I noted the failure with a FIXME:
def test_deltas(cal, phrase, sourceTime, nlpTarget):
# FIXME: these tests fail
if phrase in ('1855336.424 minutes ago',):
return
assert cal.nlp(phrase, sourceTime) == nlpTarget
All of these test utilities will need unit tests as well so I added them to the list at the top of this PR.
I toyed around with a !replace
constructor that would call datetime.replace()
on the source time but couldn't find a compelling enough reason to keep it, preferring absolute datetimes for clarity. It made the test data too abstract and I wasn't able to use it together with a delta. For example, "4pm 2 days from now" is a combination of a replacement (current time to 16:00:00) and a delta (plus 2 days).
I need to apologize for the lack of updates recently. A small paid project has taken away the time that I was using to work on parsedatetime. The following is not yet complete enough to commit.
Most recently I have reorganized the test classes to more easily add support for deriving times relative to source times and testing anything that does not require an explicit start date against multiple edge case dates. In the last comment I mentioned an attempt to represent a replacement of date components, I ended up with a succinct syntax for that:
day_suffixes:
cases:
- target: !replace 2008-08-22 xx:xx:xx
phrases:
- "August 22nd, 2008"
This test group has no sourceTime
which means that it will be parametrized against each of the edge case dates (Feb 29 leap year, Feb 28 non leap year, end of year pre-1970). I like with this syntax how the replacement makes it clear that the time component will match that of the source time. I was trying to use obvious source times in the original tests like 01:02:03
to show that (because you could easily miss it if all the times are something like 12:00:00), but this is clearer.
There are some slightly weird consequences of testing against multiple dates when you're dealing with partial dates, for example:
dates:
sourceTime: !replace xxxx-01-xx xx:xx:xx
cases:
- target: !replace xxxx-08-25 xx:xx:xx
context: !pdtContext month | day
phrases:
- "8/25"
In this case if the edge case source times were to fall after August 25 it would map to the next year, so the replacement is used to pin the source time to January. The alternate behavior could then be tested with a source time in October where the date will map to the following year:
dates_next_year:
sourceTime: !replace xxxx-10-xx xx:xx:xx
cases:
- target: !datedelta
sourceTime: !replace xxxx-08-25 xx:xx:xx
years: 1
context: !pdtContext month | day
phrases:
- "8/25"
Still plugging along albeit quite slowly now.
@idpaterson firstly no one working on an OSS project ever needs to apologize for doing paid work first - that's all part of what we do :)
Second, even if it wasn't paid, never worry about taking time to do something - we all work on this code because we love it and it solves an itch, but real life sometimes intrudes.
thanks for the update and for keeping the project moving forward!
TestSimpleOffsets was the first suite where I encountered tests that set flags on Constants
. I added the ability to specify those in YAML as we discussed earlier. This just uses setattr
to set any attribute to whatever value is specified.
In this example, the second test group uses a different option. I think the markup for the targets makes the difference in behavior between the two groups pretty clear.
offset_from_day_of_week:
sourceTime: *tuesdays
cases:
- target: !datedelta
days: 2
context: !pdtContext day
phrases:
- "Thursday"
- target: !datedelta
# TODO: Handling this as Thursday at 9 plus 1 hour seems
# unexpected.
sourceTime: !replace xxxx-xx-xx 09:00:00
days: 2
hours: 1
context: !pdtContext day | hour
phrases:
- "one hour from Thursday"
- target: !datedelta
# TODO: Handling this as Thursday at 9 minus 1 hour seems
# unexpected.
sourceTime: !replace xxxx-xx-xx 09:00:00
days: 2
hours: -1
context: !pdtContext day | hour
phrases:
- "one hour before Thursday"
offset_from_day_of_week_matching_source_time:
sourceTime: *tuesdays
options:
StartTimeFromSourceTime: True
cases:
- target: !datedelta
# TODO: Handling this as Thursday at the current time plus 1 hour
# seems unexpected.
days: 2
hours: 1
context: !pdtContext day | hour
phrases:
- "one hour from Thursday"
- target: !datedelta
# TODO: Handling this is Thursday at the current time minus 1 hour
# seems unexpected.
days: 2
hours: -1
context: !pdtContext day | hour
phrases:
- "one hour before Thursday"
I'm also leaving comments in the tests for behavior that seems unexpected and that may require discussion.
TestRanges introduced a new format I had not yet considered, a target containing multiple dates. It was trivial to call resolveTarget
recursively on lists to handle this case:
dates:
cases:
- target:
- !replace 2006-08-29 xx:xx:xx
- !replace 2006-09-02 xx:xx:xx
context: 1
phrases:
- "August 29, 2006 - September 2, 2006"
- "August 29 - September 2, 2006"
- "08/29/06 - 09/02/06"
Since this is not a very common thing I decided not to add startTarget
and endTarget
fixtures. Instead, the tests just destructure the target
list:
@pdtFixture('ranges.yml')
def test_times(calendar, phrase, sourceTime, target, context):
(startTarget, endTarget) = target
assert (calendar.evalRanges(phrase, sourceTime) ==
(startTarget.timetuple(), endTarget.timetuple(), context))
This is very similar to how the !nlpTarget
markup looks, so technically I could add a rangeTarget
class but I don't think it is necessary since the values that would be used here do not require any special logic.
Another bug was exposed by the new tests today where the old tests just happened to be written in a way that worked. For these particular phrases which are not valid dates, parse
returns the current datetime regardless of the sourceTime provided.
calendar = <parsedatetime.Calendar object at 0x1078b6150>, phrase = '18/35', sourceTime = datetime.datetime(1945, 12, 31, 3, 4, 5), context = pdtContext()
@pdtFixture('errors.yml')
def test_out_of_range(calendar, phrase, sourceTime, context):
> assert calendar.parse(phrase, sourceTime) == (sourceTime.timetuple(),
context)
E assert (time.struct_... pdtContext()) == (time.struct_t... pdtContext())
E At index 0 diff: time.struct_time(tm_year=2017, tm_mon=2, tm_mday=27, tm_hour=8, tm_min=42, tm_sec=47, tm_wday=0, tm_yday=58, tm_isdst=0) != time.struct_time(tm_year=1945, tm_mon=12, tm_mday=31, tm_hour=3, tm_min=4, tm_sec=5, tm_wday=0, tm_yday=365, tm_isdst=-1)
E Full diff:
E - (time.struct_time(tm_year=2017, tm_mon=2, tm_mday=27, tm_hour=8, tm_min=42, tm_sec=47, tm_wday=0, tm_yday=58, tm_isdst=0),
E ? -- ^ ^^ ^ - ^^ - ^
E + (time.struct_time(tm_year=1945, tm_mon=12, tm_mday=31, tm_hour=3, tm_min=4, tm_sec=5, tm_wday=0, tm_yday=365, tm_isdst=-1),
E ? ^^^ + ^^ ^ ^ ++ ^^
E pdtContext())
E Detailed information truncated (-2 more lines), use "-vv" to show
tests/test_errors.py:6: AssertionError
@bear, I noticed that the CI and coverage have not been running on recent commits for this pull request. Would you please check whether those are still operational, maybe I hit a limit on commits for this PR?
All of the original base/English language test cases are now implemented in the new format. There is still some work to do before this can be merged, so I would like to aim for the following plan:
I don't foresee any new features making it into 3.0 unless we get pull requests for them since there are already a number of bugs to fix and based on past experience trying to improve the German locale I suspect that more issues will pop up while working through the locales.
The test library now has 100% test coverage though I could not at least at this time justify creating separate unit tests for every aspect of the test lib. Before the additional tests, the parsedatetime tests provided 81% coverage of data.py and 92% coverage of fixtures.py. I have not yet worked on 100% test coverage of parsedatetime itself.
I think the last main step before merging this to v3.0 is to add comments in the test case data to aid translation.
As originally discussed in #196, this pull request begins the conversion from
unittest2
topy.test
.py.test
was already used to run the tests, but as an alternative tounittest2
it provides an opportunity for intriguing code-level instrumentation to make tests easier to write and more comprehensive.The following tests have been ported so far:
invalid_
prefixAdditional tests needed for the test utilities:
Scope of this PR
This pull request can be merged once tests have been rewritten to cover all of the current test cases. At that time there will still be a lot of work to do to improve and add tests especially in non-English locales; merging to v3 will allow pull requests for community contributions.
There will be no changes to the functionality of parsedatetime in this pull request other than instrumentation necessary for testing. For example, a "wildcard" flag was added to pdtContext to support tests that do not specify an explicit context.
Additional testing support
Testing against "today" and edge case dates
Most tests in pdt 2 were based on the current date as sourceTime rather than a predetermined time. That can be good and bad – I think it made the tests more confusing to reason about but it also helped to catch edge cases related to leap years, end of year, end of month, etc. I'm looking at you, #155! 99% of the time, "today" is just a normal time on a normal day and there is no guarantee that CI would happen to run on Feb 29. If it does and an error comes up, it's already too late to push out a fix.
For tests where the sourceTime is truly arbitrary (absolute dates and anything that can be expressed with deltas come to mind) it would be interesting to parametrize the function with a few edge case sourceTimes. Run it against Feb 29, Dec 31, Jan 1, today, etc so that those cases are caught intentionally rather than accidentally.
Testing NLP may require a list of targets
Some changes will need to be made to allow multiple target dates in test data.
Notes
This is a collaborative pull request
Any parsedatetime maintainers can commit on this PR and everyone is welcome to the discussion. I am also going to continue working on the test cases.
The old tests have been deleted from this branch
I deleted the old tests to avoid further housekeeping later; anyone who would graciously contribute to this pull request will need to use an alternate copy of the old tests for comparison.
The new test names and file structure does not necessarily match the old tests
Some previously separate test classes will be combined. For example, the cases for TestMultiple and TestDelta are both in deltas.yaml because the "multiples" were just multi-unit deltas. I used comments to separate them into sections within the data file.
Documentation
I will start a new issue for this but since the first bit of code is now public I wanted to express my preference to move away from epydoc. I have always found the HTML documentation very difficult to use... the content is well-written but the format is cumbersome and outdated. Mike, are you open to hosting documentation on ReadTheDocs or as a GitHub page via the gh_pages branch so that updates can go out automatically?
The test modules in this PR are documented for sphinx with Google style docstrings. The ReadTheDocs theme for sphinx makes for (in my opinion) a better presentation and a simpler, less syntactically-verbose code style. Some of the documentation that I have currently written in the Python docstrings would be better managed in separate rst files, but for now they should be helpful to anyone contributing to the tests. The implementation of that will arrive in a different pull request.
This change is