Yeah, I used pytest's test generation feature to split test_data.tsv into individual tests. Ideally, I think, this is the right thing to do, but the run-time is a real pain. So we have two options: either condense the test cases to the essential ones needed to cover all functionality, or, if we want to test not only the code but also the transcription systems, go back to running all sounds in one test.
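For context, the per-row setup might look roughly like this (a sketch only; I'm assuming the header of test_data.tsv matches the argument names of `test_sounds`, and `pytest_generate_tests` is pytest's standard hook for this kind of generation):

```python
# conftest.py -- sketch of per-row test generation from test_data.tsv.
# Assumes the TSV header matches the argument names of test_sounds.
import csv
import pathlib

ARGNAMES = ['source', 'type', 'grapheme', 'nfd_normalized', 'clts_normalized',
            'aliased', 'generated', 'stressed', 'name', 'codepoints']

def pytest_generate_tests(metafunc):
    # Only parametrize tests that actually ask for these arguments.
    if set(ARGNAMES) <= set(metafunc.fixturenames):
        path = pathlib.Path(__file__).parent / 'test_data.tsv'
        with path.open(encoding='utf-8') as fp:
            rows = list(csv.DictReader(fp, delimiter='\t'))
        metafunc.parametrize(ARGNAMES, [[r[a] for a in ARGNAMES] for r in rows])
```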
Maybe at this stage, while the transcription systems are still in flux, we should test all sounds in one run, and then later, when things stabilize (if ever), use a subset of a few hundred sounds to test the essential differences. I guess most of the time the test against test_data.tsv is run during development, to see whether one has screwed things up, so one could add an extra test in a separate file that does only this and ignores the rest, or include the all-sounds-in-one-test as an extra test and only invoke it with pytest -k XXX.
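For illustration, such a bulk test could look like this (a sketch; the `rows` fixture and the test name are hypothetical):

```python
# A sketch: keep the bulk check in one test whose name makes it easy to
# select or deselect from the command line, e.g.
#   pytest -k all_sounds          # run only the bulk data test
#   pytest -k "not all_sounds"    # run everything except the bulk test
def test_all_sounds(bipa, rows):
    # `rows` is an assumed fixture yielding one dict per row of test_data.tsv.
    mismatches = []
    for row in rows:
        sound = bipa[row['source']]
        if sound.type in ('unknownsound', 'marker'):
            continue
        if sound.name != row['name']:
            mismatches.append((row['source'], sound.name, row['name']))
    assert not mismatches, '{0} name mismatches, e.g. {1!r}'.format(
        len(mismatches), mismatches[:3])
```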
OTOH, with each row in the TSV being a separate test, error reporting seems a lot easier to follow up on:
```
_________ test_sounds[\xb2\xb2-tone-\xb2\xb2------flat mid-low mid-low tone-U+00b2 U+00b2] __________

bipa = <pyclts.transcriptionsystem.TranscriptionSystem object at 0x7f0615bd2fd0>, source = '²²'
type = 'tone', grapheme = '²²', nfd_normalized = '', clts_normalized = '', aliased = ''
generated = '', stressed = '', name = 'flat mid-low mid-low tone', codepoints = 'U+00b2 U+00b2'

    def test_sounds(
            bipa,
            source,
            type,
            grapheme,
            nfd_normalized,
            clts_normalized,
            aliased,
            generated,
            stressed,
            name,
            codepoints
    ):
        """Test on a large pre-assembled dataset whether everything is consistent"""
        sound = bipa[source]
        if sound.type not in ['unknownsound', 'marker']:
            if nfd_normalized == '+':
                assert bipa[source] != sound.source
            if clts_normalized == "+":
                assert sound.normalized
            if aliased == '+':
                assert sound.alias
            if generated:
                assert sound.generated
            if stressed:
                assert sound.stress
>           assert sound.name == name
E           AssertionError: assert 'flat from-mi...-mid-low tone' == 'flat mid-low mid-low tone'
E             - flat from-mid-low to-mid-low tone
E             ?      -----        ---
E             + flat mid-low mid-low tone
```
Btw., for this PR I can't get the test suite to run without errors either. Is this intended?
Ohoh, I think it is the old version of test_data.tsv, and I forgot to push something or merged improperly. The point is: I had features like "mid", "high", "low" for tones, but also three points in time: from, via, to (e.g., the maximal tone contour in Chinese is 214, so we start mid, go down, go up). When switching to sets and giving up linearity, I was forced to replace "mid low high tone" with "from-mid via-low to-high tone", and I think the error message is just reporting one such instance. I'll make sure to clean this up.
I circumvented the error-reporting problem by listing all errors afterwards. This has the advantage that one can trace the cases which don't work and see them all at once, in compact form. But I think an even easier solution would be to add a "quicktest" command to __main__.py which just runs through all the data and reports what does not work. For development purposes, this should be sufficient.
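Something along these lines, perhaps (a rough sketch; the actual CLI wiring in __main__.py, the file path, and the column names are assumptions):

```python
# Sketch of a "quicktest" routine: run through all rows of test_data.tsv
# and report every mismatch instead of stopping at the first one.
import csv

def quicktest(bipa, path='data/test_data.tsv'):  # path is an assumption
    errors = []
    with open(path, encoding='utf-8') as fp:
        for row in csv.DictReader(fp, delimiter='\t'):
            sound = bipa[row['source']]
            if sound.type in ('unknownsound', 'marker'):
                continue
            if sound.name != row['name']:
                errors.append('{0}: {1!r} != {2!r}'.format(
                    row['source'], sound.name, row['name']))
    for err in errors:
        print(err)
    print('{0} error(s) in total'.format(len(errors)))
    return len(errors)
```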
While I see the difference between testing the code and testing the consistency of the data, I'd still do this within the pytest framework, i.e. have a pytest command-line option like --with-data to include all tests from test_data.tsv, and put all tests for the code somewhere else.
If that's possible, this would be even better!
Here is an example of adding a command-line option to skip tests that run subprocesses (reconstructed below from the standard pattern in the pytest documentation; marker and option names are placeholders):
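```python
# conftest.py -- the standard pytest opt-in pattern: tests marked with
# @pytest.mark.subprocess are skipped unless --run-subprocess is given.
# Marker and option names here are placeholders.
import pytest

def pytest_addoption(parser):
    parser.addoption(
        '--run-subprocess', action='store_true', default=False,
        help='also run tests that spawn subprocesses')

def pytest_collection_modifyitems(config, items):
    if config.getoption('--run-subprocess'):
        return  # option given: run everything
    skip = pytest.mark.skip(reason='need --run-subprocess option to run')
    for item in items:
        if 'subprocess' in item.keywords:
            item.add_marker(skip)
```

The same pattern would presumably work for a --with-data option guarding the tests generated from test_data.tsv.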
This is to merge the current form; I'll then modify the features (which will require modifying the code a little bit).
What surprises me is that the tests now take so long to run. Is it because they are now invoked once per sound? If so, I'd be inclined to reduce the number of sounds, as it takes almost a minute to run the tests now, which makes it difficult to check things quickly after changing the code.