Add some tokenization tests

polm commented 5 years ago

I saw there was interest in having more tests, so I added some. There may be a better way to structure this, but I figure it's a start.

Note that these tests depend on IPADIC (the currently bundled dictionary) and will be skipped if it is not in use.
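
For readers who have not opened the diff, here is a rough sketch of what a dictionary-dependent skip can look like. It is not the code in this PR: `uses_ipadic` and `current_dictionary_path` are hypothetical helpers, and guessing the dictionary from the path reported by `dictionary_info().filename` is an assumption about how one might detect IPADIC.

```python
import unittest

try:
    import MeCab
except ImportError:  # lets the module load even without MeCab installed
    MeCab = None


def uses_ipadic(dict_path):
    """Hypothetical check: guess the dictionary from its file path."""
    return "ipadic" in dict_path.lower()


def current_dictionary_path():
    """Return the system dictionary path, or "" if it cannot be found."""
    if MeCab is None:
        return ""
    try:
        # Assumption: the first dictionary reported is the system one.
        return MeCab.Tagger().dictionary_info().filename
    except Exception:
        return ""


@unittest.skipUnless(
    uses_ipadic(current_dictionary_path()),
    "expected tokens are IPADIC-specific",
)
class TestTokenization(unittest.TestCase):
    def test_wakati(self):
        tagger = MeCab.Tagger("-Owakati")
        tokens = tagger.parse("すもももももももものうち").split()
        expected = ["すもも", "も", "もも", "も", "もも", "の", "うち"]
        self.assertEqual(tokens, expected)
```

With a different dictionary the whole class is reported as skipped rather than failed, since the expected token boundaries only hold for IPADIC.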

This also adds a note explaining a potential pitfall of the current testing strategy related to whitespace handling.

zackw commented 4 years ago

I really appreciate your having taken the time to add more tests, and I apologize for not looking at this PR for so long.

It looks like the CI failures are due to code formatting issues:

./test/test_basic.py:17:9: E126 continuation line over-indented for hanging indent
./test/test_basic.py:19:48: E261 at least two spaces before inline comment
./test/test_basic.py:19:80: E501 line too long (87 > 79 characters)
./test/test_basic.py:21:76: W291 trailing whitespace
./test/test_basic.py:22:80: E501 line too long (123 > 79 characters)
./test/test_basic.py:23:9: E123 closing bracket does not match indentation of opening bracket's line
./test/test_basic.py:31:80: E501 line too long (89 > 79 characters)
./test/test_basic.py:33:1: E302 expected 2 blank lines, found 1

I'm going to fix these and merge manually.
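
For reference, the layout these checks ask for looks roughly like the sketch below. The names and data are hypothetical and are not the contents of test/test_basic.py; the comments name the flake8 code each line is meant to satisfy.

```python
# Hypothetical test data, used only to illustrate the layout rules.
SENTENCES = [
    "すもももももももものうち",  # two spaces before the comment (E261)
    "吾輩は猫である",  # items use a single 4-space hanging indent (E126)
]  # closing bracket aligned with the opening line (E123)

# Implicit string concatenation keeps each line under 80 columns (E501).
LONG_SENTENCE = (
    "MeCab is a morphological analyzer "
    "for Japanese text."
)


def count_sentences():  # preceded by two blank lines (E302)
    return len(SENTENCES)
```

Trailing whitespace (W291) has no visible form to show here; stripping spaces at the end of each line is enough to clear it.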

zackw commented 4 years ago

Rebased and made flake8-clean; will merge if https://travis-ci.org/SamuraiT/mecab-python3/builds/629128544 goes green.

zackw commented 4 years ago

That failed, but https://travis-ci.org/SamuraiT/mecab-python3/builds/629140831 worked. Patch merged as 0a0f1421b5c64c773aa8711727fb737dc4054664.

SamuraiT / mecab-python3

Add some tokenization tests #33