gitanat / simple-ocr-opencv

A simple python OCR engine using opencv
GNU Affero General Public License v3.0
525 stars 175 forks source link

Merge unittest changes into master of original repo #18

Closed RedFantom closed 7 years ago

RedFantom commented 7 years ago

This branch contains all the commits that add Travis-CI, AppVeyor and codecov functionality as well as the unittests for the classes that were already in the master branch. Before merging, please make sure that Travis-CI, AppVeyor and codecov are enabled for this repository. Also, please note that the badges in the README are for my fork of the repository. These changes also depend on OpenCV 3, as explained in PR #16.

This PR is related to issue #14 . Squash merging this branch with a new commit message is recommended.

RedFantom commented 7 years ago

@goncalopp In my opinion, it's best to do this one first.

gitanat commented 7 years ago

Alright!

Do you have any downstream changes to testing.py? If you don't, I would prefer to follow the test/ directory structure for unit tests. I can migrate it, though

The tests are currently failing for me

======================================================================
ERROR: test_grounding_raise (__main__.Testing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "testing.py", line 48, in test_grounding_raise
    self.assertRaises(ValueError, lambda: grounder.ground(digits, segments, characters[:-4]))
  File "/usr/lib/python2.7/unittest/case.py", line 473, in assertRaises
    callableObj(*args, **kwargs)
  File "testing.py", line 48, in <lambda>
    self.assertRaises(ValueError, lambda: grounder.ground(digits, segments, characters[:-4]))
  File "/home/goncalopp/private/mydocs/programacao/python/simple-opencv-ocr/grounding.py", line 25, in ground
    raise Exception("segments/text length mismatch")
Exception: segments/text length mismatch

----------------------------------------------------------------------
Ran 7 tests in 1.712s

FAILED (errors=1)

Maybe this is a behavioral change in OpenCV 2/3? I'll take a closer look later today, but can you please confirm everything passes on your machine for this branch?

RedFantom commented 7 years ago

Yes, the directory structure would probably be better practice for placing the tests. I can look at migrating them this afternoon, putting them all in their own file.

The tests are failing, because I changed the Exception raised to a ValueError, because I thought that that's more logical to raise given it's an error with the values. The test checks if this error is raised, it is not a problem with OpenCV. The error is supposed to be raised, but then a ValueError, so the unittest can check for that. It's the only error, so that's a good sign: the tests succeed for OpenCV 2 as well as OpenCV 3. I was thinking, maybe there is some sort of mechanism possible to create alternate code paths for OpenCV 2 and OpenCV 3 in the same file? That would be great.

There are tests for the TerminalGrounder and the Improver class also in the respective branch, so I think it would be best to first merge this, as it creates the testing.py file with the basic tests before adding the tests for those classes.

Codecov, AppVeyor and Travis-CI must be enabled for this repository first if merging is to succeed correctly with the Codecov, AppVeyor and Travis-CI integration.

RedFantom commented 7 years ago

OK, I have split the testing.py file into parts following the guidelines, but I can't get the imports to work correctly, even though the PyCharm autocompletion doesn't detect any problems. I have committed the changes to separate branch, branched off the tests branch here. I can't figure out how to make it work...

gitanat commented 7 years ago

Thanks! Your tests in the directory run perfectly using nose. There's no trouble with the imports at all, as long as your CWD is the root

We should be using package level imports everywhere (from simple_opencv_ocr import grounding), but we need proper packaging first - let's do that separately

Btw, the convention for the directory name is tests, not test - but it's not too important. Feel free to point the PR branch to the new branch you created. I'll take a look into setting up CI

RedFantom commented 7 years ago

Well, that's interesting. I tried using nose, but the imports still failed, even with the right CWD. I guess there's a problem on my end then. I'll still change the name of the directory, for good practice, if you don't mind.

gitanat commented 7 years ago

Ok, I've finally managed to get travis working with opencv2. Can you please take a look into the commits in the redfantom-tests branch?

I've changed travis to use opencv2, and removed appveyor for now - let's add it to the opencv3 branch?

Regarding your question about having support for both opencv2/3, it's definitely doable within python, we just need to do version detection and add some conditions. I'm just not sure how to do the CI part in travis, as you can only install one version? But let me know if you figure it out, it would be great if could have everything merged. Otherwise, we can keep the separate branches

gitanat commented 7 years ago

It looks like we can actually run multiple builds with different environment variables, so it should be feasible to merge opencv3 into master with a bit of care :) Let's merge this and the TerminalGrounder PR first and then do that?

RedFantom commented 7 years ago

Yes, that sounds good. Using different environments, we could even test Python 3 support if that's feasible (it's more difficult because of the string encoding and decoding). It's almost ready for merging, the only question is: Do we keep the Exception when the amount of segments does not match the amount of classified characters, or do we change it to ValueError?

gitanat commented 7 years ago

we could even test Python 3 support if that's feasible That would be nice! As you said, it'll take some work with strings/bytes, but we can probably make it work

Do we keep the Exception when the amount of segments does not match the amount of classified characters, or do we change it to ValueError?

I've changed that on master. See https://github.com/goncalopp/simple-ocr-opencv/commit/445e28c32ae777c4ea9f188288f1cd2596216999

If you cherry-pick my commits I'll merge-squash this

RedFantom commented 7 years ago

I have cherry-picked your commits, and also added a Travis-CI build status image to the README.md file in ce81892. From the branch tests in my fork of the repository I have created a new branch, versions, to which I have published a commit that adds OpenCV version checking before any OpenCV version specific functions are called. I think we should merge that in a separate Pull Request, to keep things clear? Edit: using cv2.__version__, it is possible to retrieve a version number in str format, but the python-opencv package from the Ubuntu repositories provides $Rev 4227 as version number instead of 3.x.x or 2.x.x, so that complicates things.

gitanat commented 7 years ago

Thanks!

added a Travis-CI build status image to the README.md file in ce81892

Will cherry-pick

Edit: noticed you already included it, thanks!

I think we should merge that in a separate Pull Request, to keep things clear?

Sounds good. Getting the tests in first means it'll be easier to detect regressions

I've opened #20 to track opencv2/3 compat