goncalopp / simple-ocr-opencv

A simple python OCR engine using opencv
GNU Affero General Public License v3.0

Open Source License? #13

Closed zuphilip closed 7 years ago

zuphilip commented 7 years ago

Can the code be used under some open source license?

Maybe you even want to choose an OSS license and make it explicit in the repo.

RedFantom commented 7 years ago

You could check here, but please note which specific commit is licensed under the AGPLv3. All code found in my fork of the repo is under this license, but only a single point in history in this repository is.

zuphilip commented 7 years ago

Okay, I see. Then I guess that I prefer your fork, which also seems to have some other improvements (e.g. Travis).

It might be a little confusing then (for people like me) with the two forks and only a comment in one direction...

RedFantom commented 7 years ago

My fork was created with the intention of creating a PyPI package from it, so it can be used more easily as a library in other projects. However, I do not think that it's fit for release just yet. I hope to be working on it again soon.

goncalopp commented 7 years ago

Let me give a bit of perspective on all this.

This code was started as an experiment - something I did to learn about OCR and machine learning, nearly 5 years ago. It eventually grew enough that I started structuring it a bit more carefully, and added some docstrings here and there, hoping it would help other people, but the reality is that this code was never meant to be run by non-developers.

There's no stable API, no unit tests, no release cycle, no PyPI package, nothing. @RedFantom was, in fact, the first person who showed a serious interest in using this code for non-educational purposes.

If you need a battle-tested open source OCR system, I really feel like I should point you to tesseract.

With all this being said, if after taking a look at this code you still want to make use of it and turn it into a stable library, for whatever reason, I'm happy to make this repository community managed, licensed under the AGPL, and add you and @RedFantom as contributors, if you so wish. That way we avoid having different forks and people re-inventing the wheel. I can add a pre-alpha PyPI package, if it helps your use case.

I'm also more than happy to help you go through the code. It's mostly harmless, but some parts are hairier than they should be :)

RedFantom commented 7 years ago

If we can keep running AppVeyor, Travis-CI and codecov (all free for open-source projects), I'm all for using a single repository so people do not have to search for the code.

About Tesseract: that is indeed a good OCR solution, but the Python bindings available on PyPI are pytesser (pre-alpha with v0.0.1) and tesserocr. The latter requires libtesseract and libleptonica, which makes it difficult to run on Windows. opencv-python just works on Windows, which makes it perfect for my use case; that's why I am so interested in this code.

goncalopp commented 7 years ago

@RedFantom Travis sounds good. I'm not familiar with AppVeyor and codecov. Is AppVeyor just to make sure the thing works on Windows? As long as they don't require changes to the source, I don't mind having them. I'll probably set up coverage.py and tox as well, so I'm not dependent on having a network connection.

@zuphilip Let us know your preferences as well

zuphilip commented 7 years ago

Hey, thank you for all this information and the invitation to collaborate. I feel honored, but currently I can't help you much here.

I just stumbled on your project through the blog post and tried to understand the README file, reported the typos and then tried to understand a little more, but before that I wanted to make sure the project is some kind of open source.

However, I think a lot of people saw your repo (184 stars!) and/or are watching the changes (24 watchers). These numbers alone are quite impressive! I guess your traffic is also high (although it is not visible to me).

I would like to add your repo and/or the fork to the awesome-ocr list with a sentence explaining the idea behind it. However, personally, I don't have any plans for your repo besides that. (Although I am not involved in this, I wanted to say that any collaboration between the two forks, and even avoiding separate forks, still sounds very worthwhile.)


Disclaimer and Advertisements: I know tesseract pretty well and we are also providing a Windows installer for Tesseract, see https://github.com/UB-Mannheim/tesseract/wiki. Maybe you should try pytesseract as a Python binding for Tesseract. Personally, I am helping to maintain ocropy, which is another (battle-proven) OCR software, written in Python. I frequently use the docker image on my Windows machine for testing and running ocropy.

RedFantom commented 7 years ago

@goncalopp AppVeyor is for CI on a Windows platform and codecov is to keep track of the amount of code that is covered by unit tests. Both require a .yml file, but I've already added them in my fork of the repository. The source must stay the same, as with all testing. If you have AppVeyor, Travis-CI and Codecov accounts at the ready (you can just log in using GitHub), I'll create a PR. Is that OK?

@zuphilip While tesseract indeed provides an installer for Windows, it is too difficult for my end-users to install two programs and get everything set up correctly. For the GSF Parser, user-friendliness is a keyword for me. opencv-python is easily packaged using PyInstaller.

goncalopp commented 7 years ago

@RedFantom Sounds good. I've created https://github.com/goncalopp/simple-ocr-opencv/issues/14 to track this, so we don't hijack this issue :)

@zuphilip Yes, it's reasonably popular, probably because it's simple enough to understand in an hour or so :) It looks like it's getting 50 views per day now. Feel free to add it to the list. I'm going to figure out a licensing model that works for me, and likely license this repository under the AGPL. I'll leave this issue open until I clear this up.

goncalopp commented 7 years ago

This repository is now licensed under the AGPL. Thanks @RedFantom and @zuphilip for pushing to make this happen :)

zuphilip commented 7 years ago

🎉 Thank you both for the hard work during the last weeks!