Calamari-OCR / calamari

Line based ATR Engine based on OCRopy
GNU General Public License v3.0

License #3

Open amitdo opened 6 years ago

amitdo commented 6 years ago

Hi @ChWick!

> OCR Engine based on OCRopy and Kraken

OCRopy and Kraken (and TensorFlow) are released under the Apache 2.0 license.

I ask you to reconsider the license choice of your project.

I hope this request will not be regarded as chutzpah.

amitdo commented 5 years ago

Can I get a reply please?

amitdo commented 5 years ago

Thank you!

stweil commented 1 year ago

I'm afraid that license change has to be reverted because tfaip is used.

stefanCCS commented 1 year ago

In my opinion it is not a good idea to change back to GPL (as it is copyleft). Please stay with Apache 2.0 and, if needed, change the code to avoid GPL. If this were changed to GPL, dependent code would also inherit this license, and this might get a lot of other projects into trouble!

stweil commented 1 year ago

Only projects which directly use the Calamari code would be affected, especially ocrd_calamari. Do you know other such projects? As far as I know, ocrd_calamari is not included anywhere, but only used via the command line.

Projects which just call the command line interface(s) would not be affected by a license change.

Replacing tfaip might be difficult and a lot of work. I am afraid it might also break the compatibility with existing models.

mikegerber commented 1 year ago

Yes, ocrd_calamari needs to go GPL too, if Calamari is GPL.

stefanCCS commented 1 year ago

> Yes, ocrd_calamari needs to go GPL too, if Calamari is GPL.

Makes sense, but Calamari is Apache 2.0 at this moment.

stweil commented 1 year ago

That does not help. The current license statements for Calamari and ocrd_calamari are simply invalid because they violate GPL 3.

stweil commented 1 year ago

@amitdo or @ChWick, please reopen this issue.

stweil commented 1 year ago

> If this were changed to GPL, dependent code would also inherit this license, and this might get a lot of other projects into trouble

Are there already projects which would get into trouble? For calamari and ocrd_calamari it would only be a small license fix without further implications.

mikegerber commented 1 year ago

> Are there already projects which would get into trouble? For calamari and ocrd_calamari it would only be a small license fix without further implications.

I honestly don't know if I can just re-license my work on ocrd_calamari to GPL, as I was paid for that work through a grant. Other than that I simply don't care, I just have to be compliant with all the constraints.

mikegerber commented 1 year ago

Personally, I have switched out the one GPL library I used[1] to something not using GPL, namely python-Levenshtein was replaced with rapidfuzz.

It is annoying but the best way is really not using any library licensed with GPL.

[1] Or potentially used, it came up with a PR.

mikegerber commented 1 year ago

> I'm afraid that license change has to be reverted because tfaip is used.

I agree with this analysis. The question is whether the following is really the correct story here:

a. you need the license to be Apache because of the OCRopy and Kraken "legacy"
b. you need the license to be GPL because of tfaip

If so, this is potentially an unresolvable problem.

andbue commented 1 year ago

Would tfaip having LGPL instead of GPL change anything?

mikegerber commented 1 year ago

> Would tfaip having LGPL instead of GPL change anything?

I believe so, but I would check it more thoroughly to make sure. I had only done my layman's research when the issue came up with python-Levenshtein. We use the MIT-licensed rapidfuzz instead now, so it was resolved that way.

mikegerber commented 1 year ago

> I believe so, but I would check it more thoroughly to make sure. I had only done my layman's research when the issue came up with python-Levenshtein. We use the MIT-licensed rapidfuzz instead now, so it was resolved that way.

I think LGPL would be fine, if you're just importing it (e.g. no changes/copy of the code, unless that comes in a separate LGPL-licensed repo somewhere.)

mikegerber commented 1 year ago

@andbue Thanks for opening the issue with tfaip. I have high hopes they just change their license. Wouldn't be the first time authors aren't aware of the complications of the GPL license and just wanted a free and open source license.

amitdo commented 1 year ago

https://tech.popdata.org/the-gpl-license-and-linking-still-unclear-after-30-years/

stweil commented 1 year ago

List of licenses for all dependencies (generated with pip-licenses):

(venv) stweil@ocr-02:~/src/github/Calamari-OCR/calamari$ pip-licenses 
 Name                          Version    License                                             
 GitPython                     3.1.24     BSD License                                         
 Keras-Preprocessing           1.1.2      MIT License                                         
 Markdown                      3.3.6      BSD License                                         
 Pillow                        8.4.0      Historical Permission Notice and Disclaimer (HPND)  
 PyWavelets                    1.2.0      MIT License                                         
 Werkzeug                      2.0.2      BSD License                                         
 XlsxWriter                    3.0.8      BSD License                                         
 absl-py                       0.15.0     Apache Software License                             
 adabelief-tf                  0.2.1      MIT License                                         
 appdirs                       1.4.4      MIT License                                         
 astunparse                    1.6.3      BSD License                                         
 cachetools                    4.2.4      MIT License                                         
 calamari-ocr                  2.2.2      Apache License 2.0                                  
 certifi                       2021.10.8  Mozilla Public License 2.0 (MPL 2.0)                
 charset-normalizer            2.0.9      MIT License                                         
 clang                         5.0        University of Illinois/NCSA Open Source License     
 colorama                      0.4.4      BSD License                                         
 dataclasses-json              0.5.5      MIT                                                 
 edit-distance                 1.0.5      Apache Software License                             
 editdistance                  0.6.0      MIT License                                         
 et-xmlfile                    1.1.0      MIT License                                         
 flatbuffers                   1.12       Apache Software License                             
 gast                          0.4.0      BSD License                                         
 gitdb                         4.0.9      BSD License                                         
 google-auth                   1.35.0     Apache Software License                             
 google-auth-oauthlib          0.4.6      Apache Software License                             
 google-pasta                  0.2.0      Apache Software License                             
 grpcio                        1.43.0     Apache Software License                             
 h5py                          3.1.0      BSD License                                         
 idna                          3.3        BSD License                                         
 imageio                       2.13.3     BSD License                                         
 importlib-metadata            4.9.0      Apache Software License                             
 keras                         2.7.0      Apache Software License                             
 libclang                      12.0.0     Apache Software License                             
 lxml                          4.9.2      BSD License                                         
 marshmallow                   3.14.1     MIT License                                         
 marshmallow-enum              1.5.1      MIT                                                 
 mypy-extensions               0.4.3      MIT License                                         
 networkx                      2.6.3      BSD License                                         
 nptyping                      1.4.4      MIT License                                         
 numpy                         1.19.5     BSD                                                 
 oauthlib                      3.1.1      BSD License                                         
 opencv-python-headless        4.5.4.60   MIT License                                         
 openpyxl                      3.0.9      MIT License                                         
 opt-einsum                    3.3.0      MIT                                                 
 packaging                     21.3       Apache Software License; BSD License                
 paiargparse                   1.1.2      MIT                                                 
 pandas                        1.3.5      BSD License                                         
 pkg_resources                 0.0.0      UNKNOWN                                             
 pooch                         1.4.0      BSD License                                         
 protobuf                      3.19.1     3-Clause BSD License                                
 pyasn1                        0.4.8      BSD License                                         
 pyasn1-modules                0.2.8      BSD License                                         
 pyparsing                     3.0.6      MIT License                                         
 python-Levenshtein            0.12.2     GNU General Public License v2 or later (GPLv2+)     
 python-bidi                   0.4.2      GNU Library or Lesser General Public License (LGPL) 
 python-dateutil               2.8.2      Apache Software License; BSD License                
 pytz                          2021.3     MIT License                                         
 requests                      2.26.0     Apache Software License                             
 requests-oauthlib             1.3.0      BSD License                                         
 rsa                           4.8        Apache Software License                             
 scikit-image                  0.19.1     BSD License                                         
 scipy                         1.7.3      BSD License                                         
 six                           1.15.0     MIT License                                         
 smmap                         5.0.0      BSD License                                         
 tabulate                      0.8.9      MIT License                                         
 tensorboard                   2.6.0      Apache Software License                             
 tensorboard-data-server       0.6.1      Apache Software License                             
 tensorboard-plugin-wit        1.8.0      Apache 2.0                                          
 tensorflow                    2.6.0      Apache Software License                             
 tensorflow-addons             0.15.0     Apache Software License                             
 tensorflow-estimator          2.7.0      Apache Software License                             
 tensorflow-io-gcs-filesystem  0.23.1     Apache Software License                             
 termcolor                     1.1.0      MIT License                                         
 tfaip                         1.2.6      GPL-v3.0                                            
 tifffile                      2021.11.2  BSD License                                         
 tqdm                          4.62.3     MIT License; Mozilla Public License 2.0 (MPL 2.0)   
 typeguard                     2.13.3     MIT License                                         
 typing-extensions             3.7.4.3    Python Software Foundation License                  
 typing-inspect                0.7.1      MIT License                                         
 typish                        1.9.3      MIT License                                         
 urllib3                       1.26.7     MIT License                                         
 wrapt                         1.12.1     BSD License                                         
 xlrd                          1.2.0      BSD License                                         
 zipp                          3.6.0      MIT License    

So currently tfaip and its dependency python-Levenshtein enforce GPL.
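
A listing like the one above can also be scanned automatically. The following is only a minimal sketch, assuming the plain-text table format shown above (columns separated by two or more spaces); the helper names `is_strong_copyleft`, `parse_table` and `gpl_packages` are hypothetical and not part of pip-licenses. The keyword check is a rough heuristic, not legal analysis: it would, for example, also flag a package that offers GPL only as one of several alternative licenses.

```python
import re

# Shortened sample in the same layout as the pip-licenses output above.
SAMPLE = """\
 Name                Version  License
 numpy               1.19.5   BSD
 python-Levenshtein  0.12.2   GNU General Public License v2 or later (GPLv2+)
 python-bidi         0.4.2    GNU Library or Lesser General Public License (LGPL)
 tfaip               1.2.6    GPL-v3.0
"""


def is_strong_copyleft(license_text: str) -> bool:
    """Heuristic: True for GPL-style licenses, False for LGPL and all others."""
    text = license_text.lower()
    if "lesser" in text or "lgpl" in text:
        return False
    return "gpl" in text or "general public license" in text


def parse_table(output: str):
    """Split each table row into (name, version, license)."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header row
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=2)
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def gpl_packages(output: str):
    """Return the names of packages whose license looks like GPL."""
    return [name for name, _version, lic in parse_table(output)
            if is_strong_copyleft(lic)]


if __name__ == "__main__":
    print(gpl_packages(SAMPLE))
```

Run against the full listing, a check of this kind should single out the same two packages named above, while leaving the LGPL-licensed python-bidi unflagged.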

mikegerber commented 1 year ago

> The current license statements for Calamari and ocrd_calamari are simply invalid because they violate GPL 3.

I've checked this and my tentative analysis is: ocrd_calamari still uses Calamari 1.0.x, which has a valid Apache License (not using tfaip yet, also no python-Levenshtein). So ocrd_calamari is fine at the moment, just the update to Calamari 2 is blocked 😶

mikegerber commented 1 year ago

I just realized that @ChWick (one of the main authors, if not the main author, of Calamari) is also the author of tfaip, so maybe this can be easily resolved by reaching out to him (which is probably going on already in Würzburg).

stweil commented 1 year ago

@TobiasGruening has just archived tfaip. It looks as if there won't be a license change for it. And it also won't get updates any longer.

So Calamari either has to replace its dependency tfaip with something else which does not require GPL, or it must use GPL 3, too (and create a maintained fork of tfaip).

TobiasGruening commented 1 year ago

Because the impact here is probably the biggest, I would like to explain our approach a little. Christoph worked with us for a good year and brought tfaip forward. We gave him the freedom to further develop Calamari in his spare time because he had a personal interest in it. We also decided together to make tfaip open source because we hoped that this software project would be used and that the community would benefit from it, but also that interesting and valuable implementations would flow back, which is why we deliberately put it under this license, which follows the copyleft concept. Since TensorFlow is slowly being phased out in our company, we will also not be able to spare any capacity for the further development of the open source variant of tfaip. I am sorry.

kba commented 1 year ago

I understand that you have a business to run and need to prioritize, which can mean stopping development on certain projects.

And I also respect that licensing of your work is your prerogative.

However the situation is that there are now two impediments to keeping Calamari (2.x) in the OCR-D ecosystem: the copyleft license and the fact that a core library of calamari is not maintained anymore.

Since you chose GPL to make sure that developments would flow back to you, but have now decided not to keep developing tfaip and are even phasing out TF in general (so tfaip won't be used in-house for long, IIUC), what is the point in keeping it GPL? We're not planning on developing competing technology based on it, we only want to keep Calamari as an engine in the OCR-D tool stack.

bertsky commented 1 year ago

Thanks @TobiasGruening for explaining! To me, both tfaip and Calamari are superb software that have set quite an example (i.e. for TF data pipelining, for OCR respectively). We're lucky we got this much – so thanks for investing and sharing your efforts in the first place, Planet AI and @ChWick!

Going forward, I don't share @kba's skepticism. In my understanding, it would be correct to label Calamari 2.x and ocrd_calamari 2.x GPL, and then keep them included in ocrd_all nevertheless.

I also don't think the licensing deviation from Ocropy is of concern. Calamari by being GPLed cannot in any way violate Apache'd old Ocropy.

I just hope that Calamari will keep getting developed and supported. (And that tfaip itself will be maintained minimally under a fork by someone, too.)