JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0
23k stars 3.03k forks

Improving the usability of EasyOCR #829

Open JulianOrteil opened 1 year ago

JulianOrteil commented 1 year ago

Hi @rkcosmos

Let me begin by applauding you and your team for building out such a versatile and excellent library. It is definitely worthy of the praise I see online and of my own praise I give this library as a user. That being said, I feel like there are multiple shortcomings that, if addressed, would push this library to new potential.

The following are things I would like to see addressed. I will be pointing out areas I believe to be deficient in a library of this caliber, but it is not my intention to demean or disregard the efforts made by the team and the community thus far. These are simply my opinions and any of them are open to being discussed.

This list is non-exhaustive. There are plenty of other things I think are deficient; however, they are not worth mentioning at this time. Others may also have their own reservations and are free to add to the comments on this issue. But, for right now, I'd like to make a proposal.

I am willing to donate much of my time to making this library live up to what I believe could be its maximum potential. I am willing to address every point I've put in this list plus help address what other users may put in the comments if the suggestions are worthwhile. That being said, I need some things from you.

  1. Serious commitment. The severe lack of descriptions of functionality in this library (comments) will absolutely affect my ability to understand what is going on. I will need help understanding certain areas of your code to improve it and describe it.
  2. Continuous feedback on my approaches to certain things. For example, if I am grouping modules in a directory in a way you don't agree with, let me know.
  3. Documenting a library like this is going to be painful and I don't have nearly the skill level in AI that you do. Your input in documenting methods and classes will be required in some instances.
  4. And finally, time. This will be a massive project that I will only be able to work on when I have the time. This may take a couple of months to properly build.
  5. If you agree to my undertaking this project, my changes will be breaking. It may be best to reserve v2.x.x for this, but that is a detail that can be left for a later date.

If you'd like to see an example of what I am proposing, I built a library called python-step-series for a motor controller family as well as ported all of the documentation from a separate website onto ReadTheDocs.

python-step-series GitHub | python-step-series documentation (note: there is also a Japanese translation of the documentation; choose 'ja' in the lower-left menu)

Please let me know your thoughts or concerns. Jules

ystoll commented 1 year ago

Hi @JulianOrteil

I totally agree with you on the several aspects that you mentioned in your issue (see #823). I have started to work on this on my side to make the code more readable (systematically converting variable names to snake_case, introducing docstring templates). I think it would be great if we could find a way to team up to improve this code, which is very useful in many respects. More precisely, in order of priority, I would like to:

  1. Finish making the code PEP-compliant.
  2. Write exhaustive docstrings for all functions and classes.
  3. Write a proper pytest-based test suite.
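To make the last two points concrete, here is a minimal sketch of a documented, snake_case helper with a matching pytest-style test. The function `normalize_box` and its behaviour are hypothetical and chosen only for illustration; it is not an existing EasyOCR function.

```python
def normalize_box(box):
    """Return an axis-aligned bounding box as ``(x_min, y_min, x_max, y_max)``.

    NOTE: hypothetical helper for illustration, not part of EasyOCR.

    Parameters
    ----------
    box : sequence of (x, y) pairs
        Corner points of a detected text region, in any order.

    Returns
    -------
    tuple of float
        The smallest axis-aligned rectangle containing all corner points.
    """
    xs = [float(x) for x, _ in box]
    ys = [float(y) for _, y in box]
    return (min(xs), min(ys), max(xs), max(ys))


def test_normalize_box_handles_unordered_corners():
    # pytest collects any function whose name starts with "test_".
    corners = [(10, 5), (0, 5), (0, 20), (10, 20)]
    assert normalize_box(corners) == (0.0, 5.0, 10.0, 20.0)
```

Saved in a file named like `test_boxes.py`, the test would be picked up by running `pytest` in the same directory.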

I don't have a pure developer background, so on certain aspects of software engineering you might be more skilled than I am. On the other hand, I might be more at ease on the AI side than you are, so it seems that our profiles are somewhat complementary.

Let me know if you are interested in collaborating with me.

Yannick

JulianOrteil commented 1 year ago

@ystoll Any help would be greatly appreciated. Again, this project is going to be quite extensive and exhaustive, so thank you for the offer. Before we make any progress, I'd like to wait a couple of days for the community and @rkcosmos to provide their input on this issue.

If you have any suggestions beyond what I provided (since they appear to also largely encompass what you have in #823), then please feel free to put them here so we can discuss them.

The plan I've been thinking of more or less requires a clean slate. In other words, it would be much easier to bring the library up to speed by starting from scratch than by trying to incorporate our changes into the existing code base. That does not mean we have to rewrite the library's logic, but it does encompass redesigning how the library is structured and/or improving specific areas if we so choose.

Let me know your thoughts. Jules

ystoll commented 1 year ago

Hello @JulianOrteil,

I agree with you, it would be really nice to get feedback from @rkcosmos concerning the changes we intend to make to the library.

One thing, though: starting from a clean slate might ultimately lead us to produce a more robust library, but the main developers may see it as a little "scary", which could result in our PR being refused.

On the other hand, we have discussed this in my team and indeed, for instance, the class hierarchy could be greatly improved... Say, for instance, we could use an ABC to clarify the hierarchy between the different kinds of converters, just a thought... Very roughly, when I want to get inspired as to what can be considered good, pythonic code, I tend to look at the sklearn repo...
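A rough sketch of the ABC idea, using Python's standard `abc` module. The names `BaseLabelConverter` and `GreedyCTCConverter` and their methods are hypothetical and are not EasyOCR's actual classes; the point is only that shared logic lives in the base class while each converter must implement its own `decode`.

```python
from abc import ABC, abstractmethod


class BaseLabelConverter(ABC):
    """Hypothetical common interface for text/label converters."""

    def __init__(self, characters):
        # Index 0 is reserved for a blank symbol, so characters start at 1.
        self.characters = list(characters)
        self.char_to_index = {c: i + 1 for i, c in enumerate(self.characters)}

    def encode(self, text):
        """Map a string to a list of integer labels (shared by all subclasses)."""
        return [self.char_to_index[c] for c in text]

    @abstractmethod
    def decode(self, indices):
        """Map a sequence of integer labels back to a string."""


class GreedyCTCConverter(BaseLabelConverter):
    """Illustrative subclass: collapse repeated labels, then drop blanks."""

    def decode(self, indices):
        chars = []
        previous = 0
        for index in indices:
            # Keep a label only if it is not blank and differs from the last one.
            if index != 0 and index != previous:
                chars.append(self.characters[index - 1])
            previous = index
        return "".join(chars)


converter = GreedyCTCConverter("abc")
labels = converter.encode("cab")
text = converter.decode([1, 1, 0, 1, 2, 2, 0, 3])
```

Because `decode` is abstract, instantiating `BaseLabelConverter` directly raises `TypeError`, which makes the intended hierarchy explicit to anyone adding a new converter.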

We will keep in touch, don't hesitate to let me know if you start to work on something, and I hope that we will get some news from @rkcosmos.

Cheers

Yannick

JulianOrteil commented 1 year ago

@ystoll

Great point. Hopefully, with our work being open source, @rkcosmos will have ample opportunity to provide feedback or suggestions for any concerns he may have (or directly contribute if he so wishes).

To help mitigate the above, why don't we discuss how the library should be structured and our roles before we start writing code? This will provide the community with further opportunity to join in if anybody so wishes and give us a game plan before we start. I can start a Discussion in this repo to keep everything "centralized" and easy to access since this is better continued there.

I'll post a link to the discussion shortly.

Warm Regards, Jules

JulianOrteil commented 1 year ago

@ystoll I have opened #839 to pool our ideas and discussions into one place that is easy to access and open for anyone to view and comment on. Please use this conversation board going forward. I will keep this ticket open so anybody looking into this can easily see our progress.

My Best, Jules