Open notimp opened 2 weeks ago
Debug log shows "non empty source txt list"
OG comic language is Dutch (Netherlands).
Other OCR methods work well.
Python version is 3.9.13, torch version is 2.1.0+cu121, torchvision version is 0.16.0+cu121.
Any idea what might be at fault here specifically?
Any help would be appreciated.
Other outdated libs, in case you are wondering:
anthropic 0.33.0
cachetools 5.4.0
configparser 7.0.0
google-ai-generativelanguage 0.6.6
google-api-python-client 2.140.0
google-auth 2.33.0
grpcio 1.65.4
grpcio-status 1.48.2
huggingface-hub 0.24.5
idna 3.7
imageio 2.34.2
importlib_resources 6.4.0
matplotlib 3.9.1.post1
numpy 1.26.4
openai 1.40.3
pikepdf 9.1.1
pip 22.0.4
protobuf 3.20.2
pydantic_core 2.20.1
PyQt6 6.6.1
PyQt6-Qt6 6.6.3
rdflib 6.3.2
setuptools 58.1.0
shapely 2.0.5
soupsieve 2.5
tokenizers 0.19.1
torch 2.1.0+cu121
torchvision 0.16.0+cu121
traits 6.3.2
transformers 4.44.0
ultralytics 8.2.76
ultralytics-thop 2.0.0
urllib3 1.25.11
try to select the text area manually and press OCR in the right-click menu.
Same result, nothing happens. Log shows nothing.
Thanks for trying to figure this out with me. :)
Can you provide the original image? Also try uploading it on google.com via the search box (camera button). If it recognizes it, then we will think that it is not... Also, provide me with a full screenshot of the application along with the side menu of text blocks.
I am the developer of this plugin. It's my problem if something went wrong.
Sure. It's not the test image; I tried with several. And on Google Images, I get an OCRed result back.
Outdated Python something is my best guess. :)
As others don't report it broken.
edit: Source and translation boxes stay empty, btw.
Full pip list, in case it's needed (I marked the libraries you might be touching):
Can you take a screenshot of the program after pressing RUN? I'm wondering whether this is a text detector problem or an OCR problem.
Sure:
Type settings are all set to "decode by program" btw, in the BallonsTranslator settings.
In case you were wondering about the progress bars -- they all just progress... :) With a lower than usual time to finish (lower than usual for OCR done on this machine on CUDA).
Hmmmmm, try another OCR
Other OCR modules work, as stated before.
Could be a Python version conflict. Could be a Python library version conflict. Could be a language setting (for example, the .com version of the site loads and then reloads as the local version of the site, although you are supplying language parameters).
Is there any way you could provide a version of the file I could run that would help you debug more?
edit: Or the version numbers of the Python (pip) libraries you are touching, so I could try to update those?
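For comparing environments in a thread like this, a short script is often easier to read than a full pip list. The following is a minimal sketch using only the standard library (`importlib.metadata` is available from Python 3.8). The package names are an arbitrary selection from the list above, not a confirmed list of what the Google Lens module depends on, and `report_versions` is a hypothetical helper name.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Interpreter version first, since a Python version conflict is one suspect.
    print("python", sys.version.split()[0])
    # Assumption: these names are only examples taken from the pip list.
    for name, version in report_versions(["torch", "torchvision", "numpy", "urllib3"]).items():
        print(name, version or "not installed")
```

Pasting this output from both a working and a failing machine would narrow down a library version conflict more quickly than comparing two complete pip lists by eye.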
Run the program like this
launch.py --debug
ah, thank you. will do.
The question is, why don't you upgrade to Python 3.10.x? You can also keep multiple versions installed at the same time via pyenv-win.
Here is the log:
Will try a Python 3.10 install if you indicate that it's the most likely culprit here. I haven't worked with pyenv-win and might have a look at that too, but currently I only have two projects running under Python on this system, so getting both to run under 3.10.x without any form of version switching should be possible, although with a bit of trial and error on my part.
Above I mentioned pyenv-win, which lets you configure the Python version both globally and per folder. That is, 3.9 in your project folders and 3.10 in the translator folder. That is still what I think. Have you tested other OCRs? Do they give the same result?
[ERROR ] ocr_google_lens:ocr:293 - OCR error: list index out of range
I think the problem is that in python 3.9 the processing method I used does not work. I'll check now
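For context, "list index out of range" is the message of Python's `IndexError`, raised when code indexes a list past its end. That typically happens when a parser expects a fixed nested structure and the server returns something with a different shape. The sketch below is purely illustrative: the nested lists are made-up stand-ins, not the real Google Lens response format, and `dig` is a hypothetical helper, not code from the plugin.

```python
from typing import Any, Optional, Sequence


def dig(data: Any, path: Sequence[int]) -> Optional[Any]:
    """Follow a chain of non-negative list indexes; return None if the shape differs."""
    current = data
    for index in path:
        if not isinstance(current, list) or not 0 <= index < len(current):
            return None
        current = current[index]
    return current


# Made-up stand-ins: the shape the parser expects vs. a shorter, unexpected reply.
expected = [["meta"], [["line one", "line two"]]]
unexpected = [["some other page"]]

print(dig(expected, [1, 0, 1]))    # value found at the expected position
print(dig(unexpected, [1, 0, 1]))  # None instead of an exception

try:
    unexpected[1][0][1]            # plain indexing on the unexpected shape
except IndexError as error:
    print("IndexError:", error)
```

Logging the raw response whenever such a lookup returns `None` would show what the server actually sent, instead of only the bare exception message.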
I have tested mit48px using cuda and using CPU and both run fine, no errors.
Thank you for checking, and bearing with me and my issue. :)
Try replacing this file in modules and check again. It should give more information. Let's see whether my code is at fault, or whether the changes in Python are)
Practically the same log, unless I'm missing something (I replaced the file in modules/ocr)
Here is the log:
Hmmm... That didn't help us at all. Maybe we install 3.10? Alternatively, I have another option for you. I once made my own auto-installer with a built-in Python. Just in case you're interested: https://github.com/bropines/Ballon-translator-portable
Of course, it will install everything for you again, but you won't have to wonder whether something is wrong (the update checking algorithm is built into it)
I'll try to bring the projects I'm running to 3.10.x and then report back. If everything fails I'll try your installer. Thank you for your support so far. :)
Good luck
Now running Python 3.10.11; it didn't help.
I'll now purge my pip installs once again and make sure that I install the BallonsTranslator dependencies first, and then the ones of my other program. I'll report back.
I deleted all my pip installs, deleted the entire BallonsTranslator folder and downloaded it again. I fresh installed everything. I went into the BallonsTranslator config and set the translator to Google Translate, Dutch (Netherlands) > German, set OCR to Google Lens, opened the test image folder and pressed Run, and got the same result.
Empty cleaned images, no text.
I'll edit in a pip list in a few minutes.
Next wild guess: language (as in, the Google Lens site does something based on non-English language settings or your IP, even though you feed it English in the fake browser parameters), or a Windows (OS) issue?
edit: Here is the promised pip list
I don't need the pip list. What I need is a console log. Also tell me which country you are launching from. If it's an EU country, then I need to release a fix with cookies. I'll try to connect to a VPN in your country and test this theory.
Austria, next to Germany; for many services, German service conditions/rules usually apply. I'll give you another console log with your special version of the module .py copied into the folder before I go to bed for today. Will update this post with it in 3 minutes or so.
edit:
Now going to sleep on my side. n8 :)
Also, if there is any other testing I can do on my part, tell me (for example, if there is a way for me to confirm that it's IP related).
So. I'm going to bed too. But if it's not difficult, check in the morning
I will add more logging to generally understand what we are receiving in the server response.
thank you. :)
Listen, could you also check whether this script works for you? It's essentially the same thing, only in JS. If it works, then we'll think about why our data isn't being parsed.
With a US VPN and your "special version of the ocr module" I finally got something new. :)
Debug Log:
Also, please don't make me install the JS tool. I've got no experience with npm, and to get that working I would need some serious help... So I'm not using the JS tool for now.
But it's very interesting to see that with a US VPN or a JP VPN I get those new error messages (the message in the image (comic bubbles) is just the error message from the src (OCR) field repeated over and over again, but translated into German :) ), but when I disconnect from the VPN, I get blank fields again.
The VPN used was Proton VPN (free VPN), so the IP address is known to Google, just FYI if that matters.
This is already interesting. Then (I only thought of this now) could you install this library and make requests through it (that is, simply give it an image as input)? That way I can find out what the problem is at a simpler level than through BT.
And it would be even better if you wrote to me in telegram, if not difficult, so that I can help you more effectively
No to telegram, yes to the other stuff. :)
Here is the result of chrome-lens-py:
First three are with a US VPN
Last one (fourth one) without the VPN:
cmd Log:
So: the commands run while connected to a US VPN went through and returned the desired text. Only without the VPN do I get the "list index out of range" error.
I just replaced the debug Google Lens module in BallonsTranslator with the default one again, connected to a US VPN, tested whether lens_scan would give text (it did), then tried to run BallonsTranslator again with the test images, and this time the translation went through.
So here it is.
All I had to do was (maybe install Python 3.10.11 and) use a VPN, and the module works.
Thank you for your help in troubleshooting this. If you need anything else, I will monitor this issue for the next couple of days.
TLDR: People in Europe might need a US VPN for this to work. (Or use Python 3.10.x, or install https://github.com/bropines/chrome-lens-py, which also installed some dependencies I didn't have so far.)
After that, while connected to a US VPN the google lens module will work for OCR.
This is a temporary solution. Tell me your Internet service provider, and I'll see what can be done to get a normal result. I assume that Google is introducing some kind of verification in your region. I will rewrite the module to take cookies into account: obtain them through a VPN, then reuse those cookies later without the VPN so that you don't get the error.
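The idea of obtaining cookies once and reusing them in later sessions can be sketched with the standard library's `http.cookiejar`. This only illustrates the save-and-reload mechanism, not the plugin's implementation: the cookie name, value, domain, and file name are placeholders, and whether reused cookies actually get past the regional check is an untested assumption.

```python
import http.cookiejar
import os
import tempfile
import time


def make_cookie(name, value, domain):
    """Build a cookie by hand; in real use the cookies would come from server responses."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=True, expires=int(time.time()) + 3600,
        discard=False, comment=None, comment_url=None, rest={},
    )


def save_cookies(jar, path):
    """Write the jar to disk in the Mozilla cookies.txt format."""
    jar.save(path, ignore_discard=True)


def load_cookies(path):
    """Return a jar with the saved cookies, or an empty jar if no file exists yet."""
    jar = http.cookiejar.MozillaCookieJar()
    if os.path.exists(path):
        jar.load(path, ignore_discard=True)
    return jar


with tempfile.TemporaryDirectory() as tmp:
    cookie_file = os.path.join(tmp, "lens_cookies.txt")  # placeholder file name
    jar = http.cookiejar.MozillaCookieJar()
    jar.set_cookie(make_cookie("EXAMPLE_CONSENT", "accepted", ".example.com"))
    save_cookies(jar, cookie_file)   # first session, e.g. while on the VPN
    restored = load_cookies(cookie_file)  # later session, without the VPN
    print([cookie.name for cookie in restored])
```

A restored jar can be attached to requests with `urllib.request.build_opener(urllib.request.HTTPCookieProcessor(restored))`, so later sessions send the saved cookies automatically.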
I will finish my python module and embed it into the program as a library after I fix your problem. And you can even write to me on Facebook)
ISP is Magenta (== Deutsche Telekom, so the big honcho, Magenta is their ISP and Cell Provider brand.).
So is my issue being "fixed" when I use a VPN thanks to your bropines/chrome-lens-py module? (Seems unlikely. :) )
Or thanks to the VPN? :)
I also don't use Facebook (damn sure I don't... ;) ).
I also have another small request that I'd very much like you to implement. If possible, please add a text parsing option that collapses every run of whitespace (\s+: spaces, including soft line breaks, I think) into one space.
Google Lens often inserts soft line breaks into a text where you don't want any, which matters if you do another quality control pass (basically re-editing the text bubbles) in BallonsTranslator.
If there were an option to make the text of every blob one endless line without line breaks (per comic speech bubble), BallonsTranslator would do its auto formatting purely based on text box size, and you could format it further as a user during your quality control pass.
In my test edit using Google Lens OCR, taking out the line breaks that Google Lens put in there was by far the most time-consuming step.
So if you could add an option that parses the text first and replaces every run of whitespace with exactly one space (spaces, soft line breaks, and hard line breaks; in most regex flavors \s also matches \n, so a single \s+ pattern should catch them all), that would reduce the time you have to put into removing all those false line breaks that Google Lens recognized.
Still keep the separation of the different text boxes recognized, so don't merge text; just remove all line breaks within each and every text bubble in a comic.
I'll put this in as a separate request (issue) if you don't mind. But before I do, I thought I'd give you the chance to react quickly. If you can't or don't want to do that for reasons that aren't immediately clear, say so and I won't create another ticket. :)
Make it an optional setting though.
By default Google Lens tries to format text using line breaks so it fits the shape of the original text on paper, I think. Some people still might like that. But if you do a quality control pass on your comic, removing all the line breaks that Google has put into the text is by far the most time-consuming step (and you do it over an entire comic, with practically every text bubble).
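The requested behaviour can be sketched in a few lines of Python. This is a minimal illustration, not BallonsTranslator's actual code: `collapse_whitespace`, `normalize_bubbles`, and the `enabled` flag are hypothetical names. In Python's `re` module, `\s` matches spaces, tabs, `\r`, and `\n`, so one `\s+` pattern covers both soft and hard line breaks.

```python
import re

_WHITESPACE_RUN = re.compile(r"\s+")


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace (including newlines) with one space."""
    return _WHITESPACE_RUN.sub(" ", text).strip()


def normalize_bubbles(bubbles, enabled=True):
    """Clean each bubble separately so text boxes are never merged.

    `enabled` stands in for the optional setting asked for above.
    """
    if not enabled:
        return list(bubbles)
    return [collapse_whitespace(text) for text in bubbles]


bubbles = [
    "A man and\nhis dog go\nhome and sit\non a couch.\n",
    "Second  bubble\r\nstays separate.",
]
print(normalize_bubbles(bubbles))
# ['A man and his dog go home and sit on a couch.', 'Second bubble stays separate.']
```

Because each bubble is processed on its own, the separation between recognized text boxes is preserved, and with `enabled=False` the original line breaks are left as they were.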
Thank you for considering it.
Now back to topic.
(I don't want to derail this issue thread in its final hours... :) )
Cheers, notimp
edit: Also, if adding those line breaks is something BallonsTranslator does while resizing a text box, please ignore my request -- I'm not entirely familiar with the program yet, as I just started using it. (After I got Google Lens OCR running I did my first manual editing pass on a full comic, so I'm still learning.)
VPN solved your problem.
About line breaks: there's a newline handling setting in the Google OCR module. Set it to remove, and even if I messed up somewhere, it will remove all the hyphenation in the text. Try it.
Thank you, I will. Also thank you for the "VPN did it" answer. :)
My problem isn't with hyphenated text though, but with text formatted like this:
A man and
his dog go
home and sit
on a couch.
Text broken into four short lines like that is something I get more than a few times in a comic.
And it's not just that, it's also:
A man and\linebreak (aka newline) his dog go\newline home and sit\newline on a couch.\newline
That becomes a problem if you resize text boxes during quality control (for example, make one small and thin) and the line breaks above are still honored, so the text can't reflow freely. I'm unsure whether that's a source formatting issue (Google Lens spits it out that way; it seems so, as I see some method in the madness: it tries to place text roughly where it was on the original scan), or whether it's how BallonsTranslator handles reflow after resizing text boxes (BallonsTranslator itself puts those line breaks in when you resize text boxes). I haven't pinned down the issue yet. :)
But I'll look into the feature you mentioned tomorrow. :)
It would be better to record a GIF or video showing how it works for you. I'll try to help.
Off topic:
There's a handling newline setting in google ocr. Turn it on to remove
That didn't fix the issue. I'll open a new issue report on it (with images as illustration) and will link it here afterwards.
edit: Here is the newly created issue ticket with illustrations: https://github.com/dmMaze/BallonsTranslator/issues/558
On second thought, it's probably not Google Lens but Google Translate or BallonsTranslator itself that's creating those newline characters. If you know which of those it is, that alone would be helpful feedback.
edit: Off topic issue fixed in the linked issue thread. :)
~https://github.com/bropines/chrome-lens-py/releases/tag/v1.0.5~
~In the latest version, I added proxy support. And I tried to fix the cookie. I don't really want to break the code that is currently written. I think DmMaze will just transfer everything to the library as soon as it finishes. In fact, you can buy any proxy server from non-EU countries and everything will work. Someday I will fix this bug completely, but I don't know when yet.~
I've done) f253d2104f5bf3f03d406f77af73522f161563f2 @notimp
Noted, thank you.
Installing https://github.com/bropines/chrome-lens-py via pip now fails, apparently because of a malformed script.
But you can still install it if you'd like to by using
pip install chrome-lens-py==1.0.2
I still use
lens_scan <drag the image in> all
as a check of whether the server I get connected to is good to use for Google Lens. :)
You were just installing the package without updating it. I will cover this point in the readme. Here is the correct command:
pip install -U chrome-lens-py
@notimp is your problem solved for now?