Closed Krylanc3lo closed 4 years ago
Hi!
I'd like to suggest an idea for this problem. I can see that the author of JapScan periodically changes the way the images are scrambled, and has also added fake images to make everything more complex.
So maybe it would be easier to keep using Selenium as now, but instead of recovering the source image and trying to decrypt it, take a screenshot of the scan page and crop it to keep only the scan. With this method there is not necessarily a loss of quality, because the images are mostly full-width on the page, and there is no longer any need to keep up with the encryption methods or other weird tricks used to prevent scraping.
I found a code implementation that takes a screenshot of a single element of a page (on JapScan that would be the #image div, for example): https://stackoverflow.com/a/15870708/6404595
It remains to be seen whether there is any loss of quality or degradation of performance.
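To make the screenshot-and-crop idea concrete, here is a minimal sketch in the spirit of the linked answer. It is not the project's actual code: the function names `crop_box` and `screenshot_element` are hypothetical, Pillow is assumed for the cropping, and the device-pixel-ratio scaling is my own assumption about high-DPI displays.

```python
from io import BytesIO


def crop_box(location, size, scale=1.0):
    """Return the (left, top, right, bottom) pixel box of an element.

    location and size are dicts shaped like Selenium's element.location
    and element.size; scale converts CSS pixels to screenshot pixels.
    """
    left = int(location["x"] * scale)
    top = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    bottom = int((location["y"] + size["height"]) * scale)
    return (left, top, right, bottom)


def screenshot_element(driver, css_selector, out_path):
    """Screenshot the page and keep only the element (hypothetical helper)."""
    # Imported here so crop_box stays usable without Selenium or Pillow.
    from PIL import Image
    from selenium.webdriver.common.by import By

    element = driver.find_element(By.CSS_SELECTOR, css_selector)
    png = driver.get_screenshot_as_png()
    # Assumption: screenshot pixels may differ from CSS pixels on high-DPI screens.
    scale = driver.execute_script("return window.devicePixelRatio") or 1.0
    image = Image.open(BytesIO(png))
    image.crop(crop_box(element.location, element.size, scale)).save(out_path)
```

Usage would look like `screenshot_element(driver, "#image", "page.png")`. Note that the screenshot usually covers only the visible viewport, so the element has to fit in the window (and the page must not be scrolled) for the crop to line up.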
Hello guys, thanks for your help and sorry for the delay.
I'm starting to work on screenshots and for now the results are encouraging. I just need to fix some details like resolution, etc.
@Harkame Nice! I'm glad that my idea gives good results! If you need help, let me know. Even if I'm not very good at Python, I'll be happy to help.
I have committed the change; I think it's working.
The quality of the screenshot is not very good for now. I don't know if it works for everyone, but you can change the Selenium browser resolution in the file jss_selenium.py by replacing line 84, options.add_argument("window-size=1080,1920"), with line 85, options.add_argument("window-size=1440,2560"). In the future I will add a parameter to manage that.
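As a rough idea of what such a parameter could look like, here is a small sketch using argparse. The `--resolution` flag and the helper names are hypothetical, not something that exists in the project; only the window-size=WIDTH,HEIGHT argument format comes from the lines quoted above.

```python
import argparse


def parse_resolution(text):
    """Parse 'WIDTHxHEIGHT' (for example '1440x2560') into a pair of ints."""
    width, _, height = text.lower().partition("x")
    if not (width.isdigit() and height.isdigit()):
        raise argparse.ArgumentTypeError(f"expected WIDTHxHEIGHT, got {text!r}")
    return int(width), int(height)


def window_size_argument(resolution):
    """Build the browser argument in the same form as jss_selenium.py uses."""
    width, height = resolution
    return f"window-size={width},{height}"


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag; the default mirrors the larger of the two sizes above.
    parser.add_argument("--resolution", type=parse_resolution, default=(1440, 2560))
    return parser
```

The result would then be passed along as options.add_argument(window_size_argument(args.resolution)) instead of a hard-coded string.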
Thanks a lot @Harkame, as usual! I will try tonight. Thank you as well, @Gregory-Gerard, for the idea! It is indeed great.
If I can help in any way, just let me know :)
I just checked and it is working very well, thanks a lot to both of you :)
Also, out of curiosity, if you don't mind: it seems you found a way to launch Selenium without opening Chrome? I have been trying to do that for a very long time. Would you mind sharing the approach?
Thanks again, I hope I will be able to help with something one day :)
You're welcome!
For Chrome, it's the "headless" option:
options.add_argument("--headless")
...
self.driver = webdriver.Chrome(
    self.driver_path, options=options, desired_capabilities=caps
)
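For anyone who wants to try the headless option outside the project, here is a minimal self-contained sketch. The helper names `chrome_arguments` and `make_driver` are hypothetical, and it assumes a recent Selenium 4 release that can locate a Chrome driver on its own, so it does not pass a driver path or desired_capabilities as the snippet above does.

```python
def chrome_arguments(headless=True, window_size=(1440, 2560)):
    """Collect the Chrome command-line arguments discussed in this thread."""
    arguments = []
    if headless:
        # Run Chrome without opening a visible window.
        arguments.append("--headless")
    width, height = window_size
    arguments.append(f"window-size={width},{height}")
    return arguments


def make_driver(headless=True):
    """Create a Chrome driver (assumes Selenium 4 and a resolvable driver)."""
    # Imported here so chrome_arguments can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(headless=headless):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Calling make_driver(headless=False) should open a normal window again, which can be handy when debugging what the screenshot actually captures.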
Hello,
I hope you are doing well. It seems that the images are scrambled again.
Do you have any idea?
Thanks again for your help