cnumr / ecoindex_scrap_python

The Ecoindex_scraper module provides a way to scrape data from a given website while simulating a real web browser

Add an option to be compatible with a specific Chrome/Chromium version #55

Closed geoffreyarthaud closed 1 year ago

geoffreyarthaud commented 1 year ago

ecoindex_scrap_python relies on undetected-chromedriver to drive the browser, and a ChromeDriver build is specific to one major version of Chrome/Chromium.

undetected-chromedriver has an option, version_main=xx, that sets this major version so that the matching driver is downloaded and patched.

It would be great if ecoindex_scrap allowed setting this version. With a mismatched browser, I currently get this error:

cannot connect to chrome at 127.0.0.1:36697
from session not created: This version of ChromeDriver only supports Chrome version 108
Current browser version is 107.0.5304.121

I'm using Chromium from Flatpak, whose version is not exactly in sync with the latest version of Chrome.

Many thanks for this project! :+1:

vvatelot commented 1 year ago

🙏🏻 Praise: Thank you for your contribution @geoffreyarthaud! I greatly appreciate it. This request is really interesting and could help me with a problem I have in production. I will try to implement this quickly. Do you have documentation or a pointer on how to implement it?

geoffreyarthaud commented 1 year ago

From the undetected-chromedriver README, you can specify the version like this:

import undetected_chromedriver as uc
options = uc.ChromeOptions()
# use a specific (older) version
driver = uc.Chrome(options=options, version_main=94)  # version_main pins the Chrome major version instead of following the latest Chrome release

I see that the driver is initialized in scrap.py.
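
To illustrate the idea, here is a minimal sketch of how an optional major version could be forwarded to the driver. It is not the actual code of scrap.py: the names build_driver_kwargs, create_driver and chrome_version_main are hypothetical, and only uc.Chrome, uc.ChromeOptions and version_main come from undetected-chromedriver.

```python
from typing import Any, Dict, Optional


def build_driver_kwargs(chrome_version_main: Optional[int] = None) -> Dict[str, Any]:
    """Build the extra keyword arguments for uc.Chrome (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if chrome_version_main is not None:
        if chrome_version_main <= 0:
            raise ValueError("chrome_version_main must be a positive integer")
        # version_main is the undetected-chromedriver option quoted above
        kwargs["version_main"] = chrome_version_main
    return kwargs


def create_driver(chrome_version_main: Optional[int] = None):
    """Create the driver, pinning the major version only when one is given."""
    # Imported lazily so build_driver_kwargs works without the dependency installed
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    return uc.Chrome(options=options, **build_driver_kwargs(chrome_version_main))
```

Leaving the option at None keeps the current behaviour, so existing callers would not be affected; create_driver(107) would request the driver matching Chromium 107.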

To get the Chrome version, one needs to parse the output of this command:

> chrome --version
Chromium 107.0.5304.121 

to get the major version, 107 here.
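
A possible sketch of that parsing step, using only the standard library. The function names are hypothetical, and the binary name "chrome" is an assumption: with Flatpak or other packagings, the command that prints the version may well be different.

```python
import re
import subprocess
from typing import Optional

# Matches a four-part version such as 107.0.5304.121 and captures the major part
_VERSION_RE = re.compile(r"(\d+)\.\d+\.\d+\.\d+")


def parse_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from the output of `chrome --version`."""
    match = _VERSION_RE.search(version_output)
    return int(match.group(1)) if match else None


def detect_chrome_major(binary: str = "chrome") -> Optional[int]:
    """Run `<binary> --version` and return the major version, or None on failure."""
    try:
        result = subprocess.run(
            [binary, "--version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Binary missing, non-zero exit status, or timeout
        return None
    return parse_major_version(result.stdout)
```

With the output shown above, parse_major_version("Chromium 107.0.5304.121 ") gives 107, which is the value to pass as version_main.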

I guess I could make a PR soon, if you accept contributions.

vvatelot commented 1 year ago

@geoffreyarthaud I totally accept contributions! If you have time to do it, be my guest! 🙂

vvatelot commented 1 year ago

Hello @geoffreyarthaud, your idea was so good, and I need to move fast (I see erratic behaviour with the ecoindex API in production because of this!), so I made a PR. "On my laptop" it works... Does it work on yours?

geoffreyarthaud commented 1 year ago

Hello @vvatelot!

I think it should be OK. I use ecoindex_cli directly for now. I think a future version of ecoindex_cli could automatically detect the Chrome version and use this new option.

Good job! Thanks!