Closed edouard-lopez closed 1 year ago
The package isn't listed in the project dependencies in pyproject.toml. You need to install it, BUT since version 3.4.0 the v2 has been removed:
3.4.0
Big update! be careful as it -potentially- could break your code. … cleanup removed compat,v2 files and tests folder
You can install 3.2.0
❯ poetry add undetected_chromedriver@3.2.0
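To see which side of that change an environment is on, the installed version can be compared against 3.4.0. This is only a rough sketch: the helper names `predates_v2_removal` and `installed_version` are made up for illustration, and it assumes plain numeric `X.Y.Z` version strings.

```python
from importlib import metadata


def predates_v2_removal(version: str) -> bool:
    """Return True if the given version is older than 3.4.0.

    Assumption: a plain numeric "X.Y.Z" string (no pre-release tags).
    3.4.0 is the release whose changelog mentions removing the v2 files.
    """
    parts = tuple(int(p) for p in version.split("."))
    return parts < (3, 4, 0)


def installed_version(package: str = "undetected-chromedriver"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("undetected-chromedriver is not installed")
    else:
        print(version, "-> predates v2 removal:", predates_v2_removal(version))
```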
Hello @edouard-lopez, thanks for your contribution. undetected-chromedriver is a dependency of the ecoindex_scrap_python package.
In the poetry.lock file it is frozen to version 3.2.1.
Did you run a poetry update? That could explain the upgrade to version 3.4. I am waiting for undetected-chromedriver to fix a bug in version 3.4 before upgrading.
It's pinned in the scraper, but not in the CLI:
https://github.com/cnumr/ecoindex_cli/blob/f213a95113fa631b6e8fbac3d680f8081c638c61/poetry.lock#L216
undetected-chromedriver
No, this means that ecoindex_scraper requires version ^3.1.6 (this is consistent with the ecoindex_scrap configuration), but the version actually installed is 3.1.7 (confirmed here).
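For context, a Poetry caret constraint such as ^3.1.6 allows any version >=3.1.6 and <4.0.0, so both 3.1.7 and a later 3.4.x release satisfy the same requirement; only the lock file narrows it down to one version. A minimal sketch of that rule, assuming plain numeric versions (the function names are illustrative and not Poetry's API):

```python
def parse(version: str) -> tuple:
    """Turn "3.1.6" into (3, 1, 6). Assumes numeric components only."""
    return tuple(int(p) for p in version.split("."))


def caret_upper_bound(base: tuple) -> tuple:
    """Exclusive upper bound for ^base: bump the leftmost non-zero component."""
    for i, part in enumerate(base):
        if part != 0:
            return base[:i] + (part + 1,) + (0,) * (len(base) - i - 1)
    # All components are zero (e.g. ^0.0.0): bump the last one.
    return base[:-1] + (base[-1] + 1,)


def satisfies_caret(version: str, constraint: str) -> bool:
    """Check whether `version` falls inside the range described by `^X.Y.Z`."""
    base = parse(constraint.lstrip("^"))
    return base <= parse(version) < caret_upper_bound(base)


if __name__ == "__main__":
    for candidate in ("3.1.5", "3.1.7", "3.2.1", "3.4.5", "4.0.0"):
        print(candidate, satisfies_caret(candidate, "^3.1.6"))
```

This is why a breaking change shipped in a minor release (3.2 to 3.4) can slip in through a caret constraint when the lock file is regenerated.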
Can you try running (with a fresh install)
poetry install && poetry show
and copy-paste the result here?
From the git clone I already have, I got
❯ poetry install && poetry show
Updating dependencies
Resolving dependencies... (12.1s)
Writing lock file
Package operations: 0 installs, 1 update, 0 removals
• Updating undetected-chromedriver (3.4.4 -> 3.4.5)
Installing the current project: ecoindex-cli (v2.15.2)
async-generator 1.10 Async generators and context managers for Python 3.5+
attrs 22.2.0 Classes Without Boilerplate
automat 22.10.0 Self-service finite-state machines for the programmer on the go.
black 22.12.0 The uncompromising code formatter.
certifi 2022.12.7 Python package for providing Mozilla's CA Bundle.
cffi 1.15.1 Foreign Function Interface for Python calling C code.
charset-normalizer 3.0.1 The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
click 8.1.3 Composable command line interface toolkit
click-spinner 0.1.10 Spinner for Click
constantly 15.1.0 Symbolic constants in Python
contourpy 1.0.7 Python library for calculating contours of 2D quadrilateral grids
coverage 7.1.0 Code coverage measurement for Python
cryptography 39.0.1 cryptography is a package which provides cryptographic recipes and primitives to Python developers.
cssselect 1.2.0 cssselect parses CSS3 Selectors and translates them to XPath 1.0
cycler 0.11.0 Composable style cycles
ecoindex 5.4.1 Ecoindex module provides a simple way to measure the Ecoindex score based on the 3 parameters: The DOM elements of the page, the size of the page and the number of external...
ecoindex-scraper 2.13.1 Ecoindex_scraper module provides a way to scrape data from given website while simulating a real web browser
exceptiongroup 1.1.0 Backport of PEP 654 (exception groups)
filelock 3.9.0 A platform independent file lock.
fonttools 4.38.0 Tools to manipulate font files
h11 0.14.0 A pure-Python, bring-your-own-I/O implementation of HTTP/1.1
hyperlink 21.0.0 A featureful, immutable, and correct URL for Python.
idna 3.4 Internationalized Domain Names in Applications (IDNA)
incremental 22.10.0 "A small library that versions your Python projects."
iniconfig 2.0.0 brain-dead simple config-ini parsing
itemadapter 0.7.0 Common interface for data container classes
itemloaders 1.0.6 Base library for scrapy's ItemLoader
jinja2 3.1.2 A very fast and expressive template engine.
jmespath 1.0.1 JSON Matching Expressions
kiwisolver 1.4.4 A fast implementation of the Cassowary constraint solver
lxml 4.9.2 Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
markupsafe 2.1.2 Safely add untrusted strings to HTML/XML markup.
matplotlib 3.6.3 Python plotting package
mypy-extensions 1.0.0 Type system extensions for programs checked with the mypy type checker.
numpy 1.24.2 Fundamental package for array computing in Python
outcome 1.2.0 Capture the outcome of Python function calls.
packaging 23.0 Core utilities for Python packages
pandas 1.5.3 Powerful data structures for data analysis, time series, and statistics
parsel 1.7.0 Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
pathspec 0.11.0 Utility library for gitignore style pattern matching of file paths.
pillow 9.4.0 Python Imaging Library (Fork)
platformdirs 3.0.0 A small Python package for determining appropriate platform-specific dirs, e.g. a "user data dir".
pluggy 1.0.0 plugin and hook calling mechanisms for python
protego 0.2.1 Pure-Python robots.txt parser with support for modern conventions
pyasn1 0.4.8 ASN.1 types and codecs
pyasn1-modules 0.2.8 A collection of ASN.1-based protocols modules.
pycparser 2.21 C parser in Python
pydantic 1.10.4 Data validation and settings management using python type hints
pydispatcher 2.0.6 Multi-Producer Multi-Consumer Observer Pattern for Python
pyopenssl 23.0.0 Python wrapper module around the OpenSSL library
pyparsing 3.0.9 pyparsing module - Classes and methods to define and execute parsing grammars
pysocks 1.7.1 A Python SOCKS client module. See https://github.com/Anorov/PySocks for more information.
pytest 7.2.1 pytest: simple powerful testing with Python
pytest-cov 4.0.0 Pytest plugin for measuring coverage.
python-dateutil 2.8.2 Extensions to the standard Python datetime module
pytz 2022.7.1 World timezone definitions, modern and historical
pyyaml 6.0 YAML parser and emitter for Python
queuelib 1.6.2 Collection of persistent (disk-based) and non-persistent (memory-based) queues
requests 2.28.2 Python HTTP for Humans.
requests-file 1.5.1 File transport adapter for Requests
scrapy 2.8.0 A high-level Web Crawling and Web Scraping framework
selenium 4.8.0
service-identity 21.1.0 Service identity verification for pyOpenSSL & cryptography.
setuptools-scm 7.1.0 the blessed package to manage your versions by scm tags
six 1.16.0 Python 2 and 3 compatibility utilities
sniffio 1.3.0 Sniff out which async library your code is running under
sortedcontainers 2.4.0 Sorted Containers -- Sorted List, Sorted Dict, Sorted Set
tldextract 3.4.0 Accurately separates a URL's subdomain, domain, and public suffix, using the Public Suffix List (PSL). By default, this includes the public ICANN TLDs and their exceptions....
tomli 2.0.1 A lil' TOML parser
trio 0.22.0 A friendly Python library for async concurrency and I/O
trio-websocket 0.9.2 WebSocket library for Trio
twisted 22.10.0 An asynchronous networking framework written in Python
typer 0.7.0 Typer, build great CLIs. Easy to code. Based on Python type hints.
typing-extensions 4.4.0 Backported and Experimental Type Hints for Python 3.7+
undetected-chromedriver 3.4.5 ('Selenium.webdriver.Chrome replacement with compatiblity for Brave, and other Chromium based browsers.', 'Not triggered by CloudFlare/Imperva/hCaptcha and such.', 'NOTE: r...
urllib3 1.26.14 HTTP library with thread-safe connection pooling, file post, and more.
w3lib 2.1.1 Library of web-related functions
websockets 10.4 An implementation of the WebSocket Protocol (RFC 6455 & 7692)
wsproto 1.2.0 WebSockets state-machine based protocol implementation
zope.interface 5.5.2 Interfaces for Python
From a new git clone I get a different issue:
❯ git clone git@github.com:cnumr/ecoindex_cli.git ecoindex_cli2
Cloning into 'ecoindex_cli2'...
remote: Enumerating objects: 835, done.
remote: Counting objects: 100% (515/515), done.
remote: Compressing objects: 100% (242/242), done.
remote: Total 835 (delta 354), reused 388 (delta 269), pack-reused 320
Receiving objects: 100% (835/835), 1.00 MiB | 346.00 KiB/s, done.
Resolving deltas: 100% (516/516), done.
~/projects/explorations
❯ cd ecoindex_cli2/
~/projects/explorations/ecoindex_cli2 main
❯ poetry install && poetry show > show.txt
Creating virtualenv ecoindex-cli-jdRZEe1_-py3.10 in /home/edouard/.cache/pypoetry/virtualenvs
Installing dependencies from lock file
SolverProblemError
Because no versions of scrapy match >=2.5.0,<2.7.1 || >2.7.1,<3.0.0
and scrapy (2.7.1) depends on zope.interface (>=5.1.0), scrapy (>=2.5.0,<3.0.0) requires zope.interface (>=5.1.0).
So, because no versions of zope.interface match >=5.1.0
and ecoindex-cli depends on Scrapy (^2.5.0), version solving failed.
at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/puzzle/solver.py:241 in _solve
237│ packages = result.packages
238│ except OverrideNeeded as e:
239│ return self.solve_in_compatibility_mode(e.overrides, use_latest=use_latest)
240│ except SolveFailure as e:
→ 241│ raise SolverProblemError(e)
242│
243│ results = dict(
244│ depth_first_search(
245│ PackageNode(self._package, packages), aggregate_package_nodes
~/projects/explorations/ecoindex_cli2 main
Oops, I wasn't in a virtual env, but I got the same issue:
❯ poetry shell
Spawning shell within /home/edouard/.cache/pypoetry/virtualenvs/ecoindex-cli-jdRZEe1_-py3.10
source /home/edouard/.cache/pypoetry/virtualenvs/ecoindex-cli-jdRZEe1_-py3.10/bin/activate.fish
greeting hyouman!
~/projects/explorations/ecoindex_cli2 main
❯ source /home/edouard/.cache/pypoetry/virtualenvs/ecoindex-cli-jdRZEe1_-py3.10/bin/activate.fish
~/projects/explorations/ecoindex_cli2 main
ecoindex-cli-jdRZEe1_-py3.10 ❯ poetry install && poetry show > show.txt
Installing dependencies from lock file
SolverProblemError
Because no versions of scrapy match >=2.5.0,<2.7.1 || >2.7.1,<3.0.0
and scrapy (2.7.1) depends on zope.interface (>=5.1.0), scrapy (>=2.5.0,<3.0.0) requires zope.interface (>=5.1.0).
So, because no versions of zope.interface match >=5.1.0
and ecoindex-cli depends on Scrapy (^2.5.0), version solving failed.
at ~/.local/share/pypoetry/venv/lib/python3.10/site-packages/poetry/puzzle/solver.py:241 in _solve
237│ packages = result.packages
238│ except OverrideNeeded as e:
239│ return self.solve_in_compatibility_mode(e.overrides, use_latest=use_latest)
240│ except SolveFailure as e:
→ 241│ raise SolverProblemError(e)
242│
243│ results = dict(
244│ depth_first_search(
245│ PackageNode(self._package, packages), aggregate_package_nodes
Python version is:
ecoindex-cli-jdRZEe1_-py3.10 ❯ python --version
Python 3.10.7
OK, I can reproduce the issue. Never mind, I am focusing on upgrading https://github.com/cnumr/ecoindex_scrap_python/ to undetected_chromedriver version 3.4.5.
This upgrade (3.2 -> 3.4) should have been a major release as this is a breaking change. I do not agree with their version and release management but I have to deal with it :disappointed:
Now, 3.4.5 introduces a new behavior: it first opens the chrome://welcome page, and I am trying to bypass it...
closed by #227
Hello @edouard-lopez, this issue should be closed by the latest version, in which I upgraded undetected-chromedriver to 3.4.5 (and froze it!)
https://pypi.org/project/ecoindex-cli/2.16.1/
Can you confirm this is OK?
The undetected-chromedriver issue is fixed!
❯ pip uninstall undetected_chromedriver
❯ pip uninstall ecoindex-cli
❯ pip install --user -U ecoindex-cli
Checking versions
ecoindex
❯ pip freeze | grep ecoindex
ecoindex==5.4.1
ecoindex-cli==2.16.1
ecoindex-scraper==2.14.0
~/projects/explorations/ecoindex_cli main
undetected-chromedriver
❯ pip freeze | grep und
command-not-found==0.3
undetected-chromedriver==3.4.5
❯ ecoindex-cli analyze --url http://manomano.fr/
📁️ Urls recorded in file `/tmp/ecoindex-cli/input/manomano.fr.csv`
There are 1 url(s), do you want to process? [Y/n]:
1 urls for 1 window size with 8 maximum workers
Processing [####################################] 1/1 100%
Errors found: please look at /tmp/ecoindex-cli/logs/manomano.fr.log)
…
IndexError: list index out of range
Fixed the IndexError: list index out of range issue by passing --chrome-version, as mentioned in #215:
❯ ecoindex-cli analyze \
--url http://manomano.fr/ \
--chrome-version (google-chrome --version | grep --only -P '(?<=\\s)\\d{3}')
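The command substitution above pulls the major version out of the `google-chrome --version` output with a lookbehind regex. The same extraction could be sketched in Python as follows; the function name is hypothetical, and the sample string is only an assumed example of what the browser prints. Unlike the `\d{3}` pattern, this variant also accepts two-digit major versions.

```python
import re


def chrome_major_version(version_output: str):
    """Return the major version as an int, or None if nothing matches.

    Looks for a run of digits that follows whitespace and precedes a dot,
    e.g. the "110" in "Google Chrome 110.0.5481.77".
    """
    match = re.search(r"(?<=\s)\d+(?=\.)", version_output)
    return int(match.group()) if match else None


if __name__ == "__main__":
    # Assumed sample; the real string would come from `google-chrome --version`.
    sample = "Google Chrome 110.0.5481.77 "
    print(chrome_major_version(sample))
```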
Thanks for your feedback. I will try to improve the logger...
What happened?
Can't run the CLI due to missing undetected_chromedriver dependencies
Version
Above 3.6
What OS do you use?
Linux
urls
No response
Relevant log output
No response