gameuser1982 closed this issue 6 months ago
Update: It's my own damn fault. I installed botasaurus into a virtual environment I had previously installed Selenium into, stupidly thinking they could coexist without conflict. Wrong, wrong, wrong.
Solution: I uninstalled botasaurus from the virtual environment I had originally used for Selenium, created a new virtual environment, and installed ONLY botasaurus.
Now the script scrapes as expected, though the certificate parsing errors still appear, so I am keeping this issue open. Do these cert errors mean that the website is being connected to insecurely, or can they be safely ignored?
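For anyone hitting the same conflict, the "fresh environment, one package" fix can be scripted. This is only a minimal sketch using Python's standard venv module. The helper names venv_python and create_clean_env are made up for illustration, and the install step assumes botasaurus is installed with a plain pip install.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the path of the interpreter inside a virtual environment."""
    # Windows puts the interpreter under Scripts\, other platforms under bin/.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_clean_env(env_dir: Path, package: str = "botasaurus") -> None:
    """Create a fresh environment and install a single package into it."""
    # clear=True wipes an existing directory, so no earlier install survives.
    venv.EnvBuilder(with_pip=True, clear=True).create(env_dir)
    # Run pip through the new environment's interpreter so the package
    # lands in that environment and not in the one running this script.
    subprocess.run(
        [str(venv_python(env_dir)), "-m", "pip", "install", package],
        check=True,
    )
```

Calling create_clean_env(Path("py311botasaurus")) would rebuild an environment with the same name as the one in the output below. It needs network access for the pip step.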
Here is the new output:
(py311botasaurus) C:\py311botasaurus>python main.py
Running
[INFO] Downloading Chrome Driver. This is a one-time process. Download in progress...
DevTools listening on ws://127.0.0.1:2309/devtools/browser/1ea8b6bd-45cd-4b14-af05-ef74b8bf8484
[6340:14368:1224/155004.893:ERROR:cert_issuer_source_aia.cc(34)] Error parsing cert retrieved from AIA (as DER):
ERROR: Couldn't read tbsCertificate as SEQUENCE
ERROR: Failed parsing Certificate
[6340:14368:1224/155005.099:ERROR:cert_issuer_source_aia.cc(34)] Error parsing cert retrieved from AIA (as DER):
ERROR: Couldn't read tbsCertificate as SEQUENCE
ERROR: Failed parsing Certificate
Written
output/scrape_heading_task.json
(py311botasaurus) C:\py311botasaurus>
Yes, these keep occurring. Ignore them. Also, it wasn't your fault: I released buggy code yesterday (fixed now), which is why it occurred.
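If you want to check for yourself that a site presents a certificate chain that validates, here is a rough sketch using Python's standard ssl module. It is not part of botasaurus and it does not reproduce what Chrome does. In particular, Python does not fetch missing intermediate certificates via AIA, so a failure here would not by itself prove that the browser's connection is insecure. The function names are made up for illustration.

```python
import socket
import ssl


def strict_context() -> ssl.SSLContext:
    """Build a client context that verifies the chain and the hostname."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    return ssl.create_default_context()


def check_tls(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect with verification on and return the negotiated TLS version.

    Raises ssl.SSLCertVerificationError if the certificate does not validate.
    """
    ctx = strict_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake (and therefore validation) happens in wrap_socket.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version() or "unknown"
```

Calling check_tls("www.omkar.cloud") either returns a version string such as a TLS 1.x label or raises a verification error. It needs network access.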
Wow nice! Thanks for the quick reply on this! This is a pretty awesome framework and the scraping side of things makes sense to me!
Thanks, a lot of awesomeness is on its way that will seriously change the landscape of web scraping.
Description
I am just seeing a ton of exceptions trying to run the first Selenium scraping task, which goes to https://www.omkar.cloud/ and grabs the h1 heading. It's the first script in "What is Botasaurus": https://www.omkar.cloud/botasaurus/docs/what-is-botasaurus/
Steps to Reproduce
Expected behavior: [What you expect to happen]
Scrape the h1 heading and store it as a string called heading, which is returned when the function is called (and presumably saved automatically into a JSON file by the botasaurus framework).
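To make the expected result concrete, here is a standard-library sketch of the same idea: pull the first h1 out of some HTML and save it under a heading key. It deliberately does not use the botasaurus API. The class and function names are hypothetical, and the JSON shape is an assumption about what the framework writes to output/scrape_heading_task.json.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class FirstH1Parser(HTMLParser):
    """Collect the text of the first <h1> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._done = True  # ignore any later <h1> elements

    def handle_data(self, data):
        if self._in_h1:
            self._parts.append(data)

    @property
    def heading(self) -> str:
        # Join the text pieces and collapse runs of whitespace.
        return " ".join("".join(self._parts).split())


def extract_heading(html: str) -> str:
    """Return the text of the first h1, or an empty string if none exists."""
    parser = FirstH1Parser()
    parser.feed(html)
    parser.close()
    return parser.heading


def save_result(result: dict, path: Path) -> None:
    """Write the result as JSON, creating the parent directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result, indent=4), encoding="utf-8")
```

With the page source in a string called html, save_result({"heading": extract_heading(html)}, Path("output/scrape_heading_task.json")) would produce a file at the path shown in the console output above.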
Actual behavior: [What actually happens]
Lots of errors:
Reproduces how often: [What percentage of the time does it reproduce?]
It happens every time.
Additional context
I set up a virtual environment with botasaurus.