CivicTechTO / toronto-bids


Issues running the `rfp_scraper` on Windows from a `.py` file but not when running in `.ipynb` #11

Open Arpanio opened 1 year ago

Arpanio commented 1 year ago

I ran into the following errors after converting the rfp_scraper.ipynb notebook to a Python script and running it:

DevTools listening on ws://127.0.0.1:61525/devtools/browser/104a4d12-5f62-4e59-ab41-de5c2538219f

[7972:17088:0309/193728.241:ERROR:cert_verify_proc_builtin.cc(677)] CertVerifyProcBuiltin for accounts.google.com failed: ----- Certificate i=1 (CN=Selenium Wire CA) ----- ERROR: No matching issuer found

[7972:4932:0309/193729.351:ERROR:device_event_log_impl.cc(218)] [19:37:29.347] USB: usb_device_handle_win.cc:1046 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)

[7972:17088:0309/193740.995:ERROR:cert_verify_proc_builtin.cc(677)] CertVerifyProcBuiltin for optimizationguide-pa.googleapis.com failed: ----- Certificate i=1 (CN=Selenium Wire CA) ----- ERROR: No matching issuer found

[7972:17088:0309/193827.533:ERROR:cert_verify_proc_builtin.cc(677)] CertVerifyProcBuiltin for update.googleapis.com failed: ----- Certificate i=1 (CN=Selenium Wire CA) ----- ERROR: No matching issuer found

[7972:17088:0309/193834.051:ERROR:cert_verify_proc_builtin.cc(677)] CertVerifyProcBuiltin for service.ariba.com failed: ----- Certificate i=1 (CN=Selenium Wire CA) ----- ERROR: No matching issuer found
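For anyone triaging similar output, here is a minimal sketch that summarizes which hosts failed certificate verification. It assumes the log keeps the `CertVerifyProcBuiltin for <host> failed` wording shown above; the helper name `failed_cert_hosts` is hypothetical and not part of the repo.

```python
import re
from collections import Counter

# Pattern derived from the Chrome log lines quoted in this issue (an assumption:
# other Chrome versions may word the message differently).
CERT_FAILURE = re.compile(r"CertVerifyProcBuiltin for (\S+) failed")


def failed_cert_hosts(log_text: str) -> Counter:
    """Count certificate-verification failures per host in Chrome log output."""
    return Counter(CERT_FAILURE.findall(log_text))


if __name__ == "__main__":
    sample = (
        "[7972:17088:0309/193728.241:ERROR:cert_verify_proc_builtin.cc(677)] "
        "CertVerifyProcBuiltin for accounts.google.com failed: "
        "----- Certificate i=1 (CN=Selenium Wire CA) ----- "
        "ERROR: No matching issuer found"
    )
    for host, count in failed_cert_hosts(sample).items():
        print(host, count)
```

Because the pattern is applied with `findall`, it works whether the log entries are on separate lines or run together.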

rfp_scraper.ipynb works fine. It's probably due to some invisible Jupyter Notebook shenanigans that are missing when running from a plain .py file (so annoying).
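To compare the two environments, the script could log whether it is running under a Jupyter kernel. This is only a sketch of a commonly used heuristic, not something the repo does; the function name `running_in_jupyter` is hypothetical, and the check assumes Jupyter's kernel shell class is named `ZMQInteractiveShell`.

```python
def running_in_jupyter() -> bool:
    """Best-effort check for a Jupyter kernel; returns False in a plain script."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a notebook.
        return False
    shell = get_ipython()  # None when no IPython shell is active
    return shell is not None and shell.__class__.__name__ == "ZMQInteractiveShell"


if __name__ == "__main__":
    print("Running in Jupyter:", running_in_jupyter())
```

Printing this at startup, alongside the Python version and working directory, would at least show what differs between the notebook run and the script run.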

Whether this should be fixed or not is an open question. I'm reporting this for visibility since at least one other developer is using Windows. I believe the goal is to deploy the scraper to a Linux machine, so it would probably be a better use of my time to set up a Linux VM instead.

Arpanio commented 1 year ago

I was able to set up Ubuntu using Windows Subsystem for Linux and get around the Selenium issues (mostly: scraping still stops with no error messages, but at least python-selenium is able to bind to Chrome).

Combined with VS Code and the WSL extension, I think devs with Windows machines can get unblocked from contributing. We need not worry about maintaining the project for Windows, Mac, and Linux, and can focus on Mac and Linux only.

Listing the instructions here for reference. I'll work on updating the readme later:

  1. Install Windows Subsystem for Linux (WSL). I chose Ubuntu 22 as the target distro, but other devs can choose a different one if they see fit.
  2. Get VS Code. Other editors may be viable, but this one is best for my productivity.
  3. Get the WSL extension for VS Code. This lets VS Code running on Windows connect remotely to the Ubuntu backend. Ctrl + ` will then open a terminal running zsh on Ubuntu instead of PowerShell on Windows.
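Once the steps above are done, a quick way to confirm that Python is really running inside WSL rather than native Windows is to inspect the kernel release string. The sketch below assumes the WSL kernel release contains "microsoft" (as in the illustrative strings in the comments); `is_wsl` is a hypothetical helper, not part of the repo.

```python
import platform
from typing import Optional


def is_wsl(release: Optional[str] = None) -> bool:
    """Heuristic: WSL kernels report a release string containing 'microsoft'."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


if __name__ == "__main__":
    # Illustrative release strings (assumptions, not taken from this issue):
    #   WSL 2:        "5.15.90.1-microsoft-standard-WSL2"
    #   plain Ubuntu: "5.15.0-91-generic"
    print("System:", platform.system())
    print("Inside WSL:", is_wsl())
```

Running this from the VS Code integrated terminal should report `Linux` as the system when the WSL extension is connected to the Ubuntu backend.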