INTI-CMNB / KiBot

KiCad automation utility
GNU Affero General Public License v3.0

Warning W048 when downloading datasheets (under Zscaler) #698

Closed: lpdx closed this issue 1 month ago

lpdx commented 1 month ago

What do you want to achieve? Download all of the datasheets.

Do you have some PCB/Schematic to use as example? Just this function is not working in my setup.

> kibot -d _generated -e board.kicad_sch -c test.kibot.yaml
- 'Downloads the datasheets for the project' (download_datasheets_example) [download_datasheets]
WARNING:(W048) Failed with status 403 during download of `https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf` [C101] (kibot - out_download_datasheets.py:64)
WARNING:(W048) Failed with status 403 during download of `https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf` [C102] (kibot - out_download_datasheets.py:64)
WARNING:(W048) Failed with status 403 during download of `https://www.yageo.com/upload/media/product/productsearch/datasheet/mlcc/UPY-GPHC_X7R_6.3V-to-250V_24.pdf` [C103] (kibot - out_download_datasheets.py:64)
WARNING:(W048) Failed with status 403 during download of `https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf` [C104] (kibot - out_download_datasheets.py:64)
WARNING:(W048) Failed with status 403 during download of `https://www.we-online.com/components/products/datasheet/865080653015.pdf` [C105] (kibot - out_download_datasheets.py:64)
WARNING:(W048) Failed with status 403 during download of `https://www.we-online.com/components/products/datasheet/865080653015.pdf` [C106] (kibot - out_download_datasheets.py:64)
WARNING:(W048) Failed with status 403 during download of `https://www.we-online.com/components/products/datasheet/865080653015.pdf` [C107] (kibot - out_download_datasheets.py:64)
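
The seven warnings above refer to only three distinct URLs. A minimal sketch for grouping such lines, assuming the log format is exactly the one shown in this report (the `group_failures` helper and its regular expression are illustrative and not part of KiBot):

```python
import re
from collections import defaultdict

# Matches W048 lines in the format shown in the log above (an assumption
# based on this report, not on KiBot's source code).
W048_RE = re.compile(
    r"WARNING:\(W048\) Failed with status (?P<status>\d+) "
    r"during download of `(?P<url>[^`]+)` \[(?P<ref>[^\]]+)\]"
)


def group_failures(log_text):
    """Return {(status, url): [references]} for every W048 line in log_text."""
    failures = defaultdict(list)
    for match in W048_RE.finditer(log_text):
        key = (int(match["status"]), match["url"])
        failures[key].append(match["ref"])
    return dict(failures)
```

Feeding the log above to `group_failures` should yield one entry per URL, each with the list of component references that point to it.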

Do you have some configuration file (.kibot.yaml) that you are using? You can attach it or paste the content in the following section: test.kibot.yaml

# This file is useful to know all the available options.
kibot:
  version: 1
outputs:
  # Datasheets downloader:
  - name: 'download_datasheets_example'
    comment: 'Downloads the datasheets for the project'
    type: 'download_datasheets'
    dir: 'Example/download_datasheets_dir'
    options:
      # [boolean=false] Use the reference to classify the components in different sub-dirs.
      # In this way C7 will go into a Capacitors sub-dir, R3 into Resistors, etc
      classify: false
      # [string_dict={}] Extra reference associations used to classify the references.
      # They are pairs `Reference prefix` -> `Sub-dir`
      classify_extra: {}
      # [boolean=false] Include the DNF components
      dnf: false
      # [string|list(string)='_null'] Name of the filter to mark components as not fitted.
      # A short-cut to use for simple cases where a variant is an overkill
      dnf_filter: '_null'
      # [string='Datasheet'] Name of the field containing the URL
      field: 'Datasheet'
      # [boolean=true] Instead of download things we already downloaded use symlinks
      link_repeated: true
      # [string='${VALUE}.pdf'] Name used for the downloaded datasheet.
      # `${FIELD}` will be replaced by the FIELD content
      output: '${VALUE}.pdf'
      # [string|list(string)='_null'] Name of the filter to transform fields before applying other filters.
      # A short-cut to use for simple cases where a variant is an overkill
      pre_transform: '_null'
      # [boolean=false] Download URLs that we already downloaded.
      # It only makes sense if the `output` field makes their output different
      repeated: false
      # [string=''] Board variant to apply
      variant: ''

Environment (please complete the following information): Where are you running KiBot:


I think the source of my issue is that I'm behind the Zscaler service. Is there a way to configure the download to use the Linux CA certificates?

I didn't find anything about warning W048 in the source code or documentation. Any help is appreciated.

lpdx commented 1 month ago

KiBot installation checker output:

Core:
Linux: 5.15.153.1 (Linux HOSTNAME 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux)
  /usr/bin/uname
Python: 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0]
  /usr/lib/python3.12/os.py
  /usr/bin/python3
KiCad: 7.0.11+dfsg-1build4
  /usr/lib/python3/dist-packages/pcbnew.py
  /usr/bin/kicad
Kibot: 1.8.1
  /usr/lib/python3/dist-packages/kibot/__main__.py
  /usr/bin/kibot

Modules:
Colorama: 0.4.6
  /usr/lib/python3/dist-packages/colorama/__init__.py
LXML: 5.2.1
  /usr/lib/python3/dist-packages/lxml/__init__.py
Lark: 1.1.9
  /usr/lib/python3/dist-packages/lark/__init__.py
PyYAML: 6.0.1
  /usr/lib/python3/dist-packages/yaml/__init__.py
QRCodeGen: Ok
  /usr/lib/python3/dist-packages/qrcodegen.py
Requests: 2.31.0
  /usr/lib/python3/dist-packages/requests/__init__.py
XLSXWriter: 3.1.9
  /usr/lib/python3/dist-packages/xlsxwriter/__init__.py
Xvfbwrapper: Ok
  /usr/lib/python3/dist-packages/xvfbwrapper.py
markdown2: 2.4.11
  /usr/lib/python3/dist-packages/markdown2.py
numpy: 1.26.4
  /usr/lib/python3/dist-packages/numpy/__init__.py

Tools:
Bash: 5.2.21 (GNU bash, version 5.2.21(1)-release (x86_64-pc-linux-gnu))
  /usr/bin/bash
Blender: 4.0.2 (Blender 4.0.2)
  /usr/bin/blender
Ghostscript: 10.2.1 (10.02.1)
  /usr/bin/gs
Git: 2.43.0 (git version 2.43.0)
  /usr/bin/git
ImageMagick: 6.9.12.98 (Version: ImageMagick 6.9.12-98 Q16 x86_64 18038 https://legacy.imagemagick.org)
  /usr/bin/convert
Interactive HTML BoM: 2.9.0 (v2.9.0)
  /usr/bin/python3
KiBoM: 1.9.1 (KiBOM Version: 1.9.1)
  /usr/bin/KiBOM_CLI.py
KiCad Automation tools (kiauto): 2.3.3 (pcbnew_do 2.3.3 - Copyright 2018-2024, INTI/Productize SPRL - License: Apache)
  /usr/bin/pcbnew_do
KiCad PCB/SCH Diff (kidiff): 2.5.5 (kicad-diff.py 2.5.5 - Copyright 2020-2024, INTI/Salvador E. Tropea - License:)
  /usr/bin/kicad-diff.py
KiCost: 1.1.18 (KiCost v1.1.18)
  /usr/bin/kicost
KiKit: 1.6.0.3 (kikit, version 1.6.0-3)
  /usr/bin/kikit
OpenSCAD: 2021.1.0 (OpenSCAD version 2021.01)
  /usr/bin/openscad
Pandoc: 3.1.3 (pandoc 3.1.3)
  /usr/bin/pandoc
RAR: 7.0.0 (RAR 7.00   Copyright (c) 1993-2024 Alexander Roshal   26 Feb 2024)
  /usr/bin/rar
RSVG tools: 2.58.0 (rsvg-convert version 2.58.0)
  /usr/bin/rsvg-convert
Xvfb: Ok (xvfb-run)
  /usr/bin/rsvg-convert

Environment:
DBUS_SESSION_BUS_ADDRESS
DISPLAY
GIT_TERMINAL_PROMPT
HOME
HOSTTYPE
INTERACTIVE_HTML_BOM_NO_DISPLAY
LANG
LESSCLOSE
LESSOPEN
LOGNAME
LS_COLORS
NAME
OLDPWD
PATH
PULSE_SERVER
PWD
SHELL
SHLVL
TERM
USER
WAYLAND_DISPLAY
WSL2_GUI_APPS_ENABLED
WSLENV
WSL_DISTRO_NAME
WSL_INTEROP
XDG_DATA_DIRS
XDG_RUNTIME_DIR
_

KiAuto:
This is KiAuto v2.3.3
Installed at: /usr/bin/pcbnew_do
Using kiauto module from: /usr/lib/python3/dist-packages/kiauto
Interpreted by Python: /usr/bin/python3 (v3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0])
Tools:
- kicad: /usr/bin/kicad (v7.0.11+dfsg-1build4)
- xdotool: /usr/bin/xdotool
- recordmydesktop: /usr/bin/recordmydesktop
- xsltproc: /usr/bin/xsltproc
- xclip: /usr/bin/xclip
- convert: /usr/bin/convert

set-soft commented 1 month ago

What happens when you run KiBot again to get the files that failed?

lpdx commented 1 month ago

I get the same results with all files failing.

set-soft commented 1 month ago

I tried the 3 URLs you mention without problems.

There is nothing special about W048; it just means that we failed to get the file. Status 403 means "Forbidden", so it looks like some proxy is denying your requests. I don't know what Zscaler does. About the CA certificates: I guess you have them installed, and the Python code will use them if needed. Try the wget or curl tools, e.g. wget https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
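
To check which CA certificates Python would consult, a small standard-library sketch like the following may help. The `describe_ca_sources` helper is hypothetical, and it only reports locations without validating anything. `REQUESTS_CA_BUNDLE` is the environment variable the Requests library honors. Whether KiBot's downloader relies on these defaults is an assumption here.

```python
import os
import ssl


def describe_ca_sources(environ=None):
    """Report where Python's ssl module and Requests may look for CA certs."""
    env = os.environ if environ is None else environ
    paths = ssl.get_default_verify_paths()
    return {
        # Locations compiled into the OpenSSL build used by this Python.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        # Resolved locations (None when the file or directory does not exist).
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Optional override honored by the Requests library.
        "REQUESTS_CA_BUNDLE": env.get("REQUESTS_CA_BUNDLE"),
    }
```

Printing the returned dictionary shows the paths in use. Note that an HTTP 403 means the TLS handshake already succeeded and a response was received; a certificate problem would normally surface as an SSL error rather than a status code.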

lpdx commented 1 month ago

Using the wget I was able to download it. Here is the output:

wget https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
--2024-10-22 12:49:42--  https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
Resolving content.kemet.com (content.kemet.com)... 35.201.81.188
Connecting to content.kemet.com (content.kemet.com)|35.201.81.188|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1033508 (1009K) [application/pdf]
Saving to: ‘KEM_C1002_X7R_SMD.pdf’

KEM_C1002_X7R_SMD.pdf            100%[==========================================================>]   1009K  --.-KB/s    in 0.07s

2024-10-22 12:49:45 (14.9 MB/s) - ‘KEM_C1002_X7R_SMD.pdf’ saved [1033508/1033508]

Yes, I have the CA certificate installed on Linux. I'm afraid that Python is not using it.

Zscaler is a secure web gateway installed on the Windows host. It handles all the web traffic for security reasons.

set-soft commented 1 month ago

> Yes, I have the CA certificate installed on Linux. I'm afraid that Python is not using it.

I don't think so. I think this is completely from outside, nothing related to Python or KiBot.

> Zscaler is a secure web gateway installed on the Windows host. It handles all the web traffic for security reasons.

Can you temporarily disable it to check if this is the source?

Also try this:

wget -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Firefox/102.0" https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf

set-soft commented 1 month ago

If the above fails, try this one: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
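
The same experiment can be sketched in Python. This is a minimal illustration using only the standard library (`urllib`), not the code KiBot uses. The `http_status` and `first_accepted_agent` names are hypothetical, and treating only status 200 as success is an assumption.

```python
import urllib.error
import urllib.request


def http_status(url, user_agent, timeout=10):
    """Return the HTTP status code obtained for url with this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is what we want here.
        return error.code


def first_accepted_agent(url, user_agents, fetch_status=http_status):
    """Return the first User-Agent that yields status 200, or None."""
    for agent in user_agents:
        if fetch_status(url, agent) == 200:
            return agent
    return None
```

Calling `first_accepted_agent` with the datasheet URL and the two User-Agent strings above tries them in order. The `fetch_status` parameter exists so the loop can be exercised without network access.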

lpdx commented 1 month ago

Thank you for your reply. With the last option I was able to download it.

Here are the test outputs:

1) I was able to temporarily disable Zscaler, and then the datasheets downloaded successfully.

2) Running wget with the Firefox User-Agent (with Zscaler enabled), I got error 403.

wget -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Firefox/102.0" https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
--2024-10-22 13:04:53--  https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
Resolving content.kemet.com (content.kemet.com)... 35.201.81.188
Connecting to content.kemet.com (content.kemet.com)|35.201.81.188|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2024-10-22 13:04:54 ERROR 403: Forbidden.

3) With "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" I was able to download it!

wget -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
--2024-10-22 13:06:52--  https://content.kemet.com/datasheets/KEM_C1002_X7R_SMD.pdf
Resolving content.kemet.com (content.kemet.com)... 35.201.81.188
Connecting to content.kemet.com (content.kemet.com)|35.201.81.188|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1033508 (1009K) [application/pdf]
Saving to: ‘KEM_C1002_X7R_SMD.pdf.1’

KEM_C1002_X7R_SMD.pdf.1          100%[==========================================================>]   1009K  --.-KB/s    in 0.08s

2024-10-22 13:06:54 (12.6 MB/s) - ‘KEM_C1002_X7R_SMD.pdf.1’ saved [1033508/1033508]

lpdx commented 1 month ago

I solved it locally by changing the USER_AGENT constant (line 342 of the file misc.py) to "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36".

Now the process runs smoothly.

I don't know the impact of this change on other users. Let me know if a pull request is needed.

Thank you!

set-soft commented 1 month ago

> I don't know the impact of this change on other users.

I took this one from a stats page; it seems to be the most popular one today.

> Let me know if a pull request is needed.

Don't worry, I'll commit the change to dev. Thanks for testing it.

lpdx commented 1 month ago

Ok, thank you!