Gene-Weaver / LeafMachine2

LeafMachine2 is a modular suite of computer vision and machine learning algorithms that enables efficient identification, location, and measurement of vegetative, reproductive, and archival components from digital plant datasets.
https://leafmachine.org/
GNU General Public License v3.0

[SSL: Certificate_Verify_Failed] on run of `test.py` resulting in missing ML files #8

Closed by mpitblado 7 months ago

mpitblado commented 7 months ago

This is likely specific to my setup, and I will start troubleshooting, but I am opening an issue in case other users hit the same error. I checked the requested URL with curl and the certificate loads fine, so perhaps something about the Python virtual environment needs to be configured.

Error message presented

ERROR --- Could not download or extract machine learning models <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>
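
For anyone hitting the same message, a quick way to see where the active interpreter looks for CA certificates is sketched below. It uses only the standard library; the helper name `describe_ca_paths` is made up for illustration and is not part of LeafMachine2. If `openssl_cafile_exists` comes back `False` inside the virtual environment, a missing certificate bundle is a plausible cause.

```python
import os
import ssl


def describe_ca_paths():
    """Report where this interpreter's ssl module looks for CA certificates.

    Hypothetical diagnostic helper, not part of LeafMachine2.
    """
    paths = ssl.get_default_verify_paths()
    openssl_cafile = paths.openssl_cafile
    return {
        # Resolved locations; None when the file or directory does not exist.
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Compiled-in OpenSSL default path for the CA bundle.
        "openssl_cafile": openssl_cafile,
        "openssl_cafile_exists": bool(openssl_cafile) and os.path.exists(openssl_cafile),
        # Name and current value of the environment variable that can override it.
        "override_env_var": paths.openssl_cafile_env,
        "override_env_value": os.environ.get(paths.openssl_cafile_env),
    }


if __name__ == "__main__":
    for key, value in describe_ca_paths().items():
        print(f"{key}: {value}")
```

Run it with the virtual environment activated so that it reports on the same interpreter that runs `test.py`.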

Potentially relevant code

./leafmachine/machine/fetch_data.py

def get_weights(dir_home, current, logger):

    try:
        path_zip = os.path.join(dir_home,'bin',current)
        zipurl = ''.join(['https://leafmachine.org/LM2/', current,'.zip'])
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}

        req = urllib.request.Request(url=zipurl, headers=headers)

        # Get the file size from the Content-Length header
        with urllib.request.urlopen(req) as url_response:
            file_size = int(url_response.headers['Content-Length'])

        # Download the ZIP file from the URL with progress bar
        with tqdm(unit='B', unit_scale=True, unit_divisor=1024, total=file_size) as pbar:
            with urllib.request.urlopen(req) as url_response:
                with open(current + '.zip', 'wb') as file:
                    while True:
                        chunk = url_response.read(4096)
                        if not chunk:
                            break
                        file.write(chunk)
                        pbar.update(len(chunk))

        # Extract the contents of the ZIP file to the current directory
        zipfilename = current + '.zip'
        with ZipFile(zipfilename, 'r') as zip_file:
            zip_file.extractall(os.path.join(dir_home,'bin'))

        print(f"{bcolors.CGREENBG2}Data extracted to {path_zip}{bcolors.ENDC}")
        logger.warning(f"Data extracted to {path_zip}")

        return path_zip
    except Exception as e:
        print(f"{bcolors.CREDBG2}ERROR --- Could not download or extract machine learning models\n{e}{bcolors.ENDC}")
        logger.warning(f"ERROR --- Could not download or extract machine learning models")
        logger.warning(f"ERROR --- {e}")
        return None

Workaround

I believe that manually downloading the files from https://leafmachine.org/LM2/ and placing them, unzipped, in the directory the code checks should work around this issue, since the `if ver['version'] == VERSION:` check should then pass.
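
A minimal sketch of that manual step, assuming the archive has already been downloaded in a browser: it mirrors the `extractall(os.path.join(dir_home,'bin'))` call in `get_weights` above. The function name `extract_manual_download` is hypothetical, and whether the extracted layout satisfies the version check is an assumption I have not verified.

```python
import os
from zipfile import ZipFile


def extract_manual_download(zip_path, dir_home):
    """Extract a manually downloaded archive into <dir_home>/bin.

    Hypothetical helper; it targets the same directory that get_weights
    extracts to, but is not part of LeafMachine2.
    """
    target_dir = os.path.join(dir_home, "bin")
    os.makedirs(target_dir, exist_ok=True)  # create bin/ if it is missing
    with ZipFile(zip_path, "r") as zip_file:
        zip_file.extractall(target_dir)
    return target_dir
```

Here `zip_path` would be the path to the downloaded `.zip` file and `dir_home` the root directory of the repo.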

mpitblado commented 7 months ago

Fix

The fix for this issue is to run the following certificate installation script from within the virtual environment. Run the script below from the root directory of the repo while the virtual environment is activated. The script was found at https://gist.github.com/marschhuynh/31c9375fc34a3e20c2d3b9eb8131d8f3

# install_certifi.py
#
# sample script to install or update a set of default Root Certificates
# for the ssl module.  Uses the certificates provided by the certifi package:
#       https://pypi.python.org/pypi/certifi

import os
import os.path
import ssl
import stat
import subprocess
import sys

STAT_0o775 = ( stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
             | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
             | stat.S_IROTH |                stat.S_IXOTH )

def main():
    openssl_dir, openssl_cafile = os.path.split(
        ssl.get_default_verify_paths().openssl_cafile)

    print(" -- pip install --upgrade certifi")
    subprocess.check_call([sys.executable,
        "-E", "-s", "-m", "pip", "install", "--upgrade", "certifi"])

    import certifi

    # change working directory to the default SSL directory
    os.chdir(openssl_dir)
    relpath_to_certifi_cafile = os.path.relpath(certifi.where())
    print(" -- removing any existing file or link")
    try:
        os.remove(openssl_cafile)
    except FileNotFoundError:
        pass
    print(" -- creating symlink to certifi certificate bundle")
    os.symlink(relpath_to_certifi_cafile, openssl_cafile)
    print(" -- setting permissions")
    os.chmod(openssl_cafile, STAT_0o775)
    print(" -- update complete")

if __name__ == '__main__':
    main()

Normally this script is bundled with a manual installation of Python from python.org, but when working from a virtual environment it does not seem to get copied over in some cases.
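
As a possible alternative that leaves the interpreter's OpenSSL directory untouched, `urllib.request.urlopen` accepts an explicit `context` argument, and `ssl.create_default_context` accepts a `cafile`. The sketch below builds a verifying context from certifi's bundle when certifi is installed and otherwise falls back to the system defaults. The helper names are hypothetical, and this approach has not been tested against LeafMachine2.

```python
import ssl
import urllib.request

try:
    import certifi  # third-party; assumed installed via `pip install certifi`
except ImportError:
    certifi = None


def build_ssl_context():
    """Return a verifying SSL context, preferring certifi's CA bundle."""
    if certifi is not None:
        return ssl.create_default_context(cafile=certifi.where())
    # Fall back to whatever CA store the interpreter finds by default.
    return ssl.create_default_context()


def open_with_context(req):
    """Open a urllib Request using the explicit context (hypothetical helper)."""
    return urllib.request.urlopen(req, context=build_ssl_context())
```

Certificate verification stays enabled in both branches, so this only changes which CA bundle is consulted.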

mpitblado commented 7 months ago

I forgot that I had also changed a few things in the code while getting this to work. I will test whether running the script above is sufficient on its own or whether those changes were also necessary. If they were, I will draft a PR for review.

mpitblado commented 7 months ago

No code changes seem to be required beyond running the install certificates script. Closing!