aboutcode-org / skeleton


Dev utility: Create script to generate requirements with hashes #50

Open pombredanne opened 4 years ago

pombredanne commented 4 years ago

Start with support for Python 3.6 on the 4 OS variants: win32, win64, linux 64, and mac. Structure of freeze_and_update_reqs.py (a rough sketch follows the list below):

  1. It will create or update a requirements.txt with hashes from the dir.

  2. It will upload all the files in the dir to the GitHub release.
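
A rough sketch of how freeze_and_update_reqs.py could be structured (the function names, the thirdparty dir layout, and the PyGithub calls below are assumptions, not a settled design):

import os
import subprocess
import sys

from github import Github  # PyGithub

def freeze(thirdparty_dir='thirdparty', requirements_file='requirements.txt'):
    # Collect the `pip hash` output for every wheel and sdist in the dir.
    # Note: `pip hash` prints the file path plus an indented --hash line;
    # that output still needs to be folded into name==version entries to
    # form a valid requirements.txt.
    entries = []
    for name in sorted(os.listdir(thirdparty_dir)):
        if name.endswith(('.whl', '.tar.gz')):
            path = os.path.join(thirdparty_dir, name)
            out = subprocess.check_output(
                [sys.executable, '-m', 'pip', 'hash', path])
            entries.append(out.decode('utf-8'))
    with open(requirements_file, 'w') as rf:
        rf.write(''.join(entries))

def upload(thirdparty_dir, repo_name, tag, token):
    # Upload every file in the dir as an asset of an existing GitHub release.
    release = Github(token).get_repo(repo_name).get_release(tag)
    for name in os.listdir(thirdparty_dir):
        release.upload_asset(os.path.join(thirdparty_dir, name))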

Abhishek-Dev09-zz commented 4 years ago

@pombredanne Our typecode-libmagic hashes don't match PyPI's typecode-libmagic, so I updated the wheel in my repo releases by taking it from PyPI. But you have not uploaded typecode-libmagic wheels for Linux or Windows 32/64 to PyPI, so please upload them, since only you have the keys. PyPI link: https://pypi.org/project/typecode-libmagic/ . Some other hashes also didn't match, see here: http://dpaste.com/1ERXM6W . Should I replace the wheels with PyPI's, or change the hashes of some wheels in requirements.txt?

pombredanne commented 4 years ago

@Abhishek-Dev09 that's minor for now. The simplest is to replace the wheel with PyPI's... yet we cannot update PyPI, and we can get a hash for a PyPI wheel that does not exist there, like intbitset. So you likely need to also include the sdist in your hashes.
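
For illustration, pip accepts several --hash options on a single requirement line, so one pin can carry both the wheel and the sdist digests, and a downloaded file then only has to match one of them. A hypothetical requirements.txt entry (the digest values are placeholders):

intbitset==2.4.1 \
    --hash=sha256:<wheel-digest> \
    --hash=sha256:<sdist-digest>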

have you started a script to generate these?

Abhishek-Dev09-zz commented 4 years ago

No, I created the requirements by hand using pip-tools and hashin. @pombredanne
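
For reference, both tools can be driven from a script rather than run by hand; a minimal sketch (the file names and the intbitset pin are hypothetical):

import subprocess

# pip-compile (from pip-tools) resolves requirements.in and writes a pinned
# requirements.txt where every entry carries --hash lines
subprocess.check_call([
    'pip-compile', '--generate-hashes',
    '--output-file', 'requirements.txt', 'requirements.in'])

# hashin pins a single package in place, adding hashes for all of its
# release files on PyPI (wheels and the sdist)
subprocess.check_call(['hashin', 'intbitset==2.4.1', '-r', 'requirements.txt'])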

pombredanne commented 4 years ago

@Abhishek-Dev09 it has to be scripted and automated.

Abhishek-Dev09-zz commented 4 years ago
#
# Copyright (c) 2019 nexB Inc. and others. All rights reserved.
# http://nexb.com and https://github.com/nexB/scancode-toolkit/
# The ScanCode software is licensed under the Apache License version 2.0.
# Data generated with ScanCode require an acknowledgment.
# ScanCode is a trademark of nexB Inc.
#
# You may not use this software except in compliance with the License.
# You may obtain a copy of the License at: http://apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software distributed
# under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
# CONDITIONS OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
# When you publish or redistribute any data created with ScanCode or any ScanCode
# derivative work, you must accompany this data with the following acknowledgment:
#
#  Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
#  OR CONDITIONS OF ANY KIND, either express or implied. No content created from
#  ScanCode should be considered or used as legal advice. Consult an Attorney
#  for any legal advice.
#  ScanCode is a free software code scanning tool from nexB Inc. and others.
#  Visit https://github.com/nexB/scancode-toolkit/ for support and download.

from __future__ import absolute_import
from __future__ import print_function

import os
import subprocess
import sys

try:
    from shlex import quote
except ImportError:  # Python 2
    from pipes import quote

from github import Github

# platform-specific file base names
sys_platform = str(sys.platform).lower()
on_win = False
if sys_platform.startswith('linux'):
    platform_names = ('posix', 'linux',)
elif 'win32' in sys_platform:
    platform_names = ('win',)
    on_win = True
elif 'darwin' in sys_platform:
    platform_names = ('posix', 'mac',)
elif 'freebsd' in sys_platform:
    platform_names = ('posix', 'freebsd',)
else:
    raise Exception('Unsupported OS/platform %r' % sys_platform)

# Python versions
py3 = sys.version_info[0] == 3
# py3 is a bool and cannot be concatenated to a str: build a tag instead
py_tag = 'py3_' if py3 else 'py2_'

base = ('base',)

requirement_filenames = tuple(
    'requirements_' + py_tag + p + '.txt' for p in platform_names + base)

# placeholder credentials: use a personal access token in practice
g = Github("user", "password")

def generate_requirement(configs, root_dir, tpp_dirs, quiet=False):
    """
    Hash requirements from requirement files found in `configs` with pip,
    using the vendored components in `tpp_dirs`.
    """
    requirement_files = get_conf_files(configs, root_dir, requirement_filenames, quiet)
    requirements = []
    for req_file in requirement_files:
        requirements.append(os.path.join(root_dir, req_file))
    run_pip(requirements, root_dir, tpp_dirs, quiet)

def build_pip_dirs_args(paths, root_dir, option='--extra-search-dir='):
    """
    Return an iterable of pip command line options for `option` of pip using a
    list of `paths` to directories.
    """
    for path in paths:
        if not os.path.isabs(path):
            path = os.path.join(root_dir, path)
        if os.path.exists(path):
            yield option + quote(path)

def hash_3pp(configs, root_dir, tpp_dirs, quiet=False):
    """
    Hash requirements from requirement files found in `configs` with pip,
    using the vendored components in `tpp_dirs`.
    """
    requirement_files = get_conf_files(configs, root_dir, requirement_filenames, quiet)
    requirements = []
    for req_file in requirement_files:
        requirements.append(os.path.join(root_dir, req_file))
    run_pip(requirements, root_dir, tpp_dirs, quiet)

def run_pip(requirements, root_dir, tpp_dirs, quiet=False):
    """
    Hash a list of `requirements` file paths with pip, using the pip
    installed in the configured virtualenv at `root_dir`. `tpp_dirs` is
    kept for API compatibility: `pip hash` does not use --find-links.
    """
    if not quiet:
        print("* Hashing components ...")
    if on_win:
        configured_python = os.path.join(root_dir, 'Scripts', 'python.exe')
        base_cmd = [configured_python, '-m', 'pip']
    else:
        configured_pip = os.path.join(root_dir, 'bin', 'pip')
        base_cmd = [configured_pip]
    # `pip hash` only accepts plain file paths: install options such as
    # --requirement, --upgrade, --no-index or --find-links do not apply
    pcmd = base_cmd + ['hash']
    if quiet:
        pcmd += ['-q']
    pcmd.extend(requirements)
    subprocess.check_call(pcmd, cwd=root_dir)

def get_conf_files(config_dir_paths, root_dir, file_names=requirement_filenames, quiet=False):
    """
    Return a list of collected path-prefixed file paths matching names in a
    file_names tuple, based on config_dir_paths, root_dir and the types of
    file_names requested. Returned paths are posix paths.

    @config_dir_paths: Each config_dir_path is a path relative from the
    project root to a config dir. This script should always be called from
    the project root dir.

    @root_dir: The project absolute root dir.

    @file_names: get requirements, python or shell files based on list of
    supported file names provided as a tuple of supported file_names.

    Scripts or requirements are optional and only used if present. Unknown
    scripts or requirements file_names are ignored (but they could be used
    indirectly by known requirements with -r requirements inclusion, or
    scripts with python imports.)

    Since Python scripts are executed after requirements are installed they
    can import from any requirement-installed component such as Fabric.
    """
    # collect files for each requested dir path
    collected = []
    for config_dir_path in config_dir_paths:
        abs_config_dir_path = os.path.join(root_dir, config_dir_path)
        if not os.path.exists(abs_config_dir_path):
            if not quiet:
                print('Configuration directory %(config_dir_path)s '
                      'does not exist. Skipping.' % locals())
            continue
        # Support args like enterprise or enterprise/dev
        paths = config_dir_path.strip('/').replace('\\', '/').split('/')
        # a tuple of (relative path, location,)
        current = None
        for path in paths:
            if not current:
                current = (path, os.path.join(root_dir, path),)
            else:
                base_path, base_loc = current
                current = (os.path.join(base_path, path),
                           os.path.join(base_loc, path),)
            path, loc = current
            # we iterate on known filenames to ensure the defined precedence
            # is respected (posix over mac, linux, etc.)
            if not os.path.isdir(loc):
                continue
            for n in file_names:
                for f in os.listdir(loc):
                    if f == n:
                        f_loc = os.path.join(loc, f)
                        if f_loc not in collected:
                            collected.append(f_loc)

    return collected
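
For what it's worth, a hypothetical invocation of the draft above (the 'etc/conf' and 'thirdparty' directory names are assumptions), run from the project root:

if __name__ == '__main__':
    project_root = os.path.dirname(os.path.abspath(__file__))
    generate_requirement(
        configs=['etc/conf'], root_dir=project_root, tpp_dirs=['thirdparty'])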

Another simple script

import os
import subprocess
import sys

for subdir, dirs, files in os.walk(r'C:\scancode-toolkit\thirdparty'):
    for filename in files:
        filepath = os.path.join(subdir, filename)
        # str.find() returns -1 (truthy) when not found, so test membership instead
        if filepath.endswith(".whl") and ("py3" in filepath or "cp36" in filepath):
            subprocess.check_call([sys.executable, "-m", "pip", "hash", filepath])

Abhishek-Dev09-zz commented 4 years ago

@pombredanne : I made a script along the lines of this configure.py. Please review and guide me.

Abhishek-Dev09-zz commented 4 years ago

@pombredanne: someone on #python recommended that I use boots.py, which supports cross-platform locking for all Python versions. He integrated it with romp for pinning wheels. https://github.com/altendky/pm is an example project using boots and romp.

altendky commented 4 years ago

I mentioned that I use boots. I haven't developed it to any point of being a 'real' program. It might be a relevant reference to understand one approach to the problem and the pieces that go into that. It could certainly be expanded to include Python versions in its matrixing.

pombredanne commented 4 years ago

@Abhishek-Dev09 can you put that script in a PR?

Abhishek-Dev09-zz commented 4 years ago

@Abhishek-Dev09 can you put that script in a PR?

@pombredanne : That script is not correct; otherwise I would have already filed a PR.

pombredanne commented 4 years ago

@altendky :wave: thank you for stopping by! How do you see boots and romp being used there?

pombredanne commented 4 years ago

@Abhishek-Dev09 re:

That script is not correct; otherwise I would have already filed a PR.

Ah... you did not reference the ticket in your commit message title... hence I thought there was no PR ;) Can you provide the link and/or amend your commit message to reference this ticket?

Abhishek-Dev09-zz commented 4 years ago

@Abhishek-Dev09 re:

That script is not correct; otherwise I would have already filed a PR.

Ah... you did not reference the ticket in your commit message title... hence I thought there was no PR ;) Can you provide the link and/or amend your commit message to reference this ticket?

@pombredanne I have no PR right now because I am facing some difficulties writing the script; that's why I commented in the ticket rather than making a PR, so that I can make a clean PR.

altendky commented 4 years ago

The task brought up in #python was, as I recall, to create dependency lock files including hashes across multiple Python versions and multiple operating systems, as well as to collect those requirements into another repository.

The last piece strikes me as odd. Perhaps I'm just not used to working on projects that do this but they aren't really a thing I've seen. Perhaps, with Python, running your own devpi would be a reasonable way to address this aspect.

The first piece, dependency lock files with hashes, is perfectly normal and done by poetry (which also does more project and env management) and pip-tools (that focuses specifically on lock files and installing them into existing envs).

The middle part has two pieces. Locking for multiple Python versions is something that can theoretically be done only in some cases, and in practice I don't think it is specifically supported. There are PEP 508 platform markers, but some packages still have if sys.version_info < (3, 7): install_requires.append('something') in their setup.py, which means you must run with multiple interpreters to observe this variation. Further, pip-tools at least only supports recording platform markers for direct dependencies (https://github.com/jazzband/pip-tools/issues/563). I'm not sure how well poetry handles this.
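
As a concrete illustration of the platform markers mentioned above, a requirements file can condition a pin on the target environment; a hypothetical excerpt:

pywin32==227; sys_platform == "win32"
dataclasses==0.8; python_version < "3.7"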

The second middle part is the handling of multiple platforms. I am unaware of anything that does this beyond what can be done with platform markers. Well, other than what I did in boots/romp. Not saying I did something amazing; it's just a first effort at being able to run a script on one platform and get locked requirements for multiple platforms. In short, romp is set up with a custom configurable Azure Pipelines job, and it submits that job to run, passing parameters about what program to run and what files to upload. This lets boots upload requirements.in files to each platform in the CI run, run pip-tools against them separately on each OS, and download the resulting requirements.txt files back to your local directory. Boots also handles 'groups' such as 'dev' and 'test', but that's an extra piece I think has not yet been requested for scancode-toolkit.

My setup with boots and romp is useful to me. It has been developed generally enough that I use it in multiple projects, but not sufficiently that I tell people to use it. I share it as a reference for the only example of any sort of multi-OS locking that I am aware of. It may be that in the past few years packages have gotten better about their metadata and about making it possible to acquire it even from OSes they don't support.

Anyways, tl;dr I am unaware of an existing solution for everything you want. I'd be happy to have boots support matrixing against multiple python versions (in addition to the existing OS/group matrixing it does now). I might even help do that. 'You' will have to decide if you think this is a good path forward with 'my personal/work project' (boots).

pombredanne commented 4 years ago

This is part of PR nexB/scancode-toolkit#2118

pombredanne commented 2 years ago

This has been completed and merged. The reference code lives in https://github.com/nexB/skeleton and will eventually be used in all the projects based on the skeleton. Closing now!

pombredanne commented 2 years ago

Actually we do not create hashes yet. Reopening and moving to skeleton!