lmmx / impscan

Command line tool to identify minimal imports list and repository sources by parsing package dependency trees
MIT License

impscan

Scan the imports in a directory, determine which are not part of the standard library, then (tentatively) determine the package dependency tree and prune the requirements accordingly. impscan also determines which packages can be obtained from Conda (and on which channels) and which from PyPI.

Unlike some other refactoring tools, impscan does not need to operate on a package: it can also be run on a collection of standalone scripts.

Currently, the requirements (AKA "root packages"), imported module names (the "site-packages" names) and other features are computed from one build of every package on conda's anaconda and conda-forge channels (over 20,000 packages).

System requirements

The detection of imported names relies on site-packages paths, which Linux and macOS both have but Windows does not, so that functionality does not work on Windows. If you are interested in Windows support, feel free to open an issue to discuss developing it.

Usage

usage: impscan [-h] [-q] [-e EXCLUDE] [-b] source_path

Scan imports and produce summary files for environment setup

positional arguments:
  source_path           Input path to scan Python files in

optional arguments:
  -h, --help            show this help message and exit
  -q, --quiet           Don't print to STDOUT
  -e EXCLUDE, --exclude EXCLUDE
                        Manually exclude a module name
  -b, --build           Produce dev build requirements (do not drop requirements marked
                        'build-system')

For example:

impscan ./my_package_dir/
impscan ./one_module.py

Add -e foo to exclude the name "foo" from every requirements list.

Output

Since many packages (e.g. numpy) have optimised builds available via conda, it is desirable to mix conda and PyPI packages in one environment. The recommended best practice in that case is to install the conda packages first and the PyPI packages afterwards.
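As an illustration of that ordering, the sketch below renders a conda environment file in which the PyPI packages sit in a nested pip section after the conda packages. This is not impscan's output format: the function name environment_yaml, the environment name and the package name some-pypi-only-package are placeholders chosen for the example.

```python
def environment_yaml(
    name: str,
    channels: list[str],
    conda_packages: list[str],
    pip_packages: list[str],
) -> str:
    """Render a conda environment file with PyPI packages listed last."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}" for package in conda_packages]
    if pip_packages:
        # pip itself is installed by conda, then used for the PyPI-only packages
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"    - {package}" for package in pip_packages]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        environment_yaml(
            name="demo",  # placeholder environment name
            channels=["conda-forge"],
            conda_packages=["python", "numpy"],
            pip_packages=["some-pypi-only-package"],  # placeholder package
        ),
        end="",
    )
```

Keeping the two lists separate until the last step is the design choice that matters here: each package has to be assigned to exactly one source before the file can be written.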

Accurately identifying which packages should come from conda and which from PyPI saves software developers time.

Two types of output are therefore required: