microsoft / component-detection

Scans your project to determine what components you use
MIT License

Provide option for cleaning up files that PipReport produces #1244

Closed pauld-msft closed 1 month ago

pauld-msft commented 2 months ago

It appears that the pip install command that runs during PipReport will produce some artifacts that are not cleaned up at the end of the run. Some examples of these:

It would be valuable for some customers to have the option to remove all files that are produced, leaving the source as it was found.

the-raja commented 1 month ago

There are three main cleanup targets: .egg-info directories, .egg files, and __pycache__ directories.

  1. .egg-info directories: These are typically created when a Python package is built or installed. They contain metadata about the installed package.

  2. .egg files: These are an older Python distribution format, similar to .whl files, and they may exist in your project directory after running pip install.

  3. __pycache__ directories: Python creates these to store bytecode-compiled versions of Python files. They can be safely removed since Python can regenerate them.

Python cleanup script:


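import os
import shutil

def remove_directory(pattern):
    """Delete every directory under the current directory whose name ends with `pattern`."""
    for root, dirs, _files in os.walk("."):
        # Iterate over a copy so that matches can be pruned from `dirs`;
        # pruning stops os.walk from trying to descend into deleted directories.
        for dir_name in list(dirs):
            if dir_name.endswith(pattern):
                dir_path = os.path.join(root, dir_name)
                print(f"Removing directory: {dir_path}")
                shutil.rmtree(dir_path)
                dirs.remove(dir_name)

def remove_files(pattern):
    """Delete every file under the current directory whose name ends with `pattern`."""
    for root, _dirs, files in os.walk("."):
        for file_name in files:
            if file_name.endswith(pattern):
                file_path = os.path.join(root, file_name)
                print(f"Removing file: {file_path}")
                os.remove(file_path)

remove_directory(".egg-info")
remove_directory("__pycache__")
remove_files(".egg")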

Bash script:


#!/bin/bash
# -prune stops find from descending into directories that are about to be deleted
find . -type d -name "*.egg-info" -prune -exec rm -rf {} +

find . -type d -name "__pycache__" -prune -exec rm -rf {} +

find . -type f -name "*.egg" -exec rm -f {} +

echo "Cleanup completed!"

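Run either script after pip install to remove these artifacts. Both delete every match under the current directory, including any that existed before the run, so the source is restored to its original state only if it contained none of these beforehand.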
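Since the goal stated in the issue is to leave the source as it was found, one alternative to matching by name is to record what exists before the run and delete only what appears afterwards. The following is a minimal sketch of that idea, not necessarily how component-detection itself handles it. The function names `snapshot` and `remove_new_paths` are hypothetical, and the sketch assumes the run only creates new paths: files that already existed and were modified are not restored.

```python
import os
import shutil


def snapshot(top):
    """Return the set of all file and directory paths currently under `top`."""
    paths = set()
    for root, dirs, files in os.walk(top):
        for name in dirs + files:
            paths.add(os.path.join(root, name))
    return paths


def _delete(path):
    """Remove a real directory recursively; remove files and symlinks directly."""
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
    else:
        os.remove(path)


def remove_new_paths(top, before):
    """Delete every path under `top` that is absent from the `before` snapshot."""
    removed = []
    for root, dirs, files in os.walk(top):
        # Iterate over a copy so deleted directories can be pruned from `dirs`.
        for name in list(dirs):
            path = os.path.join(root, name)
            if path not in before:
                _delete(path)
                dirs.remove(name)  # do not descend into a deleted directory
                removed.append(path)
        for name in files:
            path = os.path.join(root, name)
            if path not in before:
                _delete(path)
                removed.append(path)
    return sorted(removed)
```

To use it, call `before = snapshot(".")` ahead of the step that runs pip install, then `remove_new_paths(".", before)` once it finishes. Unlike the pattern-based scripts, this leaves any .egg-info, .egg, or __pycache__ entries that were already part of the source untouched.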

pauld-msft commented 1 month ago

Fixed with https://github.com/microsoft/component-detection/pull/1259