drivendataorg / cookiecutter-data-science

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
https://cookiecutter-data-science.drivendata.org/
MIT License

Defend against broken paths from non-editable installs #356

Open jayqi opened 3 months ago

jayqi commented 3 months ago

The refreshed Python module boilerplate (#354) adds a {module_name}.config module that contains pathlib.Path instances to various project directories. These paths depend on the project module being installed as editable in order to resolve correctly.
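To illustrate why the install mode matters, here is a minimal sketch. It assumes the config module derives the project root from its own file location with `Path(__file__).resolve().parents[1]`, as in the prototype code in this issue. The helper name `root_from_module_file` and both example paths are hypothetical, and `PurePosixPath` is used only so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


def root_from_module_file(module_file: str) -> PurePosixPath:
    """Mimic Path(__file__).parents[1] for a given config.py location."""
    return PurePosixPath(module_file).parents[1]


# Editable install: the module is imported straight from the project checkout,
# so two levels up from config.py is the project root.
editable = root_from_module_file("/home/user/my-project/my_module/config.py")

# Non-editable install: the module was copied into site-packages,
# so two levels up is site-packages, not the project.
non_editable = root_from_module_file(
    "/home/user/venv/lib/python3.11/site-packages/my_module/config.py"
)

print(editable)      # /home/user/my-project
print(non_editable)  # /home/user/venv/lib/python3.11/site-packages
```

In the second case, any path built from that root (for example a data directory) would point inside the virtual environment instead of the project.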

Someone following the golden path while using a CCDS-generated project should end up with an editable installation. However, nothing guarantees this, and a user can end up with a non-editable install instead. Some possible mitigations:

Here is some prototype code for achieving this, which was ultimately removed from #354:


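import importlib.metadata
import json
import os
from pathlib import Path

from loguru import logger  # assumption: loguru, as used elsewhere in the CCDS config module


# Check if the package is installed as editable
def _is_editable():
    """Return True if direct_url.json marks this distribution as an editable install."""
    # https://peps.python.org/pep-0660/#frontend-requirements
    try:
        dist = importlib.metadata.distribution("{{ cookiecutter.module_name }}")
        # read_text returns None when the file is absent from the distribution metadata
        direct_url_data = dist.read_text("direct_url.json")
        if direct_url_data is None:
            return False
        return json.loads(direct_url_data).get("dir_info", {}).get("editable", False)
    except importlib.metadata.PackageNotFoundError:
        # The package is not installed at all, so it cannot be editable
        return False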

IS_EDITABLE = _is_editable()

# Determine PROJ_ROOT path
if os.getenv("PROJ_ROOT"):
    logger.debug("Reading PROJ_ROOT from environment variable.")
    PROJ_ROOT = Path(os.getenv("PROJ_ROOT"))
elif IS_EDITABLE:
    logger.debug("Setting PROJ_ROOT relative to editable package.")
    PROJ_ROOT = Path(__file__).resolve().parents[1]
else:
    logger.debug("Using current working directory as PROJ_ROOT.")
    PROJ_ROOT = Path.cwd()
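The precedence used in the prototype (environment variable first, then the editable install location, then the current working directory) can also be factored into pure functions, which makes it easier to test. This is only a sketch under stated assumptions: `is_editable_direct_url` and `resolve_proj_root` are hypothetical names that are not part of CCDS, the example paths are made up, and inputs are passed in explicitly instead of being read from the running process.

```python
import json
from pathlib import Path
from typing import Mapping, Optional


def is_editable_direct_url(direct_url_text: Optional[str]) -> bool:
    """Return True if direct_url.json content marks the install as editable."""
    if direct_url_text is None:
        # No direct_url.json recorded for the distribution.
        return False
    data = json.loads(direct_url_text)
    return bool(data.get("dir_info", {}).get("editable", False))


def resolve_proj_root(
    env: Mapping[str, str], editable: bool, module_file: Path, cwd: Path
) -> Path:
    """Pick the project root with the same precedence as the prototype."""
    env_root = env.get("PROJ_ROOT")
    if env_root:
        # 1. An explicit override always wins.
        return Path(env_root)
    if editable:
        # 2. Editable install: the project root is two levels above config.py.
        return module_file.parents[1]
    # 3. Otherwise fall back to the current working directory.
    return cwd


# Hypothetical inputs for illustration only.
editable_json = '{"url": "file:///home/user/my-project", "dir_info": {"editable": true}}'
module_file = Path("/home/user/my-project/my_module/config.py")
cwd = Path("/tmp/elsewhere")

from_env = resolve_proj_root({"PROJ_ROOT": "/data/proj"}, True, module_file, cwd)
from_editable = resolve_proj_root({}, is_editable_direct_url(editable_json), module_file, cwd)
from_cwd = resolve_proj_root({}, is_editable_direct_url(None), module_file, cwd)
```

Unlike the prototype, this version does not call `resolve()` on the module path, so that it stays deterministic; a real implementation would likely keep that call.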

Some more discussion here: https://github.com/drivendata/cookiecutter-data-science/pull/354#discussion_r1556465660