pennsignals / dsdk

Data Science Deploy Kit
MIT License
8 stars 7 forks source link

Submodule naming conventions #61

Closed jlubken closed 2 years ago

jlubken commented 3 years ago

We already have a way to import dsdk components using the (sub-)module without ambiguity, and without duplicating the submodule names in the class names:``` from dsdk import mssql, postgres, Service

class CDISepsis(mssql.Mixin, postgres.Mixin, Service): pass```

The reason I typically use from dsdk.postgres import Mixin as PostgresMixin instead of importing submodules (or modules e.g. import dsdk.mssql, import re , import os.path) has to do with how module imports expose implementation details.

For, example, if I from os import environ inside dsdk/mssql.py, environ can be used as mssql.environ by importers of dsdk.mssql --unless shielded by a submodule directory and a dunderinit file--.

  1. People are familiar/comfortable with as used in import numpy as np, yet this form "gets a pass" even though it has overly wide scope. For client code, I don't care. Module code has a different bar.
  2. People are familiar/comfortable with nearly-bare imports import re even though this form is less specific and less deliberate than from dsdk.postgres import Mixin as PostgresMixin or from re import compile as re_compile.
  3. Submodules, orgs, and repos provide perfectly good namespaces already. as is designed to disambiguation collisions. If these namespaces aren't good enough then this becomes PennSignalsDSDKPostresMixin. Seriously! In the immunehealth repos "immunehealth" (or worse arbitrary abbreviations) must be specified at least three times: the org, included in the repo name, included in the class names. Why do we prefix penn on signals everywhere when it is part of the organization? pennsignals.uphs.upenn.edu It is right there in "upenn"!
  4. Open closed principle, and cohesion of the submodules would indicate that any aliasing needed by the module happens at the module level in dunderinit.

So if I was going to "fix" DSDK, I would add directories and dunderinit files that narrow the scope imported by from dsdk import mssql or import dsdk.mssql instead of duplicating the submodule namespace in the class names.

jlubken commented 2 years ago

Fixed.