seddonym / import-linter

Import Linter allows you to define and enforce rules for the internal and external imports within your Python project.
https://import-linter.readthedocs.io/
BSD 2-Clause "Simplified" License
670 stars 45 forks source link

requesting design feedback #168

Closed sydneypemberton1986 closed 1 year ago

sydneypemberton1986 commented 1 year ago

Hey folks!

First off, thank you for writing such a wonderful tool!

I work for findhelp.com and we just had our 2023 hackathon, in which we worked on leveraging import-linter to create a Github Action to ensure that the modular structure of our architecture is not degraded by pull requests.

We ended up writing our own custom Contract, which was good enough for the hackathon to get a working proof of concept. But now we want to make a more complete offering and we wanted to get your feedback on the direction we are going, and if anything we are working on would be useful pull back into the import-linter itself.

Our codebase was designed to be a layered architecture but that was not enforced so it would not make sense to just hook the return code of pass or fail into our CI/CD pipeline. Instead we want to fail a build if the number of disallowed imports increases. So the first thing we did was go in and change the output to machine readable JSON so we could parse out the number of failures.

Furthermore, we are slowly working on splitting our architecture into "macro services". This means that a simple layered contract was insufficient for us, but instead we needed to implement a graph of allowed imports.

We also had a need to reduce various disjoint modules into conceptually unified modules. For example:

Finally, we wanted to be able to see the current state of the codebase, or possibly even produce an animation of our modular structure improving over the history of the project.

To solve these issues we are working on writing a Contract that looks a bit like this

import json
from collections import defaultdict
from pathlib import Path
from typing import Any, Dict, Iterable, List, Set

from grimp.adaptors.graph import ImportGraph
from importlinter import Contract, ContractCheck, fields, output

class ReducingGraphContract(Contract):

    """Checks the overall modular structure of the imports of a codebase.
    A Contract that reduces the import graph into a simplified graph reflecting
    the overall modular structure of a project and then presents the degree to which
    imports do not honor the modular structure.
    I would like to get this to a state where it takes as configuration input the following:
    - A [graph homomorphism][1] from the actual modules to the desired groupings
    in the form of a dictionary from the resulting simplified modules to
    the individual modules they combine.
         Example:
         {
             "front_end_libraries": [
                 "httplib",
                 "webapp2",
                 ...
             ],
             "referrals_persistence": [
                 "core.referrals.datastore",
                 "core.referrals.data_access",
             ],
             "referrals_business": [
                 "core.referrals.services",
             ],
             "persistence_libraries" : [
                 "sqlalchemy",
                 "google.appengine.ext.ndb",
                 ...
             ],
            ...
         }
     - A graph of allowed relations between the desired groupings.
    This could be represented as either an [adjacency matrix][2] or an [adjacency list][3].
    An adjacency list would be easier to reason about for most folks,
    and it would be more compact because this graph is bound to be sparse.
    You could represent this as a list of tuples, for example.
        [
            ("controllers", "business"),
            ("business", "persistence"),
            ("team_a.services", "team_b.services"),
            ...
        ]
    Alternatively, an adjacency matrix would make it easier to later add weights to edges.
    This would en able adding more sophisticated metrics.
    Imports are located in a layer and import from a layer
    You can set up an adjacency matrix A weighted with costs
    between each of controllers (c), services (s), and persistence(p).
    A cost of 0 indicates an allowed transition.
    Acc Acs Acp
    Asc Ass Asp
    Apc Aps App
        import numpy as np
        costs = np.array([
            [0, 0, 1],
            [2, 0, 0],
            [3, 2, 0]
        ])
    Then the output of the import linter could also be formatted as a matrix,
    making the calculation of total cost trivial

        counts = np.array([
            [1, 25, 1],
            [1, 20, 9],
            [0, 0, 12],
        ])
        total_cost = np.sum(np.multiply(costs, counts))

    Both of these graph representations shouldn't be too hard to convert to a d3.js representation, 
    which can produce some [pretty awesome graph displays][4].

    1: https://en.wikipedia.org/wiki/Graph_homomorphism
    2: https://mathworld.wolfram.com/AdjacencyMatrix.html
    3: https://en.wikipedia.org/wiki/Adjacency_list
    4: https://observablehq.com/@d3/force-directed-graph
    """
   ...

Furthermore, since the data we are going to be using to specify these inputs is complex, we are also playing with a JSON configuration file format, something like this:

commit f44d3e850dc4c0e2d0f0aec52b0625a614ed0221
Author: Sydney Pemberton <spemberton@findhelp.com>
Date:   Wed Mar 29 10:26:56 2023 -0500

    add a json file

diff --git src/importlinter/adapters/user_options.py src/importlinter/adapters/user_options.py
index 76fefa3..6fc9140 100644
--- src/importlinter/adapters/user_options.py
+++ src/importlinter/adapters/user_options.py
@@ -1,4 +1,5 @@
 import configparser
+import json
 from typing import Any, Dict, Optional, List
 import abc
 import sys
@@ -97,13 +98,29 @@ class TomlFileUserOptionReader(AbstractUserOptionReader):

         contracts = session_options.pop("contracts", [])

-        self._normalize_booleans(session_options)
+        _normalize_booleans(session_options)
         for contract in contracts:
-            self._normalize_booleans(contract)
+            _normalize_booleans(contract)

         return UserOptions(session_options=session_options, contracts_options=contracts)

-    def _normalize_booleans(self, data: dict) -> None:
-        for key, value in data.items():
-            if isinstance(value, bool):
-                data[key] = str(value)
+
+class JsonFileUserOptionReader(AbstractUserOptionReader):
+    potential_config_filenames = ["importlinter.json", ".importlinter.json"]
+
+    def _read_config_filename(self, config_filename: str) -> Optional[UserOptions]:
+        file_contents = settings.FILE_SYSTEM.read(config_filename)
+        data = json.loads(file_contents)
+        contracts = data.pop("contracts", [])
+
+        _normalize_booleans(data)
+        for contract in contracts:
+            _normalize_booleans(contract)
+
+        return UserOptions(session_options=data, contracts_options=contracts)
+
+
+def _normalize_booleans(data: dict) -> None:
+    for key, value in data.items():
+        if isinstance(value, bool):
+            data[key] = str(value)
diff --git src/importlinter/configuration.py src/importlinter/configuration.py
index b56198f..32e8cac 100644
--- src/importlinter/configuration.py
+++ src/importlinter/configuration.py
@@ -2,7 +2,7 @@ from .adapters.building import GraphBuilder
 from .adapters.filesystem import FileSystem
 from .adapters.printing import ClickPrinter
 from .adapters.timing import SystemClockTimer
-from .adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader
+from .adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader, JsonFileUserOptionReader
 from .application.app_config import settings

@@ -11,6 +11,7 @@ def configure():
         USER_OPTION_READERS={
             "ini": IniFileUserOptionReader(),
             "toml": TomlFileUserOptionReader(),
+            "json": JsonFileUserOptionReader(),
         },
         GRAPH_BUILDER=GraphBuilder(),
         PRINTER=ClickPrinter(),
diff --git tests/unit/adapters/test_user_options.py tests/unit/adapters/test_user_options.py
index a7f4ba9..064bdaf 100644
--- tests/unit/adapters/test_user_options.py
+++ tests/unit/adapters/test_user_options.py
@@ -1,6 +1,6 @@
 import pytest

-from importlinter.adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader
+from importlinter.adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader, JsonFileUserOptionReader
 from importlinter.application.app_config import settings
 from importlinter.application.user_options import UserOptions
 from tests.adapters.filesystem import FakeFileSystem
@@ -192,3 +192,69 @@ def test_toml_file_reader(contents, expected_options):

     options = TomlFileUserOptionReader().read_options()
     assert expected_options == options
+
+
+@pytest.mark.parametrize(
+    "contents, expected_options",
+    (
+        (
+            "{}",
+            UserOptions(session_options={}, contracts_options=[]),
+        ),
+        (
+            """
+            {
+                "foo": "hello",
+                "bar": 999
+            }
+            """,
+            UserOptions(session_options={"foo": "hello", "bar": 999}, contracts_options=[]),
+        ),
+        (
+            """
+            {
+                "foo": "hello",
+                "include_external_packages": true,
+                "contracts": [
+                    {
+                        "id": "contract-one",
+                        "name": "Contract One",
+                        "key": "value",
+                        "multiple_values": [
+                            "one",
+                            "two",
+                            "three",
+                            "foo.one -> foo.two"
+                        ]
+                    },
+                    {
+                        "name": "Contract Two",
+                        "baz": 3
+                    }
+                ]
+            }
+            """,
+            UserOptions(
+                session_options={"foo": "hello", "include_external_packages": "True"},
+                contracts_options=[
+                    {
+                        "name": "Contract One",
+                        "id": "contract-one",
+                        "key": "value",
+                        "multiple_values": ["one", "two", "three", "foo.one -> foo.two"]
+                    },
+                    {"name": "Contract Two", "baz": 3}
+                ],
+            ),
+        ),
+    ),
+)
+def test_json_file_reader(contents, expected_options):
+    settings.configure(
+        FILE_SYSTEM=FakeFileSystem(
+            content_map={"/path/to/folder/.importlinter.json": contents},
+            working_directory="/path/to/folder",
+        )
+    )
+    options = JsonFileUserOptionReader().read_options()
+    assert expected_options == options

What do y'all think? Are we on the right track? Is our approach sound? Is anything here useful to port to your project?

seddonym commented 1 year ago

Hi Sydney, thanks for your message! Sounds really interesting what you're doing.

My immediate reaction is that it's possible you could achieve your aims in a simpler way, but I might be missing something.

The fundamental requirement, which is to fail if the number of disallowed imports increases, is probably better done by having a passing contract that contains ignored imports. If you commit your contract to version control, then any PR that introduces a new layering violation will start failing (unless the commiter also adds an ignored import). This makes it easy to see whether or not things are getting worse.

If you want to make further tooling around this, there is actually a Python API for reading the contracts. You can use this for, say, recording a metric of the number of ignored imports in your contract - or even a custom Github action that prevents merge if the number increases.

Does that help?

sydneypemberton1986 commented 1 year ago

Thanks. I'll close this since it's not really actionable and I responded in my PR..