d2esrdb / d2esrdb.github.io


Reorganization and modernization #37

Open allthestairs opened 5 days ago

allthestairs commented 5 days ago

I had a sudden urge to make some upgrades to this, mostly to the templates and data processing so that search and cross-linking features could be integrated, but I have grown too accustomed to some modern Python features, particularly type hinting. While bored on a plane, I reorganized and refactored this repo to make it easier to work with, and I thought I would offer up some or all of these changes. I've pushed a branch with some relatively drastic changes to the repo, including:

  1. Refactored into an installable Python module with an installed script db-gen
  2. Reorganized things so that the data files and config live in ./data, with the config in a YAML file, and output goes into output. Both locations are configurable at runtime with CLI arguments. I found that using the input directory as the output directory and modifying it in place was messy and hard to reason about.
  3. Generated all pages (except the index) with the base.htm template by changing the statically included pages so that they are templated into the body of the base template, which makes per-db link sets simpler.
  4. Dramatically increased the speed by avoiding repeated string lower() calls that scaled polynomially and by replacing other loop searches with dict key lookups.
  5. Added nearly-complete type hinting allowing for type checking the whole thing, fixing a couple small issues in the process.
  6. Added a very thorough set of linting rules to pyproject.toml using ruff, which this currently meets without doing anything too drastic, as well as consistently formatting with the ruff formatter.
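
To illustrate the kind of change described in item 4, here is a minimal sketch. It is not the code from the branch: the `ROWS` data, the `code` field, and the function names are all hypothetical stand-ins. It contrasts a loop search that lowercases strings on every comparison with a dict that is built once and keyed on the lowercased value.

```python
from __future__ import annotations

# Hypothetical records standing in for rows parsed from the data files.
ROWS: list[dict[str, str]] = [
    {"code": "HAX", "name": "Hand Axe"},
    {"code": "Axe", "name": "Axe"},
    {"code": "lsd", "name": "Long Sword"},
]


def find_by_scan(rows: list[dict[str, str]], code: str) -> dict[str, str] | None:
    """Linear scan that calls lower() on both sides for every comparison."""
    for row in rows:
        if row["code"].lower() == code.lower():
            return row
    return None


def build_index(rows: list[dict[str, str]]) -> dict[str, dict[str, str]]:
    """Lowercase each key exactly once and store the row under it."""
    index: dict[str, dict[str, str]] = {}
    for row in rows:
        # setdefault keeps the first row for a key, matching the scan's behavior.
        index.setdefault(row["code"].lower(), row)
    return index


def find_by_index(index: dict[str, dict[str, str]], code: str) -> dict[str, str] | None:
    """Single dict lookup; only the query string is lowercased."""
    return index.get(code.lower())


INDEX = build_index(ROWS)
```

When the scan runs inside another loop over the same rows, the work grows with the square of the row count, while the dict version does one pass to build the index and then one hash lookup per query.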

This does add two more dependencies, click and ruamel.yaml. click could be dropped if we used a built-in library for argument parsing, and the YAML library could be dropped if we made the config file JSON or some other format the stdlib can parse, though with the package installation pip brings both in automatically.
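
For reference, a stdlib-only version could look roughly like the sketch below. This is not what the branch does: the option names `--data-dir`, `--output-dir`, and `--config` and the JSON config format are assumptions made only for illustration.

```python
from __future__ import annotations

import argparse
import json
from pathlib import Path
from typing import Any


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with argparse instead of click (option names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="db-gen")
    parser.add_argument("--data-dir", type=Path, default=Path("data"))
    parser.add_argument("--output-dir", type=Path, default=Path("output"))
    parser.add_argument("--config", type=Path, default=None)
    return parser


def load_config(path: Path) -> dict[str, Any]:
    """Read a JSON config file with the stdlib json module instead of a YAML library."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def main(argv: list[str] | None = None) -> argparse.Namespace:
    """Parse arguments; a real entry point would go on to load data and render pages."""
    return build_parser().parse_args(argv)
```

The trade-off is mostly ergonomic: argparse needs a little more boilerplate than click, and JSON has no comments, which can make a hand-edited config file less pleasant than YAML.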

Apart from line endings, as far as I can tell this produces output that is character-for-character identical to the current main branch, including the automatic creation of the index.htm file. I wanted to make sure nothing was broken before I started changing the generated website itself or the processing of the data. Of course, since this now creates an output folder, deployment to GitHub Pages would require changes. If you're interested in integrating some or all of this, let me know and I can incorporate any feedback to make it more reasonable to merge. There are some other things I could add, like a pre-commit setup so that other people doing development can reproduce the formatting and linting.