Closed asolis closed 2 years ago
If the path forward is a separate file, I believe common practice is a __version__.py file that lives alongside the rest of the code. This file can then be imported or read from other places.
A potentially useful way of dealing with versions is to copy the sys package and provide both a version_info tuple and a version string (just a concatenation of the tuple).
Ok, this turned out to be trickier than I thought. The main mismatch is that setuptools is configured to search for an arbitrary number of packages under src, but Sphinx as configured assumes a single version for the entire repo. The open source projects I know of haven't been very helpful because they follow a one-package-per-repo, no-src-dir model, while the UK cookiecutter doesn't use setuptools.
For scenarios where we could guarantee a 1-1 correspondence, https://stackoverflow.com/a/60430731 looks promising. In short, have packages source __version__ from the config instead of the other way around.
This and PR #29 follow two of the three solutions mentioned in the Stack Overflow answer pointed out here. It all comes down to whether you want to provide a top-level package or not. A top-level package assumes Python as the programming language, and I'd say creating one can be an initial step, just as a starting point. Any other requirement could simply be mentioned in the docs as guidelines.
No top-level package is a more general solution, but an extra text file is the easiest solution. Documenting how to extend it should be all that is necessary.
No matter what you choose, it won't accommodate all the scenarios. For one, for Hydra modules I will create a namespace package structure. I will allow multiple projects to develop submodules of a top-level package: hydra.modules.[project_name], or dsd.hydra.modules.[project_name] (I still haven't decided on the top namespace). In any case, my "version" file will point to dsd.hydra.modules.[project_name].version. I don't want you to support this in particular, because knowing which top-level package or sub-package I want to version is not trivial and is different for everyone.
I think it's a good start if you assume a 1-to-1 mapping and leave it to the developer to point it at a different module. Or just use an extra file or metadata to set up the correct module version. For multiple submodules, leave it to the developer. I in particular will let people create a separate project for each submodule of dsd.hydra.modules, making it 1-to-1 once again.
I can elaborate more if that wasn't clear.
__version__.py reading the version from metadata stored in setup.cfg, which itself can either choose to have it inline or read it from an external source like the VERSION file in this PR.
But that part is not terribly important to me. I think the bigger question is what the version selector in the rendered Sphinx docs should show for each of the following scenarios:
dsd.foo is 1.1 while dsd.bar is 1.2.
(auto-generated docs for R projects is its own can of worms that I'll leave off the table for now)
I'm not sure if Hydra falls under 2) or 3), but I presume we will have to support all of these workflows at some point. The path of least resistance is to use the short/abbreviated commit hash as I believe the UK cookiecutter does, but I imagine there are potential UX concerns there.
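For reference, reading the abbreviated commit hash from Python (for example, from a Sphinx conf.py) could look roughly like this. It is only a sketch: the function name short_commit_hash and the "unknown" fallback are invented here, and it assumes git is on the PATH and the docs are built from a checkout.

```python
import subprocess


def short_commit_hash(repo_dir: str = ".", default: str = "unknown") -> str:
    """Return the abbreviated hash of HEAD, or `default` if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, the directory does not exist, or it is not a repository.
        return default
    return result.stdout.strip() or default
```

A hash identifies a build precisely but says nothing about ordering, which is part of the UX concern mentioned above.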
Hydra will be similar to point number 3, but the same project will not contain the multiple packages.
A project hydra-module-ocr will only contain the OCR sub-package, dsd.hydra.module.ocr, and will keep only one version, dsd.hydra.module.ocr.__version__. All code in this project will be under only one version (the one stored in the submodule).
Another project, hydra-module-tabular, will contain another sub-package, dsd.hydra.module.tabular, with its version file under dsd.hydra.module.tabular.__version__. All code in this project will be under only one version (the one stored in the submodule).
In my opinion, as a starting point template, you should do a project containing only one version for all the packages and sub-packages. Point 2.
So just to clarify, would you create a single docs site for dsd.hydra.module.ocr and dsd.hydra.module.tabular? Or is each of those getting its own Git(Hub|Lab) pages? If the former, would the version selector show dsd.hydra.module.ocr.__version__, dsd.hydra.module.tabular.__version__, or neither?
Each of them will be getting their own Git(Hub|Lab) pages. Each module will be developed independently of the other. The version selection for each project will show dsd.hydra.module.ocr.__version__ and dsd.hydra.module.tabular.__version__ respectively.
Ok, sounds like an interesting monorepo layout. I'll be interested to see what shape it takes :)
Following a similar structure to this:

    hydra-module-a/
        setup.cfg
        src/dsd/hydra/module/
            subpackage_a/
                __init__.py
    hydra-module-b/
        setup.cfg
        src/dsd/hydra/module/
            subpackage_b/
                __init__.py
Each sub-package can now be separately installed, used, and versioned. I already have an initial setup for testing here: https://gitlab.com/dsd4/hydra
hydra-module-template will be a fork (right now it's just disconnected). Each of its child projects will be connected with cruft to sync changes from upstream.
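To check that a layout like the one above behaves as described, here is a small self-contained sketch. It builds two throwaway project trees in a temporary directory and imports both sub-packages from the shared dsd.hydra.module namespace. It assumes implicit namespace packages (PEP 420), meaning there is no __init__.py in dsd, dsd/hydra, or dsd/hydra/module. The make_project helper and the 1.1 / 1.2 versions are placeholders, and adding the src directories to sys.path stands in for installing each project separately.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_project(root: Path, project: str, subpackage: str, version: str) -> Path:
    """Create <root>/<project>/src/dsd/hydra/module/<subpackage>/__init__.py."""
    src = root / project / "src"
    package_dir = src / "dsd" / "hydra" / "module" / subpackage
    package_dir.mkdir(parents=True)
    # Only the leaf sub-package gets an __init__.py; the parent directories
    # stay implicit namespace packages so several projects can share them.
    (package_dir / "__init__.py").write_text(f'__version__ = "{version}"\n')
    return src


workspace = Path(tempfile.mkdtemp())
for src_dir in (
    make_project(workspace, "hydra-module-a", "subpackage_a", "1.1"),
    make_project(workspace, "hydra-module-b", "subpackage_b", "1.2"),
):
    # Stand-in for installing each project on its own.
    sys.path.insert(0, str(src_dir))

importlib.invalidate_caches()
sub_a = importlib.import_module("dsd.hydra.module.subpackage_a")
sub_b = importlib.import_module("dsd.hydra.module.subpackage_b")
print(sub_a.__version__, sub_b.__version__)
```

Each sub-package carries its own __version__, so every project stays 1-to-1 with a single version even though they share the top-level namespace.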
Btw, @goatsweater, cookiecutter has released version 2.1+ (the only feature I would like to use is private variables), but the latest version of cruft only supports cookiecutter > 1.6 and < 2.0. I like cruft because of its simplicity, but maybe another tool can provide better support for the latest versions of cookiecutter.
Thinking about how the cloud native team built out their default CI pipeline: they use git tags as a versioning mechanism. Not that we can't alter it or even use a completely different template. My point is merely that so many things are set up to assume a single project per repo that I don't think we're introducing undue burden by defaulting to the same assumptions.
Another thing I hadn't thought of is that __version__.py is Python specific, whereas a straight version text file could be used by R scripts as well (I think; I haven't tested it). Using the method pointed out in the Stack Overflow answer, having setup.cfg read in the version file and other Python code import the metadata at run time, seems like the most flexible solution in my mind.
Came across some guidance on having a single top level package over multiple: https://peps.python.org/pep-0423/#use-a-single-name.
@ToucheSir I think this further supports that, while setup.cfg will support finding multiple packages in the current setup, that is a coincidence and we can cater to the single-package route.
Instead of creating a top-level package containing an __init__.py file containing the package version, we create a version text file at the project root containing the version number. The configuration files setup.cfg and Makefile read this file to provide the version number to Sphinx and to package building.
I also added a few configurations that I believe should be set by default.
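For readers who want to see what consuming such a text file from Python might look like (for instance inside a Sphinx conf.py), here is a minimal sketch. It is not the code from this PR: the helper names read_version and sphinx_versions are invented for illustration, and the location of the version file relative to conf.py is an assumption.

```python
from pathlib import Path
from typing import Tuple


def read_version(version_file: Path) -> str:
    """Return the version string stored in a plain-text file."""
    return version_file.read_text(encoding="utf-8").strip()


def sphinx_versions(full_version: str) -> Tuple[str, str]:
    """Split a full version into Sphinx's short `version` and full `release`."""
    short = ".".join(full_version.split(".")[:2])
    return short, full_version


# In docs/conf.py this might be used as follows (the path is an assumption):
# version, release = sphinx_versions(
#     read_version(Path(__file__).resolve().parent.parent / "VERSION")
# )
```

Since the file is plain text, the same value stays readable from a Makefile or from non-Python tooling.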
If this solution is approved, the other pull request #29 should be discarded. Just for reference #29 provides a similar solution but assumes that we will always have a top-level package folder initialized at creation.
@ToucheSir