Update: The first round of changes around types is now complete (from #244) and ready to merge into main from this WIP-v0.3 branch. The changes are substantial in terms of LOC but have a relatively minor impact on the overall workflow, so it makes sense to merge them into main now, close some issues, and then progress to the next coherent batch on the road to v0.3.0.
Closes #54. Added type hints for core modules. This should improve the stability of core routines and help users who write code using MPoL in an IDE.
Closes #233. Removed the convenience classmethods from_image_properties from across the code base. The recommended workflow is to create a :class:mpol.coordinates.GridCoords object and pass it when instantiating these objects, rather than passing cell_size and npix separately. For all but trivially short workflows, this reduces the number of variables the user needs to keep track of and pass around, and it highlights the central role of the :class:mpol.coordinates.GridCoords object and its useful attributes for image extent, visibility extent, etc. Most importantly, this significantly reduces the size of the codebase and the burden of maintaining, testing, and documenting multiple entry points to key nn.modules. We removed from_image_properties from:
:class:mpol.datasets.GriddedDataset
:class:mpol.datasets.Dartboard
:class:mpol.fourier.NuFFT
:class:mpol.fourier.NuFFTCached
:class:mpol.fourier.FourierCube
:class:mpol.gridding.GridderBase
:class:mpol.gridding.DataAverager
:class:mpol.gridding.DirtyImager
:class:mpol.images.BaseCube
:class:mpol.images.ImageCube
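The pattern described above, one coordinates object shared by every component instead of repeated cell_size and npix arguments, can be sketched roughly as follows. This is a minimal illustration only: GridCoordsSketch, ImageLayerSketch, and FourierLayerSketch are hypothetical stand-ins written for this example, and their attribute names are assumptions rather than MPoL's actual API.

```python
import math
from dataclasses import dataclass

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


@dataclass(frozen=True)
class GridCoordsSketch:
    """Hypothetical stand-in for a coordinates object (not MPoL's class)."""

    cell_size: float  # pixel width [arcsec]
    npix: int  # number of pixels per image side

    @property
    def img_width(self) -> float:
        """Full image width [arcsec]."""
        return self.cell_size * self.npix

    @property
    def du(self) -> float:
        """Fourier-plane cell size [lambda]: 1 / (npix * cell_size in radians)."""
        return 1.0 / (self.npix * self.cell_size * ARCSEC_TO_RAD)

    @property
    def max_uv(self) -> float:
        """Largest spatial frequency represented on the grid [lambda]."""
        return 1.0 / (2.0 * self.cell_size * ARCSEC_TO_RAD)


class ImageLayerSketch:
    """Hypothetical image-plane component that only needs the coords object."""

    def __init__(self, coords: GridCoordsSketch, nchan: int = 1) -> None:
        self.coords = coords
        self.shape = (nchan, coords.npix, coords.npix)


class FourierLayerSketch:
    """Hypothetical visibility-plane component sharing the same coords."""

    def __init__(self, coords: GridCoordsSketch, nchan: int = 1) -> None:
        self.coords = coords
        self.nchan = nchan


# One object is created once, then handed to every component.
coords = GridCoordsSketch(cell_size=0.005, npix=800)
image = ImageLayerSketch(coords, nchan=1)
fourier = FourierLayerSketch(coords, nchan=1)
```

Because both components hold the same object, derived quantities such as the image width or the Fourier cell size are computed in one place and cannot drift out of sync between layers.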
Closes #246. Made the passthrough behaviour of :class:mpol.images.ImageCube the default and removed this parameter entirely. Previously, it was possible to have :class:mpol.images.ImageCube act as a layer with nn.Parameters. This functionality has effectively been replaced since the introduction of :class:mpol.images.BaseCube, which provides a more useful way to parameterize pixel values. If a one-to-one mapping (including negative pixels) from nn.Parameters to the output tensor is desired, then one can specify pixel_mapping=lambda x: x when instantiating :class:mpol.images.BaseCube.
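To illustrate what an identity pixel_mapping changes, here is a small plain-Python sketch. It assumes, for the sake of comparison, that the alternative is a positivity-enforcing mapping such as softplus; that choice and the helper names softplus and apply_mapping are assumptions made for this example, not a statement about MPoL's internals.

```python
import math
from typing import Callable, List


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); the result is always positive."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def apply_mapping(
    params: List[float], pixel_mapping: Callable[[float], float]
) -> List[float]:
    """Map raw parameter values to pixel values, element by element."""
    return [pixel_mapping(p) for p in params]


params = [-2.0, 0.0, 3.0]

# A positivity-enforcing mapping never produces negative pixels.
positive_pixels = apply_mapping(params, softplus)

# The identity mapping passes parameters through, negative values included.
raw_pixels = apply_mapping(params, lambda x: x)
```

With the identity mapping the output equals the parameters exactly, which is the one-to-one behaviour described above.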
Closes #245 by requiring Python >= 3.8 for install, and only testing on 3.10 & 3.11 (torch not available on 3.12 yet).
Removed unused routine mpol.utils.log_stretch.
Made some progress converting docstrings from "Google" style format to "NumPy" style format. Ian is now convinced that NumPy style format is more readable for the type of docstrings we write in MPoL. We usually require long type definitions and long argument descriptions, and the extra indentation required for Google makes these very scrunched.
The remaining work will be re-raised in a new PR.
Massive WIP branch to collect several library improvement efforts leading up to v0.3.0 release. These will most likely involve a large and coordinated change to the structure of the codebase, so I'm grouping them together so that they land as a single PR rather than incremental and uncoordinated changes to main.
These changes are grouped thematically by their aim to improve the quality of MPoL as a PyTorch library:
improvements to stability of core routines
aligning core usage patterns with established PyTorch idioms (e.g., prioritizing torch.tensor ahead of numpy.array, thinking about memory locations of arrays during optimization loops) should yield speed and stability improvements
documentation changes to be up front about how MPoL (at least the core package) is designed to function as an interferometric imaging library for PyTorch, which might lead to further discussions about what is in/out of scope for the core 'plumbing' package, and what might make more sense in visread (pure numpy visibility manipulations and plotting) or another MPoL-dev package (as 'porcelain', to use Git's terminology).
The proposed changes under consideration are now tracked by the "Architecture + Design" GitHub project board on MPoL-dev (available internally). But here is a first assessment of the planned approach:
Coverage, bug-fix, and 'foundational' changes
Finish up type hinting started by #54, marking modules with 100% coverage with disallow_untyped_defs to prevent regressions.
Typing will likely reveal some inconsistent usage patterns for torch.tensor and np.array, which will beget further architectural redesign.
Identify routines that are under-tested and strengthen for the refactor.
Fix loss function bugs and inconsistencies
237
153
131
100
Remove mpol.utils.convert_baselines and mpol.utils.broadcast_and_convert_baselines, since this functionality now exists in visread (#227).
Rather than pre-package a mock ALMA logo dataset on Zenodo, create this in-package, on-the-fly, using NuFFT and some reference baselines.
Change base unit from klambda to lambda
223
also simultaneous changes to visread
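For reference, the relationship between the units mentioned above is simple arithmetic: a baseline in meters divided by the observing wavelength gives lambda, and klambda is that value divided by 1000. The helpers below are hypothetical names written for this sketch; they are not the visread or MPoL functions.

```python
C_M_PER_S = 299_792_458.0  # speed of light [m/s]


def meters_to_lambda(baseline_m: float, freq_hz: float) -> float:
    """Convert a baseline length [m] to units of wavelength [lambda]."""
    return baseline_m * freq_hz / C_M_PER_S


def klambda_to_lambda(value_klambda: float) -> float:
    """Convert kilolambda to lambda."""
    return value_klambda * 1e3


def lambda_to_klambda(value_lambda: float) -> float:
    """Convert lambda to kilolambda."""
    return value_lambda * 1e-3


# A 1 km baseline observed at 230 GHz, expressed in both units.
u_lambda = meters_to_lambda(1000.0, 230e9)
u_klambda = lambda_to_klambda(u_lambda)
```

Using lambda as the base unit removes the factor of 1e3 from every place where spatial frequencies meet image coordinates in radians.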
Changes to introduce Stochastic Gradient Descent workflow
Add DdidSampler to codebase. Create a documentation page showing how to create a TensorDataset and a DataLoader #162
Documentation/"Getting Started" to explain how MPoL library usage follows the normal ML paradigm with SGD (#188)
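The mini-batching idea behind an SGD workflow can be sketched without any dependencies. In PyTorch itself this role is played by torch.utils.data.TensorDataset together with torch.utils.data.DataLoader (with batch_size and shuffle=True); the pure-Python iter_minibatches function below is a hypothetical stand-in that shows only the shuffling and batching logic, with made-up visibility samples.

```python
import random
from typing import Iterator, List, Sequence, Tuple

# One sample: (u, v, complex visibility, weight). Illustrative layout only.
Visibility = Tuple[float, float, complex, float]


def iter_minibatches(
    samples: Sequence[Visibility], batch_size: int, seed: int
) -> Iterator[List[Visibility]]:
    """Yield shuffled mini-batches that together cover every sample once."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start : start + batch_size]]


# Ten fake visibilities, split into batches of at most four.
samples = [(float(i), float(-i), complex(i, 0.0), 1.0) for i in range(10)]
batches = list(iter_minibatches(samples, batch_size=4, seed=0))
```

In an actual training loop, each batch would be passed through the model and loss function before an optimizer step, so that one pass over all batches corresponds to one epoch.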
Further documentation changes
Reorganize SimpleNet and the entire GriddedDataset workflow as an alternative option to SGD, with documentation about the use cases when one or the other might be preferred. Possibly remove SimpleNet in favor of a torch.nn.Sequential
Consolidate or remove redundant tutorials.
Think about what content makes sense as rendered docs (code demonstrating key library functionality, building blocks, preferably concise) or as a longer .py file in examples/ (actual workflows following the official pytorch/examples). E.g.,
reduce scope of #75 to a "Quickstart/Getting Started" with links to documentation further explaining functionality where necessary
Direct longer content to examples or a new MPoL-dev package implementing 'plumbing' or tutorials.
134
25
Start examples directly from DSHARP datasets, which removes the need to download processed datasets from Zenodo and demonstrates the required CASA work to the user
Make sure all routines render in the API docs
Detailed migration/updated instructions from v0.2.0 => v0.3.0 in changelog, e.g. in the style of Julia Release Notes.
Note this PR supersedes #242 after we renamed the branch from v0.2.1 to v0.3.0.