Massive WIP branch to collect several library improvement efforts. These will most likely involve a large and coordinated change to the structure of the codebase, so I'm grouping them together so that they land as a single PR rather than incremental and uncoordinated changes to main.
These changes are grouped thematically by their aim to improve the quality of MPoL as a PyTorch library:
improvements to stability of core routines
aligning core usage patterns with established PyTorch idioms (e.g., prioritizing `torch.tensor` ahead of `numpy.array`, and thinking about the memory locations of arrays during optimization loops), which should yield speed and stability improvements
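As a sketch of the idiom in question (variable names here are illustrative, not the MPoL API): convert NumPy-loaded data to a tensor once, before the optimization loop, so that no NumPy round-trips happen per iteration.

```python
import numpy as np
import torch

# Stand-in for visibility data loaded via NumPy (e.g., from CASA/visread)
data_np = np.random.default_rng(0).standard_normal(100)

# Convert once, up front; all subsequent arithmetic stays in torch
data = torch.from_numpy(data_np).to(dtype=torch.float64)

model = torch.zeros(100, dtype=torch.float64, requires_grad=True)
optimizer = torch.optim.SGD([model], lr=0.5)

losses = []
for _ in range(20):
    optimizer.zero_grad()
    # pure-torch loss; no per-iteration numpy <-> torch conversions
    loss = torch.mean((data - model) ** 2)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Keeping the data as a tensor (and, on GPU, moving it to the device once before the loop) avoids repeated host/device and NumPy/torch copies inside the hot loop.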
documentation changes to be up front about how MPoL (at least the core package) is designed to function as an interferometric imaging library for PyTorch. This might lead to further discussions about what is in/out of scope for the core 'plumbing' package, and what might make more sense in visread (pure NumPy visibility manipulations and plotting) or another MPoL-dev package (as 'plumbing', to use Git's terminology).
Updates
The proposed changes under consideration are now tracked by the "Architecture + Design" GitHub project board on MPoL-dev (available internally).
The milestone was closed as redundant.
In rough order of planned approach:
Coverage, bug-fix, and 'foundational' changes
Finish up the type hinting started by #54, marking modules with 100% coverage with `disallow_untyped_defs` to prevent regressions.
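For reference, the per-module opt-in could look like the following `pyproject.toml` fragment (the module names below are placeholders for whichever modules reach full type coverage):

```toml
[tool.mypy]
python_version = "3.10"

[[tool.mypy.overrides]]
# placeholder module names; apply to modules with 100% type coverage
module = ["mpol.coordinates", "mpol.gridding"]
disallow_untyped_defs = true
```

Enabling the flag per-module lets fully-typed modules lock in their coverage without blocking work on the rest of the codebase.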
Typing will likely reveal some inconsistent usage patterns for `torch.tensor` and `np.array`, which will beget further architectural redesign.
Identify routines that are under-tested and strengthen them ahead of the refactor.
Fix loss function bugs and inconsistencies: #237, #153, #131, #100
Remove `mpol.utils.convert_baselines` and `mpol.utils.broadcast_and_convert_baselines`, since this functionality now exists in visread (#227).
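For context, the helpers in question convert baselines from meters to units of the observing wavelength; a minimal standalone equivalent might look like this (the function name is hypothetical, not the visread API):

```python
import numpy as np

C_MS = 299_792_458.0  # speed of light [m/s]

def baselines_to_lambda(uu_m, chan_freq_hz):
    """Convert baselines [m] to spatial frequencies [lambda]: u = uu * nu / c."""
    return np.asarray(uu_m) * chan_freq_hz / C_MS

# a 150 m baseline observed at 230 GHz, in units of lambda
u_lambda = baselines_to_lambda(150.0, 230e9)
```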
Rather than pre-packaging a mock ALMA logo dataset on Zenodo, create it in-package, on the fly, using the NuFFT and some reference baselines.
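A hedged sketch of the on-the-fly idea, using a plain direct Fourier sum in NumPy as a stand-in for the NuFFT (the image and baselines below are toy placeholders, not real ALMA data):

```python
import numpy as np

rng = np.random.default_rng(42)

# toy "logo" image: a few bright pixels at known sky offsets [radians]
x = np.array([0.0, 1e-6, -2e-6])   # east-west offsets
y = np.array([0.0, -1e-6, 1e-6])   # north-south offsets
flux = np.array([1.0, 0.5, 0.25])  # pixel fluxes [Jy]

# stand-in reference baselines [lambda], e.g., from a real array configuration
uu = rng.uniform(-1e5, 1e5, size=500)
vv = rng.uniform(-1e5, 1e5, size=500)

# direct Fourier sum: V(u, v) = sum_k flux_k * exp(-2*pi*i*(u*x_k + v*y_k))
vis = (flux * np.exp(-2j * np.pi * (uu[:, None] * x + vv[:, None] * y))).sum(axis=1)
```

An in-package generator like this removes the Zenodo download entirely and keeps the mock dataset reproducible from a seed.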
Change the base unit from klambda to lambda (#223); this also requires simultaneous changes to visread.
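The migration itself is a factor-of-1e3 rescale of stored spatial frequencies (variable names here are illustrative):

```python
# old convention: spatial frequencies stored in kilolambda
uu_klambda = 250.0

# new convention: base unit of lambda
uu_lambda = uu_klambda * 1e3
```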
Changes to introduce Stochastic Gradient Descent workflow
Add `DdidSampler` to the codebase. Create a documentation page showing how to create a `TensorDataset` and a `DataLoader` (#162).
Documentation/"Getting Started" material to explain how MPoL library usage follows the normal ML paradigm with SGD (#188).
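The SGD workflow above would follow the standard PyTorch data pipeline; a minimal sketch with placeholder tensors (not the MPoL dataset classes):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# placeholder visibility data: coordinates, complex visibilities, weights
uu = torch.randn(1000)                           # u coordinates [lambda]
vv = torch.randn(1000)                           # v coordinates [lambda]
vis = torch.randn(1000, dtype=torch.complex64)   # mock visibilities
weight = torch.ones(1000)                        # thermal weights

dataset = TensorDataset(uu, vv, vis, weight)
loader = DataLoader(dataset, batch_size=256, shuffle=True)

n_batches = 0
for uu_b, vv_b, vis_b, w_b in loader:
    # in a real workflow: predict model visibilities at (uu_b, vv_b),
    # accumulate a weighted chi^2 loss, and step the optimizer
    n_batches += 1
```

This mirrors the mini-batch loop users already know from standard PyTorch training, which is the point of the documentation item.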
Further documentation changes
Reorganize `SimpleNet` and the entire `GriddedDataset` workflow as an alternative to SGD, with documentation about the use cases in which one or the other might be preferred.
Consolidate or remove redundant tutorials.
Think about what content makes sense as rendered docs (code demonstrating key library functionality and building blocks, preferably concise) versus as a longer `.py` file in `examples/` (actual workflows following the official pytorch/examples repository). For example:
Reduce the scope of #75 to a "Quickstart/Getting Started" guide, with links to documentation further explaining functionality where necessary.
Direct longer content to examples or a new MPoL-dev package implementing 'plumbing' or tutorials.
#134, #25
Start examples directly from DSHARP datasets, removing the need to download processed datasets from Zenodo and demonstrating the required CASA work to the user.
Make sure all routines render in the API docs
Detailed migration/update instructions from v0.2.0 => v0.2.1 in the changelog, e.g., in the style of the Julia Release Notes.
Though, if we manage to accomplish everything listed here, a v0.3.0 makes more sense than a v0.2.1.