PTB-MR / mrpro

MR image reconstruction and processing.
https://ptb-mr.github.io/mrpro/
Apache License 2.0

Allow adjoint to be used as autograd backward #404

Open fzimmermann89 opened 1 week ago

fzimmermann89 commented 1 week ago

Alternative to #307, using `__init_subclass__` as I suggested in #68.
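For context, a minimal sketch of what an `__init_subclass__`-based wrapper could look like (toy code, not the actual mrpro implementation; class and method names, and the single-tensor signature, are illustrative):

```python
import torch


class LinearOperator(torch.nn.Module):
    """Toy base class: subclasses implement forward (A x) and adjoint (A^H y)."""

    def adjoint(self, y: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError

    def __init_subclass__(cls, **kwargs):
        """Wrap the subclass's forward in an autograd.Function whose backward is the adjoint."""
        super().__init_subclass__(**kwargs)
        original_forward = cls.forward

        class AdjointBackward(torch.autograd.Function):
            @staticmethod
            def forward(ctx, op, x):
                ctx.op = op
                return original_forward(op, x)

            @staticmethod
            def backward(ctx, grad_output):
                # for a linear operator, the vector-Jacobian product w.r.t. x is A^H(grad)
                return None, ctx.op.adjoint(grad_output)

        def wrapped_forward(self, x):
            return AdjointBackward.apply(self, x)

        cls.forward = wrapped_forward
```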

fzimmermann89 commented 1 week ago

@ckolbPTB

fzimmermann89 commented 1 week ago

I removed all typing information from `__init_subclass__`. Mypy will most likely never be able to check inside the autograd wrapper anyway, and, if I am not mistaken, mypy currently works as if there were no adjoint wrapper at all. The forward, the adjoint, and all calls of these will still be type checked.

github-actions[bot] commented 1 week ago

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| **src/mrpro/algorithms/csm** | | | | |
| inati.py | 24 | 1 | 96% | 44 |
| iterative_walsh.py | 15 | 1 | 93% | 37 |
| **src/mrpro/algorithms/dcf** | | | | |
| dcf_voronoi.py | 53 | 4 | 92% | 15, 48–49, 76 |
| **src/mrpro/algorithms/optimizers** | | | | |
| adam.py | 20 | 1 | 95% | 69 |
| **src/mrpro/algorithms/reconstruction** | | | | |
| DirectReconstruction.py | 28 | 16 | 43% | 51–71, 85 |
| IterativeSENSEReconstruction.py | 42 | 23 | 45% | 77–78, 88–98, 113–124, 138–149 |
| Reconstruction.py | 51 | 24 | 53% | 41, 53–55, 79–86, 103–114 |
| **src/mrpro/data** | | | | |
| AcqInfo.py | 128 | 2 | 98% | 174, 214 |
| CsmData.py | 28 | 3 | 89% | 14, 84–86 |
| DcfData.py | 44 | 8 | 82% | 17, 65, 77–82 |
| IData.py | 67 | 9 | 87% | 119, 125, 129, 159–167 |
| IHeader.py | 75 | 7 | 91% | 75, 109, 127–131 |
| KHeader.py | 164 | 17 | 90% | 24, 126–130, 157, 207, 218, 225–226, 229, 236, 275–286 |
| KNoise.py | 31 | 15 | 52% | 39–52, 56–61 |
| KTrajectory.py | 69 | 5 | 93% | 178–182 |
| MoveDataMixin.py | 126 | 14 | 89% | 14, 109, 125, 139–141, 202, 265, 279, 358, 378–379, 396–397 |
| QData.py | 39 | 7 | 82% | 42, 65–73 |
| SpatialDimension.py | 46 | 2 | 96% | 64, 103 |
| TrajectoryDescription.py | 14 | 1 | 93% | 23 |
| acq_filters.py | 10 | 1 | 90% | 47 |
| **src/mrpro/data/_kdata** | | | | |
| KData.py | 105 | 16 | 85% | 107–108, 117, 125, 179–180, 215, 220–221, 240–251 |
| KDataRemoveOsMixin.py | 29 | 2 | 93% | 43, 45 |
| KDataSelectMixin.py | 20 | 2 | 90% | 46, 62 |
| KDataSplitMixin.py | 48 | 3 | 94% | 49, 79, 88 |
| **src/mrpro/data/traj_calculators** | | | | |
| KTrajectoryCalculator.py | 25 | 2 | 92% | 23, 45 |
| KTrajectoryIsmrmrd.py | 13 | 2 | 85% | 41, 50 |
| KTrajectoryPulseq.py | 29 | 1 | 97% | 54 |
| **src/mrpro/operators** | | | | |
| CartesianSamplingOp.py | 50 | 9 | 82% | 49–50, 55–56, 61–62, 88, 91, 114 |
| ConstraintsOp.py | 60 | 2 | 97% | 46, 48 |
| EndomorphOperator.py | 51 | 2 | 96% | 209, 213 |
| FiniteDifferenceOp.py | 27 | 2 | 93% | 48, 113 |
| FourierOp.py | 77 | 1 | 99% | 131 |
| GridSamplingOp.py | 123 | 9 | 93% | 60–61, 70–71, 78–79, 82, 84, 86 |
| LinearOperator.py | 102 | 11 | 89% | 34, 42–43, 47, 51, 75–79, 87, 186, 314 |
| Operator.py | 52 | 1 | 98% | 21 |
| SliceProjectionOp.py | 166 | 8 | 95% | 39, 46, 48, 54, 191, 212, 245, 285 |
| WaveletOp.py | 120 | 5 | 96% | 152, 170, 205, 210, 233 |
| ZeroPadOp.py | 16 | 1 | 94% | 30 |
| **src/mrpro/utils** | | | | |
| Rotation.py | 453 | 28 | 94% | 58–66, 106, 283, 368, 370, 397, 452, 457, 460, 475, 492, 497, 640, 645, 648, 664, 668, 742, 744, 752–753, 993, 1075 |
| filters.py | 62 | 2 | 97% | 44, 49 |
| slice_profiles.py | 45 | 6 | 87% | 18, 34, 111–114, 147 |
| sliding_window.py | 34 | 1 | 97% | 34 |
| split_idx.py | 10 | 2 | 80% | 43, 47 |
| summarize_tensorvalues.py | 11 | 9 | 18% | 20–29 |
| zero_pad_or_crop.py | 31 | 6 | 81% | 26, 30, 54, 57, 60, 63 |
| **TOTAL** | 3638 | 294 | 92% | |

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 828 | 0 :zzz: | 0 :x: | 0 :fire: | 1m 7s :stopwatch: |
fzimmermann89 commented 1 week ago

There is a fundamental issue with using the adjoint as the backward:

During the forward inside the wrapper, autograd is disabled. Thus, gradients from any tensor parameters of self do not flow to the output. Looking at, for example, the GridSamplingOp: the gradient of the operator with respect to x will work, but the gradient with respect to the grid will not.

This is difficult to solve without a custom solution for each LinearOperator.

We might be able to use it for operators that do not depend on any other tensors, but most of our operators would then not be covered by this solution.
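A toy illustration of the missing gradient path (not mrpro code; ScaleOp with its weight stands in for something like GridSamplingOp with its grid):

```python
import torch


class AdjointAsBackward(torch.autograd.Function):
    """Toy wrapper: forward calls the operator, backward calls its adjoint."""

    @staticmethod
    def forward(ctx, op, x):
        ctx.op = op
        # grad mode is disabled in here, and only x is registered as a
        # differentiable input, so op.weight is cut out of the autograd graph
        return op.apply_forward(x)

    @staticmethod
    def backward(ctx, grad_output):
        return None, ctx.op.apply_adjoint(grad_output)


class ScaleOp:
    """Operator with a tensor attribute: y = weight * x, adjoint: conj(weight) * y."""

    def __init__(self, weight: torch.Tensor):
        self.weight = weight

    def apply_forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weight * x

    def apply_adjoint(self, y: torch.Tensor) -> torch.Tensor:
        return self.weight.conj() * y

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        return AdjointAsBackward.apply(self, x)


x = torch.randn(5, requires_grad=True)
weight = torch.randn(5, requires_grad=True)
op = ScaleOp(weight)
op(x).sum().backward()
print(x.grad is not None)       # True: gradient w.r.t. x comes from the adjoint
print(weight.grad is not None)  # False: gradient w.r.t. the weight never flows
```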

I have no idea how best to pursue this.

Maybe we just write a custom autograd function for the Fourier operator?

@schuenke @ckolbPTB

ckolbPTB commented 1 week ago

> There is a fundamental issue with using the adjoint as the backward:

That is a shame!

> maybe we just do a custom autograd for the fourier operator?

Can we provide a custom gradient with respect to x but use PyTorch autograd with respect to, e.g., traj, or would we have to provide our own autograd functionality for each parameter?

fzimmermann89 commented 1 week ago

torchkbnufft also does not work for traj requiring gradients, afaik.

We could. I already started on that work in the torchkbnufft repo, and we can look up the equations in https://arxiv.org/abs/2111.02912 and in the FINUFFT PyTorch wrapper.
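For reference, the kind of relation involved (a sketch for a type-2 NUFFT with trajectory points $k_m$ and grid positions $r_n$; the exact conventions in the paper and in mrpro's FourierOp may differ):

$$y_m = \sum_n x_n\, e^{-i\, k_m \cdot r_n}
\qquad\Longrightarrow\qquad
\frac{\partial y_m}{\partial k_{m,d}} = -i \sum_n r_{n,d}\, x_n\, e^{-i\, k_m \cdot r_n}$$

so the gradient with respect to the trajectory is itself a NUFFT applied to $r_d \odot x$ and can reuse the same machinery as the forward.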

I will try to mock something up tonight or tomorrow morning.

fzimmermann89 commented 4 days ago

This can now be enabled per operator using the adjoint_as_backward setting, i.e. `class Op(LinearOperator, adjoint_as_backward=True)`. The default is `False`; it has to be enabled explicitly for each operator that should use it.
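A minimal usage sketch (the toy ScalingOp, its forward/adjoint, and its shapes are made up for illustration; only the `adjoint_as_backward=True` class keyword comes from this PR):

```python
import torch
from mrpro.operators import LinearOperator


class ScalingOp(LinearOperator, adjoint_as_backward=True):
    """Toy operator y = 2 x; gradients of forward are computed via the adjoint."""

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor,]:
        return (2 * x,)

    def adjoint(self, y: torch.Tensor) -> tuple[torch.Tensor,]:
        return (2 * y,)


x = torch.ones(4, requires_grad=True)
(y,) = ScalingOp()(x)
y.sum().backward()  # backward uses ScalingOp.adjoint instead of a traced forward
print(x.grad)       # tensor of 2s
```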