mdw771 / adorym

Adorym: Automatic Differentiation-based Object Reconstruction with DynaMical Scattering
https://adorym.readthedocs.io/

Conversion of Adorym to support complex number datatypes #6

Closed tcpekin closed 1 year ago

tcpekin commented 1 year ago

This is a huge PR, maybe better suited to its own branch for now while other people test it out. In it, the complex data type replaces the stacked arrays of real and imaginary parts of the various objects (probe, object, etc.).
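
For reference, this is roughly the representation change, sketched on a toy probe array in plain PyTorch (not Adorym code): the old layout stacked real and imaginary parts in a trailing dimension of size 2, while the new one stores a single complex tensor.

```python
# Hypothetical sketch of the dtype change on a toy probe array.
import torch

stacked = torch.randn(64, 64, 2)                      # old layout: [..., 0] = real, [..., 1] = imag
probe = torch.view_as_complex(stacked.contiguous())   # new layout: a single complex64 tensor
assert probe.shape == (64, 64)

# Round-trip back to the stacked layout, e.g. to compare against the old code path.
restacked = torch.view_as_real(probe)
assert torch.equal(restacked, stacked)
```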

It has been "tested" using PyTorch with the three optimizers (Adam, CG, GD) on the cameraman dataset, and the differences look to be within numerical error.

It does not give the same answers with the Autograd backend. This is likely due to the gradient of complex numbers being computed differently in PyTorch and Autograd, and I am half convinced to just remove the Autograd backend or switch it to JAX. To be honest, complex gradients confuse me, so it is hard to have an authoritative idea of how to fix things.
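
To illustrate what I mean (a hedged toy example, not Adorym code): even for a trivial real-valued loss of a complex variable, the two backends may report different "gradients" because their complex-derivative conventions differ, so the two printed values are not expected to agree in general.

```python
# Toy comparison of complex gradient conventions; no particular output is asserted.
import torch
import autograd.numpy as anp
from autograd import grad as ag_grad

def loss_np(z):
    return anp.real(z * anp.conj(z))   # |z|^2, a real-valued loss of a complex input

z0 = anp.array(1.0 + 2.0j)
print("autograd:", ag_grad(loss_np)(z0))

zt = torch.tensor(1.0 + 2.0j, requires_grad=True)
(zt * zt.conj()).real.backward()
print("pytorch: ", zt.grad)
```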

I have not tested it with anything that isn't ptychography, and I also haven't tested it with sparse or regular multislice ptychography.

I think a good goal for this package, if it is to continue being developed, would be some sort of testing framework. My tests are pretty ad hoc and don't generalize to other people; it takes too much work to explain and reproduce them, which is obviously not ideal.
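
Roughly, I imagine something like this (a hypothetical pytest sketch, not wired to Adorym's actual API; `run_cameraman_reconstruction` and the reference files don't exist yet):

```python
# Hypothetical regression test: compare a fresh cameraman reconstruction
# against a reference array committed once from a trusted run.
import numpy as np
import pytest

def run_cameraman_reconstruction(optimizer):
    """Placeholder for a wrapper around the current ad hoc cameraman script."""
    raise NotImplementedError("wire this up to the real reconstruction call")

@pytest.mark.parametrize("optimizer", ["adam", "cg", "gd"])
def test_cameraman_matches_reference(optimizer):
    result = run_cameraman_reconstruction(optimizer)
    reference = np.load(f"tests/reference_cameraman_{optimizer}.npy")
    # "Within numerical error": loose tolerances so backend/version noise still passes.
    assert np.allclose(result, reference, rtol=1e-4, atol=1e-6)
```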

I will still be working on this branch, so I don't think it should be merged quite yet, but if anyone wants to take a look, here it is.

mdw771 commented 1 year ago

Thanks!

I have created another branch, complex_dtype, whose head is right before the merge of your other PR. If you could push this PR to complex_dtype, we can keep it isolated from the main branch for now, until it's ready to be merged.

An automatic nightly test pipeline is absolutely good to have. There was a lot of pain from changes that unintentionally broke some test cases and went unnoticed for several weeks! I thought about setting up Travis CI before, but unfortunately that never happened. I'm hardly able to develop Adorym now as I'm too occupied by my current job, but if you have ongoing development, it's definitely a good idea to set up nightly tests on some small datasets such as the cameraman. Travis CI seems to no longer be free, but GitHub now offers its own free CI tool, GitHub Actions.

tcpekin commented 1 year ago

OK, I wouldn't merge this quite yet. I am going to stop testing manually for now and work on setting up a testing pipeline. That will be new to me, so it might take a bit; if you have any advice, please share :).

One thing that has been bothering me is how the probe positions update: they often move in odd block patterns (i.e., all the positions in a batch move uniformly in one direction, independently of other batches). I'm not sure exactly what is going on there.

But anyways, I'll start setting up a testing pipeline on this branch and we can see how it goes.

mdw771 commented 1 year ago

There shouldn't be any built-in constraints that force the probe position updates to be the same for all positions within a minibatch. If this is observed during early iterations, it might be due to Adam, whose first and second moment estimates are unreliable at the beginning of the optimization; in some cases plain gradient descent actually does better than Adam. This is also a recognized issue in the deep learning community, and warm-up iterations or improved optimizers such as RAdam (https://arxiv.org/abs/1908.03265) can be used to alleviate it.
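
For example, a generic PyTorch-style warm-up (just a sketch, not Adorym's optimizer code) could ramp the learning rate over the first few epochs so that Adam's early, unreliable moment estimates don't produce large coordinated jumps:

```python
# Generic warm-up sketch: linearly ramp Adam's learning rate over the first epochs.
import torch

params = [torch.zeros(10, requires_grad=True)]   # stand-in for probe positions
optimizer = torch.optim.Adam(params, lr=1e-3)

warmup_epochs = 5
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: min(1.0, (epoch + 1) / warmup_epochs)
)

for epoch in range(20):
    optimizer.zero_grad()
    loss = (params[0] ** 2).sum()                # dummy loss for illustration
    loss.backward()
    optimizer.step()
    scheduler.step()                             # lr ramps up, then stays at 1e-3
```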