Question: Field-varying PSF?

AstarVienna / ScopeSim

A telescope observation simulator for Python.

GNU General Public License v3.0

13 stars 10 forks source link

Question: Field-varying PSF? #193

Open joao-aveiro opened 1 year ago

joao-aveiro commented 1 year ago

For simulating telescopes such as the ELT, it is important to consider variations of the PSF across the field. I believe the purpose of the FieldVaryingPSF class in the psfs.pymodule is precisely this, but it seems that it is not up-to-date or that the development stalled in this regard. More so, in irdb there are no uses of this class (apart from some commented configurations) and no example FITS is provided so one can build a valid input for this class. As such, I would like to ask a few questions:

What is the development status of this class and, overall, of using field-varying PSFs?
The documentation (namely docstrings) are lacking. Could the contributors who implemented this class kindly complete this and provide additional information that might be useful for the end-user?
Is there any example FITS containing field-varying PSFs that one can use as an example to create custom cases?

hugobuddel commented 1 year ago

Hi @joao-aveiro, we actually (briefly) discussed the field-varying PSF today! @astronomyk is the author of the FieldVaryingPSF effect, but let me give my thoughts.

We recently added code coverage to our tests, and the FieldVaryingPSF is mostly covered. Your best starting point is probably test_FVPSF.py.

It uses test_FVPSF.fits as input, and although I'm only superficially familiar with the effect, it seems it could perhaps be the example you are looking for:

>>> from astropy.io import fits
>>> hdus = fits.open("./scopesim/tests/mocks/MICADO_SCAO_WIDE/test_FVPSF.fits")
>>> len(hdus)
3
>>> hdus[0].data
>>> hdus[1].shape
(90, 90)
>>> hdus[1].data
array([[0., 0., 0., ..., 6., 6., 6.],
       [0., 0., 0., ..., 6., 6., 6.],
       [0., 0., 0., ..., 6., 6., 6.],
       ...,
       [2., 2., 2., ..., 8., 8., 8.],
       [2., 2., 2., ..., 8., 8., 8.],
       [2., 2., 2., ..., 8., 8., 8.]])
>>> hdus[2].data.shape
(9, 5, 5)
>>> hdus[2].data[3]
array([[0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0.]])
>>> hdus[2].data[4]
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

There are three HDUs:

The primary HDU without data.
An HDU with a 2D array that explains which location on the field should use which PSF.
An HDU with a 3D cube where each layer is a PSF. The cells in the 2D array in HDU 1 are (I think) indices to the layer in this cube.

Now I don't know how this is used in practice. E.g., how the array in HDU 1 is used. There is also some information in the HDU headers that is probably relevant. I hope you are able to figure things out from here.

I'll use this issue to ensure we improve the documentation, as you request in your second question. Maybe it would be good to expand the mock instrument that is used for the tests to also use it as an example instrument for new users. Or maybe we could add another notebook demonstrating the field varying PSF. In addition to improving the docstrings that is.

What we discussed today is that the FieldVaryingPSF is too simple. It simply selects a specific PSF to use at a specific location, resulting in discrete jumps in which PSF is used. It would probably be better to perform some kind of interpolation between the PSFs that are specified in the PSF file. That is for later though. A workaround would be to create a huge set of PSFs; that is, do the interpolation when creating the FITS file with PSFs.

Hope this helps.

joao-aveiro commented 1 year ago

Thank you for the detailed explanation and for pointing me to these tests.

I have concluded that this FieldVaryingPSF class probably won't be adequate for me, due to the discrete "borders" that might appear in the transition between PSFs, as you stated.

I will try making a custom class that uses interpolation (just bi-linear, for now) to avoid this effect. If it gets mature enough, would it be interesting to merge this into the main branch of ScopeSim? If so, is there anyone on your team that could help? I know this is probably not at the top of your priority list, but I believe with your help this could be done quite quickly. I am currently testing some different approaches to perform and optimize the interpolation; in the meantime, it would be extremely helpful if someone could help with generating some test cases for validation or even structuring the class - since someone who is familiar with the structure and working principles of ScopeSim would be much more comfortable in implementing its logic (e.g. handling the different wavelength layers, managing the headers, etc.).

hugobuddel commented 1 year ago

A FieldVaryingPSF that uses interpolation would be appreciated by the rest of the community. I can probably find some time to help you with integrating it into ScopeSim. I don't yet have much experience with the PSF-part of ScopeSim; this seems a good reason to learn more, and otherwise @astronomyk can help.

Your experiments about how to best interpolate the PSFs would be valuable, since you are the best judge of what works for your end goals. Bilinear seems to be the safest, because it would prevent problems like negative fluxes.

One other thing that I'd like to tackle is that, from what I've understood, the current way PSFs are applied to point sources is not precise enough to derive sub-pixel astrometric solutions, but that probably goes beyond this specific issue.

joao-aveiro commented 1 year ago

I have been quite busy these past weeks, but I have finally put together some basic algorithms to perform the proposed task. Please check this commit on my personal fork, and see the new, preliminary (i.e. broken) GridFieldVaryingPSF class.

What's done?

Basically, I implemented a linear interpolation algorithm that, given a MxNxIxJ grid of PSFs (a grid of MxN PSFs across the field, each with size IxJ), it provides the interpolated PSF at any pixel position. Additionally, I implemented a custom convolution such that one can perform the interpolation and convolution sequentially to avoid performing all the interpolations beforehand (and risking running out of memory or storage). Thus, the field varying PSF convolution consists of iterating over every pixel and, for each, computing the PSF at that position and convolving.

My implementation might look very crude, and you'll see that I don't use most of the familiar Numpy and SciPy routines, but this is done purposely to allow for Numba compilation and parallelization.

What's missing?

This is just a sketch and it is by no means working. The following still remains to be defined or implemented:

Define the structure of the input PSF grid;
Read PSF grid coordinates and perform correct positioning of the PSF w.r.t. the detector;
Normalize the PSFs (if needed);
Expand for multiple wavelengths;
Correct the header (?);
(Optional, for now) Implement nearest-neighbour interpolation);
Validate.

For now, I'll look into the positioning of the PSF and the projection between real PSF coordinates and detector coordinates, as well as the normalization of the various PSFs in the grid. Help would be appreciated in implementing the functionality to work with multiple wavelengths, reading the PSF grid from a file, and implementing what's missing in the class initialization so it works with ScopeSim. Also, when this is somewhat more mature, it would be important to create some tests for validation.

Can anyone help me with this?

hugobuddel commented 1 year ago

Really nice @joao-aveiro ! Your code is well written and easy to understand. I don't recall ever actually having worked with numba, so this is a good reason to start!

The main thing needed to make this work, seems to be this projection between coordinates you mention, is that right? Do you have an idea on how to approach that?

One thing that came to mind is that it would be best if we would try to decouple the structure of the psf_grid from the detector details. That is, that changing the detectors should preferably not require a change in the psf_grid. Related, the apply_to function works on Field of Views, which do not necessarily correspond to a detector. (In practice, the DetectorArray is usually the biggest factor in determining the FoVs, but Effects in other OpticalElements can influence them too.) So maybe it would be best if the MxN grid would be in some kind of physical units?

By the way, I was thinking that it could also be useful to define the PSF's in some other way than in a grid. For example, maybe it could be useful to take a real observation, measure the PSF of some isolated stars on the image, and then use those as input to the FieldVaryingPSF. That would probably be easier with some data structure like a dictionary where the keys are a coordinate-tuple (in some not-yet-defined units) and as values these IxJ shaped PSFs. Then the interpolator could find three points that surround a given point and then interpolate within that triangle. That should perhaps be a separate class though, because well, yours even has Grid in the name!

The other things you mention seem relatively minor (perhaps still essential though). It seems that I could assist with those. But it will take a bit of time, because we have the METIS consortium meeting in a few weeks, which currently takes up most of our time.

Validation: that would be pretty good to have yes. Besides some unit tests, it would also be good to have some integration tests. Currently we use jupyter notebooks in the IRDB for documentation, and I kinda hijacked that as a poor way of doing integration tests..
Multiple wavelengths: Would you see that as adding a dimension to the MxN grid, or to the the IxJ arrays (making them cubes)?
Normalization, that should not be necessary if the PSFs are normalized to start with, right?
Correct the header: this would not be necessary in your case, because you remove the padding (if I understand correctly), while the padding is kept in the other PSF classes (I think). Keeping the padding would make sense if there are multiple FoVs that are tightly 'packed' together, where a single detector spans multiple FoVs. Then the padding would make the FoVs overlap slightly, and then those FoVs are stacked into an image plane and then flux is conserved.
Nearest-neighbour interpolation: we could add that when we need it I think.

I think the best way for me to start is to write some simple validation and integration tests for the PSFs in general. Those could then provide some sanity checks on flux conservation, normalization, etc. for all PSF classes. Perhaps catch some off-by-one errors and so on.

It seems we (well, currently you) are on the right track! Thanks for getting this ball rolling!

joao-aveiro commented 1 year ago

@hugobuddel Thank you for your kind words and all the feedback. I'll try to answer some of your questions/remarks.

Regarding the projection, I probably mentioned projecting onto the detector frame, but what I meant was to project onto the FOV (my bad). I have to check again how this is achieved in the other classes, but I was thinking about the input FITS containing the PSF grid having the pixel coordinates of the central pixel, the position of that centre pixel in the sky (e.g. mas from the centre of the FOV - does this make sense?), and the spacing between PSFs in the grid (e.g. also in mas). I believe we have the same similar information about the FOV, so calculating the relative coordinates for interpolation would be straightforward, I think. Does this make sense? The only disadvantage of this is that it assumes that we have a regular (i.e. equally spaced) grid of PSFs, which might not always be very practical. Nonetheless, I believe that is the exact purpose of the proposed class, so other functionality (e.g. working with non-regular grids) might be envisioned in the future, but separate from this.

Related to this, with respect to your suggestion about allowing an interpolation using arbitrarily positioned PSFs (or sources), I believe it would be interesting and useful, but it will not be as straightforward as this simple class I have made. My argument is that, whilst using a regular grid and applying bi-linear interpolation can be extremely optimized, using arbitrary positions for the interpolation requires more complex algorithms that, albeit being possible to optimize, they are not as performant or straightforward. My self-imposed requirements for this grid class were that it could run in a reasonable amount of time (a few hours on a capable server?) with grids containing hundreds of PSFs of 1024x1024 px and images of 2048x2048 px. I have yet to confirm this goal has been achieved, but I have made my best effort in that regard. I think that for what you are proposing we'll have to accept that it will only work for much simpler cases, but, with that, we probably won't have to rely on such an "obscure" implementation as this one for optimization purposes, and possibly make use of existing methods (scipy.interpolate.LinearNDInterpolator ?). We can delve further into this after having a working example of the simple bi-linear interpolation working, do you agree?

joao-aveiro commented 1 year ago

As for the remaining points:

Validation: After deciding on a "structure" for the input FITS, I believe the ideal would be to generate some simple but known cases where we can analytically calculate the result and compare it with the one provided by the class. Ultimately, I believe these should be included in ScopeSim's test pipeline, but for now, I believe what is most urgent is to validate that what we are doing is correct, so even running a few test scripts by hand would suffice. Then, eventually, we can integrate these into the pipeline.
Multiple wavelengths: To be honest, I haven't really looked in depth at how this is handled in the other PSF classes. From what I've seen (and correct me If I'm wrong), the input FITS files provide multiple ImageHDU, one for each wavelength, and the PSF class selects the closest PSF/ImageHDU (in terms of wavelength) for each FOV. If this is the case, there probably isn't much to do, and the FITS file will be similar to the other cases, but instead of each ImageHDU being a 2D PSF, it is a cube (actually, a 4D tensor) containing the PSF grid for that wavelength. I'll look into this after taking care of the coordinate conversion/projection and such, but if someone from your team could attest that this is in fact how it works and would specify the structure of input FITS for this new class so that it is the most consistent across other PSF FITS files, it would be extremely helpful.
Normalization: Well, yes, I do use normalized PSFs, so this is not so important for me. But since the other PSF classes include an optional normalization method, I believe we should be consistent and provide one here too, no? I believe it is straightforward to do, so no problem there, but still I will run some tests to check if this can be done efficiently for a large PSF grid and that the RAM from common laptops doesn't burst into flames while loading, let's say, a 100 x 100 grid of PSFs each with 2k x 2k px.
Correct the header: To be honest, for now, I can't comment much on what you've written, since I haven't really understood what all this header manipulation is trying to achieve. I'll have to look at it a bit more and try to draw my conclusions. Nonetheless, I am a bit averse to the use of FITS files (or their structure) in the classes - as it is often done -, so I'll try to advance without relying on them (and just using normal Python/Numpy data structures) and, in the end, I'll create the required methods for interfacing with (i.e. ingesting and manipulating) the associated FITS files.

My apologies if I've rambled a bit too much or if my brainstorming is not clear enough, but I hope that you found it useful nonetheless. I'll keep you posted about my progress, and I hope to hear from you in case you want/can join me in any of these tasks.

hugobuddel commented 1 year ago

You seem to have a coherent plan. Some thoughts.

Projection: Your suggestion to do everything w.r.t. to the center of the observation in mas seems great. I'm not sure how other classes deal with that, but I believe in the same way. Note that the center of the FoV in pixels is usually, but not necessarily, the center of the observation, see below. [It could be that there are currently no cases where the FoVs are not centered around the observation though, but it is my understanding that they don't have to be centered.] Related, at some point there should be a bit more pointing-capabilities in ScopeSim; so it becomes easier to do things like simulate dithered observations. That shouldn't necessarily cause any problems though, I'm just mentioning it because at that point we probably will need to think this through again for all classes.

Wavelengths: I think you're right about the wavelengths. Perhaps it would be easiest if you first focus on getting the the single-wavelength version working as expected, then I can verify that the other classes deal with it as you discussed; as I want to understand this better anyway. Things would perhaps become more complicated if you would also want to interpolate in the PSF axis.

FoVs: Another thought, Effects can also split a FoV on wavelength (if I understand correctly), so maybe this effect could do that too. I haven't yet learned how this works exactly though.

Non-grids: Let's indeed ignore this thought about non-grids for now.

Performance: The main goal of ScopeSim is to create simulations within minutes (or seconds). It is fine to have also have effects that take longer, especially when there are similar effects that are fast (but of lesser quality). So this will be fine.

Correct the header: Just like Effects can split the FoV in wavelength ranges, they can also split the FoV by area. Usually you'd have one FoV that covers the full detector. However, it could also be that there are multiple FoVs, that together fill the detector. Consider the edge of two FoVs that the detector both 'sees'. Then if you'd have a point source just on the side of that edge and apply the PSF, then it would leak flux beyond the border which would be lost. This flux is kept by padding the FoV (making it overlap with the one next to it). Making the FoV larger, requires recalculating the projection used by that FoV; in particular, which pixel is the 'central one'.

FITS headers: Using FITS header structure internally is mostly for convenience (since the data ends up in FITS files). The important point is that the projection information should be stored somehow, and the way this is done in FITS files is essentially an industry standard. (I'm sure ScopeSim could improve on how it deals with projections though.)

My apologies that I can often provide less insight than I would like for many of the things you encounter, because I'm still learning many details myself. So far I've been mainly involved in other aspects of ScopeSim. I hope that I'm not telling you wrong things, that would be even worse than not saying anything.

joao-aveiro commented 1 year ago

Hey @hugobuddel, here is some news on this topic.

I have implemented most of the missing blocks for this to work correctly. Please check my fork and tell me your thoughts. I have provided extensive docstrings and comments for the new classes and functions, so I hope it is clear. I have also included some very simple tests/examples (which are not meant to be merged into ScopeSim) so you can try this.

Some notes:

I have yet to implement this for more than one wavelength;
I haven't looked into what is required to be done for multiple sensors/FOVs;
There are several TODOthroughout; these are mostly things I don't know, so it would be helpful if you or someone on your team could provide some info about them or ideally finish the missing parts;
With this implementation, I have added joblib and numba as dependencies (I also updated the requirements and setup, though it has to be tested and specific versions specified); I hope this is OK;
Due to the use of f-strings, these additions are not compatible with Python 3.5; this can be easily modified if needed, but it seems that other places in the codebase already break compatibility, so I left it this way since it's more readable;
No tests made yet; any help regarding this would be valuable.

The only exception is in some of the new effect class methods, where I believe the code is not clear enough, not precisely defined, or no type hints are provided. I am referring to, for example, the apply_to() method in all the PSF classes. I just tried copying the other classes, but it is unclear what the parameter obj should be. I see that there is a class attribute that defines the possible convolution classes, but even this is not very clear. More so, the base classes used in isinstance(obj, ...) do not define some of the methods (even if only as abstract methods) used subsequently, which leads to warnings on my IDE and overall some confusion as to whether specified object even provides said API. As such, I believe it would be extremely useful to clearly define which are the possible objects used as an argument in apply_to, possibly make abstract base classes accordingly, and guarantee that these are correctly inherited where needed. The extensive use of kwargs and objects holding various parameters that probably should be class attributes (e.g. meta) also makes this even more confusing and prone to errors or unexpected behaviour. I believe at this point I'm already missing the focus of this Issue, but I just wanted to provide some feedback on the difficulties experienced by someone (me) who is new to this codebase and tries to implement new functionality.

P.S.: (Unrelated) I don't know if you are also working in the skycalc_ipy codebase, but a few weeks ago I made a PR that fixes using this package and ScopeSim at my campus. It includes very simple modifications, so I believe it shouldn't cause any problems, and it would be really helpful if it could be merged.

hugobuddel commented 1 year ago

Sorry for not getting back to you! I'm moving and it is a bit chaotic (sorry again). I'd love to take a closer look. Perhaps you could make a Pull Request for fvpsf branch? That's fine even if you do not intend to merge it as-is, just use "Draft: " as title or something like that.

Yeah I feel your pain that not everything is as easy to understand as it should be. I'm now mainly focused on making it easier to test and maintain ScopeSim and associated packages, that would make it easier to make nicer ABC's (Abstract Base Classes, such a nice acronym) and make other things easier.

Sorry I missed your skycalc_ipy PR! Not sure how that happened.

hugobuddel commented 1 year ago

Thanks for your code @joao-aveiro , looking good! Again my apologies for not getting back to you earlier.

And also thanks for highlighting how we can improve the codebase, I'll take your comments to hearth and work on making things more clear; I myself am still often confused too! More type annotations should be a good start.

I'm confident your code correctly does what you want, as it is well written and you are happy with it. The example you wrote provides sufficient basis for an example notebook and some tests. The only real question I have is about a factor of 1.5, see #208 .

But I now realize that I'm missing domain knowledge to formulate sensible questions, for example with reusability of the psf grid. In particular, could you teach me what is the physical effect that causes the PSF to vary?

If the sky causes the PSF to vary, then it would make physical sense to be able to reuse the same psf_grid.fits for different instrument observing the same part of the sky, but not for a different part of the sky.
If the instrument causes the PSF to vary, then it might make physical sense to reuse the same grid for different parts of the sky, but not for different instruments.
Or would one always need a new PSF grid for each observation?

There is still some work to do w.r.t. different FoV's and different wavelengths, but I'm fine with doing that later. I think everything is still 'correct' if we would not implement that. Perhaps I should take the lead in this, unless you have a pressing need for this functionality yourself.

Don't worry about supporting Python 3.5. I've put some effort into modernizing ScopeSim and related packages and decided to only support Python 3.8+. I'm also updating the CI to test on linux, macos, and windows, etc. That will give me more confidence in making changes.

Perhaps it would be good if I'll write some example notebooks, documentation, and tests for the FieldVaryingPSF. Then you can verify that I understood your code correctly :-).

Related, there is also interest from MICADO for a field-varying PSF. I will discuss this with them next week and have already pointed them to this issue. I'll keep you informed; perhaps it would be good to also have a talk between the three of us.

joao-aveiro commented 1 year ago

The 1.5 factor is indeed an error left forgotten from some tests I did. My apologies and thank you for pointing it out.

Regarding the questions/comments regarding the PSF variability, I am by no means an expert but I believe one of the most important factors is angular anisoplanatism and the resulting degradation in wavefront correction in AO-powered systems, which varies across the field. Since the PSF here is modelling the telescope/optical system, I don't believe that different observation targets will affect the PSF directly (thought I might be wrong). Though I say they won't affect it directly because they affect it indirectly, e.g. if the seeing conditions are different or the guide stars used differ, the AO correction performance will vary and so will the PSFs. But this is something that is also true for the field-constant PSFs, and which is out of the scope of the capabilities of ScopeSim (since it doesn't simulate the systems' PSFs). So I believe it would be acceptable to provide, for each system, some kind of example FV-PSF for median conditions (which is probably the approach used currently with field-constant PSFs in irdb). Otherwise, for any specific observation targets and/or observation conditions, it would be the user's responsibility to provide the most suitable PSF grid for that specific case (which is what I am doing).

One thing that I believe might be wrong in both my code and in the other PSF classes is the way the PSFs are resampled (=rescaled=interpolated) to the FOV pixel width. They will probably work adequately for most cases, but, for example, if by chance the PSF does not have an even number of pixels and/or the centre pixel is not psf.shape // 2, I believe some error will be introduced. In the PR there is a custom rescaling approach that I made to try to correct this, but I believe that it is not 100% correct either. Though I have a modified and, hopefully, corrected method that I will soon push once I am confident it is working as planned.

Finally, I am glad this might be useful to the MICADO team and, hopefully, to other users. The following weeks will be quite difficult for me, but maybe we can have a quick chat one of these days, as you suggested. Here is my email: javeiro@fe.up.pt; feel free to send me an email so we can discuss this or other matters in more detail.

hugobuddel commented 1 year ago

Thanks for the explanation. Thinking about it more after writing my question (sometimes that helps), I realized indeed that we're talking about the residual PSF after AO correction, which will vary per per exposure (because the atmosphere varies). My personal work with field varying PSFs is limited to Hubble images (using tinytim), where the situation is very different (no atmosphere). For now I conclude that we should not worry about generalizability too much now, and go ahead with what you have made, see what use-cases it works well with, and then take it from there.

One thing that I will do is create a test case where I simulate the same field with the different PSF methods and try to get them all to agree. That is, if we set all PSFs in the grid to be the same Gaussian, then the result should be the same for a FieldConstantPSF with the same Gaussian, etc. Or at least, we should be able to explain the differences.

About the resampling problem: thanks for noting this. For MICADO it is necessary to have sub-pixel accuracy, up to 1/30th of a pixel IIRC. So it will be essential to get the resampling right. Perhaps #158 is relevant here?

hugobuddel commented 1 year ago

The plan we have at the moment:

In general for field-varying PSF:
- [ ] HB: Create a test that sets all PSFs in grid to be the same and check whether result is equivalent to the fixed-psf. Do this for all PSF methods
- [ ] HB: Create a test to see whether the flux is conserved
- [ ] HB: Create a test that allows us to see whether the desired sub-pixel accuracy is reached, e.g. using the sub-pixel perfect centroiding from poppy: https://github.com/mperrin/poppy/blob/master/poppy/fwcentroid.py
- [ ] (later) HB: Improve the subpixel accuracy.
For this PR:
- [ ] HB: Fix the off-by-one-error #209 (and then perhaps replace the transformations in this branch)
- [ ] João: Provide some slides that describe the method in this PR
- [ ] HB: Add varying wavelength support to field-varying PSF-s
- [ ] HB: Merge this code
For Maria's code:
- [ ] HB: Learn from Mariia's notebooks.
- [ ] Mariia / HB: Create a better implementation

carmeloarci commented 1 year ago

Hi, please, keep me in the loop about the fv PSF.