Validation of Physical Layer Model with simulations

jeipollack commented 4 months ago

To validate the Physical Layer PSF model, we will use simulations that include a Zernike prior. Metrics and plots should be generated and compared with those reported in Tobias' thesis (section 6).

The simulations are located on Jean-Zay: /gpfswork/rech/ynx/commun/data/euclid_sims.

Check out a new branch from case_study_psf_decontamination to use for validation
Update the configuration files: data_config.yaml, training_config.yaml, etc. with the configuration settings described in Tobias' thesis
Run WaveDiff for each dataset
Produce plots similar to those in the Chapter 6: Results section
Evaluate whether the results resemble those in the thesis

nadamoukaddem commented 4 months ago

What does 'SFE' stand for in the dataset names, and do the 'nm' numbers refer to the SED resolution?

jeipollack commented 4 months ago

The answers to your questions (SFE = Surface Errors, and the "XX nm" values are the WFE RMS of the error maps that were used to build the Zernike prior) are explained in section 6.2 of Tobias' thesis.

nadamoukaddem commented 4 months ago

Thank you, Jennifer. I am reading Chapter 6 from the beginning so I don't miss anything. I have some questions.

Tobias mentioned in his thesis that he used only the dataset with SFE. Why do we need the one with no_SFE?
What is the number of stars in these simulated datasets?
Are the plots to be produced only for the waveDiff-polygraph model?
Why do we have a different number of Zernikes in the training configuration (15) compared to the metrics configuration (45)?
The same goes for the number of bins: 8 for training and 20 in metrics.
What does the parameter M=64 mentioned in the thesis refer to?

jeipollack commented 4 months ago

Hi Nada, it’s good that you have read chapter 6 and are aiming to develop some understanding of this task.

As I believe one of your goals is to do research, I want to remind you that a fundamental part of being a researcher is “Reasoning”.

So, with your questions, first try to determine the answer yourself. To aid you, you can look at other parts of Tobias’ thesis by doing keyword searches. For some questions the answer may be obvious, whereas for others your reasoning may come with some uncertainty. It is for those questions, where you think your reasoning is uncertain, that you should ask for clarification.

The point is: don’t just ask what something is or why. Try to determine this yourself and present what you think along with your question.

I will answer one of your questions directly.

You need to run WaveDiff on the model with the physical layer. You will see which one it is in the training config file in the case study branch.


nadamoukaddem commented 4 months ago

I noticed the M parameter in the thesis, but it's not in the configuration file, so I was wondering why. You mentioned comparing the plots to those in the thesis, which were done with the dataset with SFE. When I apply it to the no_SFE dataset, what should I compare it to? Additionally, Tobias used different models of WaveDiff in his thesis, but the configuration files only mention implementation for the wavediff polygraph. Perhaps in older versions of WaveDiff, the other models were implemented but I've never worked with older versions. I'm asking questions because we don't have much time for this case. Could you please provide a timeline for finishing this task?

jeipollack commented 4 months ago

Have you launched any runs at all? Before asking you to do this task, I tested that running was possible for one of the datasets. So, it should be possible to do so even with gaps in understanding.

WaveDiff v2.0 does not include these additional models. It provides only the Polychromatic model ("poly") and, now, the Polychromatic model with the Physical Layer ("physical_poly"). As I wrote in the description and as we discussed in the past PSF meetings, the goal of this task is to validate the new PSF model with the physical layer.

The different options, "complete", "parametric", and "non-parametric", are set with this option in the configuration file. While I have only tested the "complete" setting, the other settings should work. If they don't work, let me know.

@tobias-liaudat is going to share the private repository which you can fork. It contains a set of notebooks you can use to regenerate the plots with the new PSF model. Note there may be other adaptations you have to do besides adjusting the paths to the results.

nadamoukaddem commented 4 months ago

I launched the training part, but I set the batch size to 16 so that I don't encounter the ResourceExhausted error.

jeipollack commented 4 months ago

@nadamoukaddem can you share the error in your log here?

nadamoukaddem commented 4 months ago

This is the metrics configuration file: metrics_config.log and this is the output: metrics_05_seed1.log

jeipollack commented 4 months ago

Hi @nadamoukaddem, the problem is that the PSF model weights file is not found in the trained_model_path you provided. I copied a portion of the log below, where the code prints the path from which it will attempt to load the psf_model weights.

2024-04-15 11:01:55,005 - wf_psf.utils.read_config - INFO - Loading.../gpfswork/rech/ynx/uch76qv/tests/outputs/wf-outputs/wf-outputs-SFE-05nm-seed1/c$
2024-04-15 11:01:55,048 - wavediff - INFO - <wf_psf.utils.configs_handler.MetricsConfigHandler object at 0x1522d5adb070>
2024-04-15 11:01:55,066 - wf_psf.psf_models.psf_models - ERROR - PSF weights file not found. Check that you've specified the correct weights file in $
Traceback (most recent call last):

The files in this directory are:

(tensorflow-2.9.1+py3.10) [uuu68hq@jean-zay2: config]$  ls /gpfswork/rech/ynx/uch76qv/tests/outputs/wf-outputs/wf-outputs-SFE-05nm-seed1/psf_model/
checkpoint  psf_model_physical_poly_no_sfe_id-11_cycle2.data-00000-of-00001  psf_model_physical_poly_no_sfe_id-11_cycle2.index

These files do not have the same id_name specified in /gpfswork/rech/ynx/uch76qv/tests/outputs/wf-outputs/wf-outputs-SFE-05nm-seed1/config/training_config.yaml.

training:
  # ID name
  id_name: sfe_05nm_seed1

Another thing I noticed about this training_config.yaml file is that it is missing the configuration parameter use_prior. You need to set it in order to use the Zernike prior from these datasets; otherwise the prior will not be used. You should cancel all of the runs you deployed, add use_prior=True to training_config.yaml, and resubmit the jobs. It's my fault you have to do this, because I forgot to commit this change last week. I have just pushed it to the remote.
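The id_name mismatch described above can be caught before launching the metrics run. The following is a minimal sketch, not part of wf-psf: the helper name weights_matching_id is hypothetical, and it assumes only that the weights file names contain the id_name from training_config.yaml, as the directory listing above suggests.

```python
from pathlib import Path


def weights_matching_id(psf_model_dir, id_name):
    """Return the sorted names of files in psf_model_dir that contain id_name.

    Hypothetical helper: it assumes the weights file names embed the
    id_name set in training_config.yaml.
    """
    directory = Path(psf_model_dir)
    return sorted(
        path.name
        for path in directory.iterdir()
        if path.is_file() and id_name in path.name
    )


if __name__ == "__main__":
    import sys

    # Usage: python check_weights.py <psf_model_dir> <id_name>
    if len(sys.argv) == 3:
        matches = weights_matching_id(sys.argv[1], sys.argv[2])
        if matches:
            print("Found weights files:", ", ".join(matches))
        else:
            print("No weights file contains id_name", repr(sys.argv[2]))
```

An empty result for the id_name in the training configuration would point to the same problem as the "PSF weights file not found" error in the log.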

nadamoukaddem commented 4 months ago

Hi @jeipollack, I am having this error:

2024-04-16 16:13:56,641 - wavediff - INFO - #
2024-04-16 16:13:56,642 - wavediff - INFO - # Entering wavediff mainMethod()
2024-04-16 16:13:56,642 - wavediff - INFO - #
2024-04-16 16:13:56,642 - wf_psf.utils.read_config - INFO - Loading.../gpfswork/rech/ynx/uch76qv/tests/config/config_SFE_01_seed1/training_config.yaml
2024-04-16 16:13:56,664 - wf_psf.utils.read_config - INFO - Loading.../gpfswork/rech/ynx/uch76qv/tests/config/config_SFE_01_seed1/data_config.yaml
2024-04-16 16:14:00,347 - wavediff - INFO - <wf_psf.utils.configs_handler.TrainingConfigHandler object at 0x14ff0486b370>
2024-04-16 16:14:00,356 - wf_psf.data.training_preprocessing - INFO - Reading in Zernike prior...
2024-04-16 16:14:00,356 - wavediff - ERROR - Check your config file /gpfswork/rech/ynx/uch76qv/tests/config/config_SFE_01_seed1/configs.yaml for errors. Error Msg: 'zernike_prior'.
Traceback (most recent call last):
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/run.py", line 92, in mainMethod
    config_class.run()
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/utils/configs_handler.py", line 202, in run
    train.train(
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/training/train.py", line 295, in train
    psf_model = psf_models.get_psf_model(
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/psf_models/psf_models.py", line 155, in get_psf_model
    return psf_factory_class().get_model_instance(*psf_model_params)
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/psf_models/psf_model_physical_polychromatic.py", line 52, in get_model_instance
    return TFPhysicalPolychromaticField(
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/psf_models/psf_model_physical_polychromatic.py", line 106, in __init__
    self._initialize_parameters_and_layers(
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/psf_models/psf_model_physical_polychromatic.py", line 137, in _initialize_parameters_and_layers
    self._initialize_zernike_parameters(model_params, data)
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/psf_models/psf_model_physical_polychromatic.py", line 155, in _initialize_zernike_parameters
    self.zks_prior = get_zernike_prior(model_params, data)
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/data/training_preprocessing.py", line 342, in get_zernike_prior
    zernike_contribution_list.append(get_np_zk_prior(data))
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/data/training_preprocessing.py", line 222, in get_np_zk_prior
    data.training_data.dataset["zernike_prior"],
KeyError: 'zernike_prior'
2024-04-16 16:14:00,429 - wavediff - INFO - #
2024-04-16 16:14:00,430 - wavediff - INFO - # Exiting wavediff mainMethod()
2024-04-16 16:14:00,430 - wavediff - INFO - #

jeipollack commented 4 months ago

Can you check whether your training dataset that you are loading contains the key zernike_prior?
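One way to run this check is sketched below. It is only an illustration: the helper names are hypothetical, and load_dataset assumes the dataset is a Python dict pickled into a .npy file, which may not match how your files are stored.

```python
REQUIRED_KEYS = ("zernike_prior",)


def missing_keys(dataset, required=REQUIRED_KEYS):
    """Return the required keys that are absent from a dict-like dataset."""
    return [key for key in required if key not in dataset]


def load_dataset(path):
    """Load a dataset dict from disk.

    Assumption: the file is a .npy file holding a pickled dict, so
    np.load returns a 0-d object array and [()] extracts the dict.
    """
    import numpy as np  # imported here so missing_keys works without NumPy

    return np.load(path, allow_pickle=True)[()]


if __name__ == "__main__":
    import sys

    # Usage: python check_dataset.py <path_to_dataset.npy>
    if len(sys.argv) == 2:
        absent = missing_keys(load_dataset(sys.argv[1]))
        print("Missing keys:", absent if absent else "none")
```

If "zernike_prior" is reported as missing, the KeyError in the traceback above is expected for that dataset.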

nadamoukaddem commented 4 months ago

The error I'm encountering is because when I pulled the latest changes you made, I didn't update the dataset in the data_config file.

Is there a way to determine the name of the output of WaveDiff before the execution so the plotting configuration runs automatically after the metrics?

jeipollack commented 4 months ago

Plotting can run after metrics evaluation for certain setups.


nadamoukaddem commented 4 months ago

These are the first plots for WaveDiff complete on the datasets with surface errors. These numbers are for one seed; I still need to average across all seeds.

[Plots: pixel_error, e1_error, e2_error, wfe_error]
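The averaging across seeds mentioned above could be sketched as follows. This is a minimal illustration: the function name and the input layout (one dict of metric values per seed) are assumptions, and the numbers in the usage lines are placeholders, not results.

```python
from statistics import mean


def average_over_seeds(metrics_per_seed):
    """Average each metric over seeds.

    metrics_per_seed: list of dicts mapping a metric name to its value,
    one dict per seed (hypothetical layout). Every dict must contain the
    metric names found in the first one.
    """
    if not metrics_per_seed:
        raise ValueError("need metrics for at least one seed")
    names = metrics_per_seed[0].keys()
    return {name: mean(run[name] for run in metrics_per_seed) for name in names}


# Placeholder values only, to show the call.
seed_runs = [
    {"pixel_error": 1.0, "wfe_error": 2.0},
    {"pixel_error": 3.0, "wfe_error": 4.0},
]
print(average_over_seeds(seed_runs))
```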

nadamoukaddem commented 4 months ago

I ran WaveDiff in non-parametric mode, and I encountered this error:

2024-04-18 20:13:17,702 - wavediff - INFO - #
2024-04-18 20:13:17,702 - wavediff - INFO - # Entering wavediff mainMethod()
2024-04-18 20:13:17,702 - wavediff - INFO - #
2024-04-18 20:13:17,703 - wf_psf.utils.read_config - INFO - Loading.../gpfswork/rech/ynx/uch76qv/tests/config_non_param/config_SFE_10_seed1/training_$
2024-04-18 20:13:17,741 - wf_psf.utils.read_config - INFO - Loading.../gpfswork/rech/ynx/uch76qv/tests/config_non_param/config_SFE_10_seed1/data_conf$
2024-04-18 20:13:22,134 - wavediff - INFO - <wf_psf.utils.configs_handler.TrainingConfigHandler object at 0x15454341b6a0>
2024-04-18 20:13:22,141 - wf_psf.data.training_preprocessing - INFO - Reading in Zernike prior...
2024-04-18 20:13:22,387 - wf_psf.training.train - INFO - PSF Model class: `physical_poly` initialized...
2024-04-18 20:13:22,387 - wf_psf.training.train - INFO - Preparing Keras model callback...
2024-04-18 20:13:22,387 - wf_psf.training.train - INFO - Preparing Keras model callback...
2024-04-18 20:13:22,388 - wf_psf.training.train - INFO - Starting cycle 1..
2024-04-18 20:13:27,060 - wf_psf.training.train_utils - INFO - Starting non-parametric update..
2024-04-18 22:40:41,717 - wf_psf.training.train - INFO - Cycle1 elapsed time: 8839.329240560532
2024-04-18 22:40:41,718 - wavediff - ERROR - Check your config file /gpfswork/rech/ynx/uch76qv/tests/config_non_param/config_SFE_20_seed1/configs.yam$
Traceback (most recent call last):
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/run.py", line 92, in mainMethod
    config_class.run()
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/utils/configs_handler.py", line 202, in run
    train.train(
  File "/gpfswork/rech/ynx/uch76qv/.local/lib/python3.10/site-packages/wf_psf/training/train.py", line 409, in train
    ] = hist_param.history
AttributeError: 'NoneType' object has no attribute 'history'
2024-04-18 22:40:41,718 - wavediff - INFO - #
2024-04-18 22:40:41,718 - wavediff - INFO - # Exiting wavediff mainMethod()
2024-04-18 22:40:41,719 - wavediff - INFO - #
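The traceback shows that hist_param is None at the point where its history attribute is read. One plausible reading, which is an assumption and not confirmed in this thread, is that no parametric training step runs in non-parametric mode, so no history object is created for it. A guard of the following kind illustrates the idea; the helper name is hypothetical and this is not the actual fix in wf-psf.

```python
from types import SimpleNamespace


def history_or_empty(fit_result):
    """Return fit_result.history, or an empty dict when no fit was run."""
    return {} if fit_result is None else fit_result.history


# SimpleNamespace stands in for the history object returned by a training call.
ran = SimpleNamespace(history={"loss": [0.5, 0.4]})
print(history_or_empty(ran))   # history of a step that was run
print(history_or_empty(None))  # step that was skipped
```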

CosmoStat / wf-psf

Validation of Physical Layer Model with simulations #133