Reconcile level 1 file between pipeline and Steve's results for 20220901

mgalloy commented 1 year ago

The file to check is 20220901.182014.02.ucomp.1074.l0.fts.

Do the run with intermediate products and interpolating flats.
Produce difference images.
Produce % difference images.

Verify my pipeline against the results from Steve at (in order):

[x] after sum (after gain)
[x] after demodulation
[x] after distortion
[x] after debanding
[x] final intensity
[x] final polarization
[x] final background

mgalloy commented 1 year ago

There is a significant difference under the occulter, under the post, and outside the field stop for the percentage difference image, whereas difference image shows mainly outside the field stop.

Things to check/update:

[x] RAW_EXTS FITS keyword is not reporting correctly.
[ ] There is a large difference in the first row of the gain corrected images. Is there a shift?
[ ] Are the flats averages over modulation state?
[x] Are the darks averaged over wavelength?
[x] Move the dark correction before the camera correction to correspond to Steve's pipeline.
[ ] Move averaging the data after the gain correction to correspond to Steve's pipeline.
[ ] Do the ROTATE and REVERSE operations match up correctly? i.e. is rotation done before a step in one pipeline, and after the step in the other?
[ ] Is the flat Steve's pipeline using the same as the one I am using?
[x] Try double precision.
[ ] Are there differences between number of negative pixels?
[x] Dark interpolation?
[x] Flat interpolation?

Here is the initial difference images after the gain correction steps. Difference:

ucomp-diff-aftergain

Percent difference:

ucomp-perdiff-aftergain

mgalloy commented 1 year ago

Swapping the apply gain and camera correction steps eliminates the "Steve high" section of the first row. There is still a "Mike high" portion on each side of the first row.

mgalloy commented 1 year ago

The ucomp_average_darkfile routine should get the darks averaged over wavelength, as long as gain mode and exposure time match. Then

dark_image = mean(dark_image, dimension=3)

in ucomp_make_darks averages over polarization state. So believe the darks are averaged as much as they can be. It should leave a fltarr(1280, 1024, 2) array.

mgalloy commented 1 year ago

Steve and I find difference numbers of negative pixels in the center wavelength of the test image. Pink=Mike has an unmatched negative pixel, Cyan=Steve has an unmatched negative pixel.

This doesn't match the pattern of the error because, for one, the lower left outside the field does not show much difference despite differences there in the final data.

mgalloy commented 1 year ago

Doing the entire processing in double precision doesn't seem to change things.

detoma commented 1 year ago

Thank you Mike. This confirms what we suspected. The differences we see are larger than the accuracy of the float vs. double. I wonder if there is a bug or a difference between the codes that we have not identified yet. Your approach to change one step at a time is the correct one.

bberkeyU commented 1 year ago

Mike, are you interpolating darks? I am not sure how to check this with the Boulder computers currently down. Based on the log files, your pipeline interpolates darks for both the science and flat data. And Steve uses the closest dark in time.

With the variation, we see dark to dark, I think this can explain what we see in the pipelines. Fairly similar results where there are photos and very different results where the detector noise dominates.

Mike's FITS header:

RAWFILE = '20220901.182014.02.ucomp.1074.l0.fts' / raw file                     
RAWEXTS = '2,8,14,20'          / extension(s) used from RAWFILE                 
RAWDARK1= '20220901.175207.43.ucomp.l0.fts' / raw dark filename used, wt 0.27   
DARKEXT1=                    2 / ext in 20220901.ucomp.dark.fts used, wt 0.27   
RAWDARK2= '20220901.183037.30.ucomp.l0.fts' / raw dark filename used, wt 0.73   
DARKEXT2=                    3 / ext in 20220901.ucomp.dark.fts used, wt 0.73   
FLTFILE1= '20220901.174648.65.ucomp.1074.l0.fts' / name of raw flat file used   
FLTEXTS1= '2,8,14,20'          / ext in 20220901.174648.65.ucomp.1074.l0.fts use
MFLTEXT1=                  -40 / ext in 20220901.ucomp.flat.1074.fts, wt 0.75   
FLTFILE2= '20220901.200225.22.ucomp.1074.l0.fts' / name of raw flat file used   
FLTEXTS2= '2,8,14,20'          / ext in 20220901.200225.22.ucomp.1074.l0.fts use
MFLTEXT2=                  -34 / ext in 20220901.ucomp.flat.1074.fts, wt 0.25

But based on Steves SVN log file:

Data File:  20220901.182014.02.ucomp.1074.l0.fts
Dark File for Data:  20220901.183037.30.ucomp.l0.fts
Flat File before:  20220901.174648.65.ucomp.1074.l0.fts
Dark File for Flat before:  20220901.175207.43.ucomp.l0.fts
Flat File after:  20220901.200225.22.ucomp.1074.l0.fts
Dark File for Flat after:  20220901.200756.88.ucomp.l0.fts
med_back:       11.7510

detoma commented 1 year ago

Thank you Ben! I thought the flats and darks were compared a while back, apparently not. MIke can you use the same dark Steve uses to test if it is the source of the difference?

My recollection is that for UCoMP we decided to interpolate between darks.

Steve: What shall we do?

bberkeyU commented 1 year ago

Mike should be able to re-run the day with interpolate darks turned off. This should give the same results Steve has and probally won't require monkeying with the pipeline to force specific flats and darks.

I believe we do want to interpolate flats and darks in the production pipeline, but if we are looking for pixel-to-pixel similarities between the math used in the pipeline we either need to teach Steves to interpolate darks or turn it off in Mikes pipeline.

mgalloy commented 1 year ago

Not interpolating darks seems worse. Difference:

Percent difference:

bberkeyU commented 1 year ago

Which darks and flats did the pipeline use?

mgalloy commented 1 year ago

Here is what is used for extension 2 [1074.70 rcam]:

RAWFILE = '20220901.182014.02.ucomp.1074.l0.fts' / raw file
RAWEXTS = '2,8,14,20'          / extension(s) used from RAWFILE
RAWDARK1= '20220901.175207.43.ucomp.l0.fts' / raw dark filename used, wt 1.00
DARKEXT1=                    2 / ext in 20220901.ucomp.dark.fts used, wt 1.00
FLTFILE1= '20220901.174648.65.ucomp.1074.l0.fts' / name of raw flat file used
FLTEXTS1= '2,8,14,20'          / ext in 20220901.174648.65.ucomp.1074.l0.fts use
MFLTEXT1=                  -40 / ext in 20220901.ucomp.flat.1074.fts, wt 0.75
FLTFILE2= '20220901.200225.22.ucomp.1074.l0.fts' / name of raw flat file used
FLTEXTS2= '2,8,14,20'          / ext in 20220901.200225.22.ucomp.1074.l0.fts use
MFLTEXT2=                  -34 / ext in 20220901.ucomp.flat.1074.fts, wt 0.25

And here is for extension 5 [1074.70 rcam]:

RAWFILE = '20220901.182014.02.ucomp.1074.l0.fts' / raw file
RAWEXTS = '5,11,17,23'         / extension(s) used from RAWFILE
RAWDARK1= '20220901.175207.43.ucomp.l0.fts' / raw dark filename used, wt 1.00
DARKEXT1=                    2 / ext in 20220901.ucomp.dark.fts used, wt 1.00
FLTFILE1= '20220901.174648.65.ucomp.1074.l0.fts' / name of raw flat file used
FLTEXTS1= '5,11,17,23'         / ext in 20220901.174648.65.ucomp.1074.l0.fts use
MFLTEXT1=                  -37 / ext in 20220901.ucomp.flat.1074.fts, wt 0.75
FLTFILE2= '20220901.200225.22.ucomp.1074.l0.fts' / name of raw flat file used
FLTEXTS2= '5,11,17,23'         / ext in 20220901.200225.22.ucomp.1074.l0.fts use
MFLTEXT2=                  -31 / ext in 20220901.ucomp.flat.1074.fts, wt 0.25

I don't know why I'm getting negative extensions into the master flat — I will check into that.

bberkeyU commented 1 year ago

Based on Steves log, he used a different data dark 20220901.183037.30.ucomp.l0.fts vs 20220901.175207.43.ucomp.l0.fts.

mgalloy commented 1 year ago

OK, that's another thing to look into. My dark is 28 minutes from the science image, whereas Steve's is only 10 minutes. It looks like my pipeline always uses a dark before the science time when not interpolating darks. Let me fix and re-run.

mgalloy commented 1 year ago

I was able to login from CG-1, but can't seem to now that I am home. I will do this as soon as I can access the servers again.

bberkeyU commented 1 year ago

I have also having intermitten issues logging in from off campus. I was able to access things 1 hour ago.

mgalloy commented 1 year ago

That seems like it! This is choosing the closer dark (10 minutes afterwards, instead of 28 minutes before). Difference:

Percent difference:

ucomp-perdiff-aftergain-fixed-dark-choice

Be careful when choosing your dark(s).

There are differences in a few pixels, probably the difference in hot pixel correction.

mgalloy commented 1 year ago

Of course, we are not out of the woods yet. After debanding, the difference is:

ucomp-diff-afterdeband-fixed-dark-choice

And percent difference:

ucomp-perdiff-afterdeband-fixed-dark-choice

mgalloy commented 1 year ago

Next step is after the demodulation. Difference is:

Percent difference is:

The percent difference looks pretty good, but there are definitely high values close to the occulter in my version.

bberkeyU commented 1 year ago

Picking on Tcam, modulation 1 pixel 512, 512. After the apply gain step, the two pipelines have a percent difference for the 6 file extensions of:

3.077500848820327e-07
1.329429919342374e-07
2.6630719094805833e-07
1.971450052874096e-07
3.411597538887815e-07
1.5889296997763506e-07

Then after the polarization step. Picking of Tcam, Stokes I, 512, 512 the percent differences increase to the following across the 6 extensions.

0.0003196100848531631
0.0008743953254036095
0.0019632026566753966
0.00024136517195470837
0.0011259853098148636
0.0010167200818203231

I don't yet know how to tell if this is just propagation of floating point errors or some other issue.

mgalloy commented 1 year ago

Between the gain correction and demodulation, in my pipeline:

continuum correction (currently empty)
demodulation

In Steve's pipeline:

demodulation

A couple of things to check:

[x] my Dmx_Temp_Coefs.sav has 13 wave regions and Steve's has 9, maybe the indexing is off?
[x] using the demodulation coefficients requires the TU_MOD temperature, maybe they are different? Steve's pipeline gets TU_MOD from the last extension header, which doesn't have TU_MOD — is he using 0.0?
[x] does Steve's data array have 6 dimensions and mine 5?
[x] is UCoMP_Get_Dmatrix really the same as what I am doing?
[x] try turning off quick demodulation
[x] is Steve actually using the 1083 demodulation coefficients every time because FILTER is a string representation of an integer and is compared to floating point values in UCoMP_Get_Dmatrix?

mgalloy commented 1 year ago

The demodulation coefficients stored in the file seem to be exactly equal, for 1074 at least:

IDL> mike_1074 = mike_dmx_coefs[*, *, -3, *]
IDL> steve_1074 = steve_dmx_coefs[*, *, -3, *]
IDL> print, array_equal(my_1074, steve_1074)
   1

mgalloy commented 1 year ago

@bberkeyU Did you say something about a log from Steve's code? It looks like Steve prints out the TU_MOD temperature he uses for demodulation (line 156 of UCoMP_Level1_Process.pro):

if debug eq 'yes' then print,'mod_temp: ',mod_temp

I would like to compare that value to what I use. We get a difference after demodulation and there isn't much to that.

bberkeyU commented 1 year ago

Based on the UCoMP_Level1_Process.pro attached to Steve's email with the various .sav files, I see that as line 161. I believe debug was off during the SVN committed run, so we dont see any of the debug messages but we do see things related to the flats and darks used. I think I pasted the only section that deals with 182014.fts out in the conservation above.

20220901_log.txt

bberkeyU commented 1 year ago

Do the two distortion surface fitting codes do the same thing? I can't really follow the math but it looks like the two pipelines setup eval_surface calls with different inputs. Also, Mike uses doubles whereas Steve use floats. I am guessing the call setup is related to how Mike vectorized the surface evaluation, and likely gives the same results.

ucomp_eval_surf(dx0_c, dindgen(nx), dindgen(ny))

vs

x = rebin(findgen(nx),nx,ny)   ;create equispaced grid of size nx by ny
y = transpose(rebin(findgen(ny),ny,nx))

dat0 = eval_surf(dx0_c,x,y)

detoma commented 1 year ago

For this I would use double precision but I do not think it would make a big difference from using floating point.

It seems the x- and y- arrays are one-dimensional in one case and two-dimensional in the other.

mgalloy commented 1 year ago

Using 0.0 for TU_MOD doesn't fix the difference:

And percent difference:

mgalloy commented 1 year ago

According to comments in FITS_READ:

If there is an INHERIT keyword set to 'T' (there isn't) and /NO_PDU is not present, then the primary header is combined with the extension header
versions before 2007 automatically used /PDU, which would combine the primary header with the extension header even if INHERIT is not set

I think Steve must be using a pre-2007 version of FITS_READ.

bberkeyU commented 1 year ago

What does the pipeline do with TU_MOD. I just assumed it tested in quality. But otherwise ignored.

bberkeyU commented 1 year ago

I see where TU_MOD is used now in ucomp_get_dmatrix.pro:

dmx = dmx_coefs[*, *, wave_region_index, 0] + mod_temp * dmx_coefs[*, *,wave_region_index, 1]

mgalloy commented 1 year ago

I get the same result using quick demodulation as the naive implementation.

mgalloy commented 1 year ago

Steve has a 6-dimensional array:

fltarr(nx, ny, n_pol_states, n_wavelengths, n_beams, n_cameras)

And I have a 5-dimensional array:

fltarr(nx, ny, n_pol_states, n_cameras, n_extensions)

I just have more extensions than he has wavelengths because there is an RCAM and TCAM extension for every wavelength in my array.

mgalloy commented 1 year ago

Steve uses a slightly different pure IDL code to perform the demodulation. I changed my naive implementation to his:

dc = transpose(data, [0, 1, 3, 4, 2])
non_pol_dims = product(dims[[0, 1, 3, 4]], /preserve_type)
dc = reform(dc, non_pol_dims, dims[2])   ; reform to n x n_pol_states array
dc = dmatrix ## dc[0L:non_pol_dims - 1L, *]
dc = reform(dc, dims[[0, 1, 3, 4, 2]])
data = transpose(dc, [0, 1, 4, 2, 3])

I get the same differences, though.

I should do some timings to compare the C implementation, to Steve's dimensional transforming method, to the naive implementation.

mgalloy commented 1 year ago

Found the difference in demodulation. Steve reads the FILTER keyword with (line 128 of UCoMP_Level1_Process):

region = sxpar(data_header,'FILTER')

This will yield a string representation of the integer wave region name, in our case '1074'. Then, UCoMP_Get_Dmatrix is called (line 163):

Dmx = UCoMP_Get_Dmatrix(Dmx_coefs,mod_temp,region)

In UCoMP_Get_Dmatrix, region is compared to the wave regions:

regs = [530.3,637.4,656.28,691.8,706.2,789.4,1074.7,1079.8,1083.0]
ind = where(regs eq region)

But, region will never equal any of those wave regions (except 1083). So ind will either be -1 or 8, which are equivalent when indexing into a dimension of size 9 — you will always get the demodulation coefficients for 1083.

If I change my pipeline to always use the demodulation coefficients for 1083, I get the following difference:

And the following percent difference:

mgalloy commented 1 year ago

The continuum subtraction comparison seems pretty good, but does have the hot pixel artifacts and some issues on the top and bottom edges. Difference:

And percent difference:

mgalloy commented 1 year ago

There is quite a difference in the final results. The steps that could introduce this are:

apply alignment
combine cameras
polarimetric correction

Steve's center wavelength intensity is much sharper than mine:

Mine:

The difference:

The percent difference:

mgalloy commented 1 year ago

Things to check/try:

[ ] try combining interpolations for centering and rotating image (this should be done anyway to limit the number of interpolations)
[ ] use Steve's center finding method (allow to choose by config file)
[x] check to make sure I find backgrounds for center finding correctly
[x] using ABS in UCOMP_RADIAL_DERIVATIVE in pt_weights[s] = abs(max(rad, imax))
[x] using 100 vs 360 scans in UCOMP_RADIAL_DERIVATIVE
[ ] my initial center guess is (nx - 1)/2 and (ny - 1)/2 and Steve's is nx/2 and ny/2
[ ] is there a difference in UCOMP_PARABOLA and PARABOLA? I remember there being an issue I had with PARABOLA.
[ ] Steve and I are off by close to 0.5 pixel in both cameras in both x and y. Is it a (x - 1)/2 vs x/2 issue?
[ ] I have a REVERSE(im, 2) in UCOMP_L1_APPLY_ALIGNMENT, maybe that should be done earlier?
[ ] on line 395 of UCoMP_Level1_Process.pro, Steve flips the background image upside down with the comment "sum Background (in Stokes I only) over wavelength", is this why my YOFFSET is the opposite sign as his?

mgalloy commented 1 year ago

I was just using the intensity background, and not averaging the background by polarization state.

mgalloy commented 1 year ago

I have agreement between Steve and my pipelines with the following caveats/changes:

I apply small offsets to make sure the difference is in the center finding and not later in the pipeline (RCAM [xoffset=-0.044500024, yoffset=-0.0112] and TCAM [xoffset=0.076490071, yoffset=0.06673])
using EPHEM2 for the p-angle (21.212 for SUN vs 21.224 for EPHEM2)
changed default radius to 330.0 (from 350.0) and dradius to 80.0 (from 40.0)
using center definition of nx/2 and ny/2 instead of (nx-1)/2 and (ny-1)/2
changed number of scans from 360 to 100 when finding radial derivative
not using ABS when finding maximum radial derivative
using Steve's method of performing centering and p-angle rotation in one interpolation (I should do this anyway), though there is a bug in Steve's code which I think is insignificant
center finding done in double precision in my code and float in Steve's code
chi squared uses a different definition and not directly comparable
I moved the location of my flip in the y-axis to the end of the distortion correction

The difference in the final result (center line intensity):

The percent difference:

Despite getting good agreement on the actual images, the reported center offsets are off. Mine are:

XOFFSET0=               -1.650 / [px] RCAM occulter x-offset
YOFFSET0=                1.370 / [px] RCAM occulter y-offset
RADIUS0 =              333.275 / [px] RCAM occulter radius
FITCHI0 =                0.373 / [px] chi-squared for RCAM center fit
XOFFSET1=               -2.601 / [px] TCAM occulter x-offset
YOFFSET1=                2.777 / [px] TCAM occulter y-offset
RADIUS1 =              333.355 / [px] TCAM occulter radius
FITCHI1 =                0.571 / [px] chi-squared for TCAM center fit

Steve's are:

 XOFFSET0=             -1.69450 / [px] Occulter X-Offset 0
 YOFFSET0=             -1.38120 / [px] Occulter Y-Offest 0
 ORADIUS0=              332.616 / [px] Occulter Radius 0
 FITCHI0 =            0.0166425 / [px] Chi-squared for image 0 center fit
 XOFFSET1=             -2.52451 / [px] Occulter X-Offset 1
 YOFFSET1=             -2.71027 / [px] Occulter Y-Offest 1
 ORADIUS1=              332.791 / [px] Occulter Radius 1
 FITCHI1 =            0.0173295 / [px] Chi-squared for image 1 center fit

There are several differences still:

For some reason, his reported YOFFSET{0,1} are the opposite sign as mine.
My radii are also about 0.5 pixel larger than Steve's.
My post finding is not working reliably.

mgalloy commented 1 year ago

With changes to both my and Steve's pipeline, we now agree again. Difference in center line intensity for test image at 20220901.182014:

And percent difference:

These are adjusted to make the center finding agree with:

rcam.occulter_center += [0.010, 0.001]
tcam.occulter_center += [0.014, -0.001]

mgalloy commented 1 year ago

The center line background agrees fairly well, though has a faint ring around the occulter. Difference:

And percent difference:

mgalloy commented 1 year ago

Polarization also agrees well. Center line Q percent difference:

Center line U percent difference:

Center line V percent difference:

Center line Q absolute difference:

Center line U absolute difference:

Center line V absolute difference:

NCAR / ucomp-pipeline

Reconcile level 1 file between pipeline and Steve's results for 20220901 #84