kevin218 / Eureka

Eureka! is a data reduction and analysis pipeline intended for time-series observations with JWST.
https://eurekadocs.readthedocs.io/
MIT License

[Bug]: WFC3 Exposure Counting #542

Closed jbrande closed 1 year ago

jbrande commented 1 year ago

Instrument

Other (any stage)

What happened?

For HST observations, the target direct images get counted as frames in S3 in the same way the spectroscopic images do, so one extra exposure is counted per orbit compared to the spectroscopic lightcurves. For multi-orbit visits, this means that using the meta.orbitnum array to operate on individual orbits within the dataset fails due to an array length mismatch. The direct images should probably be skipped when making the orbitnum array. The same fix is needed for the framenum and batchnum arrays.

The direct images are always the first exposure in an orbit, so maybe this is relatively simple.

Error traceback output

No response

What operating system are you using?

No response

What version of Python are you running?

No response

What Python packages do you have installed?

No response

taylorbell57 commented 1 year ago

It's been quite a while since I looked at this, but I think the direct images are supposed to be put in the cal folder.

https://eurekadocs.readthedocs.io/en/latest/ecf.html#topdir-inputdir-cal-dir

jbrande commented 1 year ago

Ah, I tried that earlier and it didn't work. In lib/util.py, when find_fits() gets called, it picks one of the directories to pull files from. find_fits() doesn't seem to have any WFC3-specific handling. Then, in readfiles(), meta.segment_list just gets populated with the files in whichever directory find_fits() picked, so the code that does have WFC3-specific handling to search the cal vs. sci subfolders never gets called. So that's probably the real bug, and I only avoided it by putting all the images in the same folder.

taylorbell57 commented 1 year ago

Aha, I probably changed the find_fits() function to recursively check subfolders since I last ran WFC3. That's likely the bug indeed

taylorbell57 commented 1 year ago

Hmm, looking back at my old WFC3 reduction from July 2022, it looks like I hadn't needed to separate the direct, wavelength calibration files from the science files. Not sure what's going on

taylorbell57 commented 1 year ago

I just re-ran those old analyses and I indeed don't need to separate the direct images. I don't get any crashes at least and the lightcurves I get are reasonable looking to me (these are public G102 observations of KELT-11b).

[Figures: fig4101_2D_LC, fig4102_ch0_1D_LC]

I'm on a semi-complicated mixture of the mad_axis branch and the dev/tjb branch, which fix several bugs (PRs submitted), but none of those fixes should have affected WFC3 analyses. For reference, here's my S3 ECF (as a .txt file since GitHub doesn't trust the .ecf file extension): S3_wfc3.txt

Can you clarify a bit more where you're seeing a bug/crash and copy-paste the traceback?

jbrande commented 1 year ago

Sorry, let me clarify. Not separating the direct images from the spectroscopic files will run all the way through to S6 with no crashes, as you've just confirmed. The problem is, when you do this, the direct images get counted as exposures so your meta.orbitnum and meta.scandir arrays don't end up being the same lengths.

If the intended behavior is for the spectra and direct images to be separated, then we have a problem where Eureka! can't properly parse the directory structure. If the intended behavior is for the spectra and direct images to be in the same folder, we have a different problem where Eureka! miscounts meta.orbitnum/framenum/batchnum.

There's no traceback for this, really.

taylorbell57 commented 1 year ago

The intended behaviour is indeed to keep the spectra and direct images in the same folder, since trying to separate them manually is a real pain.

Can you provide some concrete, numerical examples for the incorrect meta.orbitnum/framenum/batchnum values? I'm having trouble wrapping my head around what values are ending up wrong, and I'm having trouble determining what this might impact.

jbrande commented 1 year ago

meta.scandir looks something like np.array([0,1,0,1,...]), with one entry per spectral exposure. meta.orbitnum looks something like np.array([0,0,...1,1,...2,2,...]) for n orbits, where each orbit contributes n_spectral + 1 entries: one per spectral exposure taken in that orbit, plus one for the direct image taken at the beginning of that orbit.

For example, in the HD 209458 b transit from the HST demo, taken over 5 orbits, there are 215 spectral exposures (43 per orbit), and the meta.scandir array is of length 215 and can be used as-is to access data from individual scan directions (e.g. mask = meta.scandir == 0, mask = meta.scandir == 1, etc.).

[Figure: scan_color]

However, the meta.orbitnum array populated by Eureka! is of length 220 (44 per orbit: the 43 spectral exposures plus the direct image taken at the beginning of each orbit, 5 direct images in total, which are not useful in the analysis past S3 centroiding, IIRC), and can't be used as-is to split the data into individual orbits.

[Figure: orbit_color_improper]

You can see that, left as-is, the extra exposure per orbit propagates through, so the first four exposures in the last orbit are marked incorrectly. I clipped the last five values off the end of the array to make the dimensions match so I could make an orbit-level mask, e.g. mask = meta.orbitnum == 0, 1, 2, etc. (otherwise, the dimensions mismatch and you can't use it at all).
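To make the mismatch concrete, here is a minimal sketch that rebuilds the array shapes described above from synthetic values. Nothing here is real Eureka! output: the arrays are stand-ins built from the quoted counts (5 orbits, 43 spectral exposures each), and true_orbit is a hypothetical reference array introduced only for comparison.

```python
import numpy as np

n_orbits, n_spec = 5, 43  # counts quoted above for the HD 209458 b demo
n_total = n_orbits * n_spec

# Synthetic stand-ins for the arrays described above (not real Eureka! output).
scandir = np.arange(n_total) % 2                       # alternating 0, 1, 0, 1, ...
orbitnum = np.repeat(np.arange(n_orbits), n_spec + 1)  # one extra entry per orbit
true_orbit = np.repeat(np.arange(n_orbits), n_spec)    # hypothetical correct labels

print(scandir.size, orbitnum.size)  # 215 220

# Clipping the last five values makes the lengths match but shifts the labels:
# each orbit boundary arrives one exposure later than the one before it.
clipped = orbitnum[:n_total]
mislabelled = np.flatnonzero(clipped != true_orbit)

# Count how many exposures of the last orbit carry the wrong label.
last = n_orbits - 1
n_wrong_last_orbit = np.count_nonzero(clipped[true_orbit == last] != last)
print(n_wrong_last_orbit)  # 4
```

The offset grows by one exposure per orbit, which is why the error is most visible in the final orbit.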

Here's how it should look, after manually removing one value from each orbit of meta.orbitnum:

[Figure: orbit_color_proper]

Each orbit is now wholly its own color.
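The manual workaround of removing one value from each orbit can be sketched as below. This is not the patch that was eventually committed, which is not shown in this thread; drop_first_exposure_per_orbit is a hypothetical helper, and it relies on the assumption stated earlier that the direct image is always the first exposure in an orbit.

```python
import numpy as np


def drop_first_exposure_per_orbit(orbitnum):
    """Return orbitnum without the first entry of each orbit.

    Hypothetical helper: assumes the direct image is always the first
    exposure of an orbit and that orbit labels are contiguous runs.
    """
    orbitnum = np.asarray(orbitnum)
    # An entry starts a new orbit if it is the very first entry
    # or if its label differs from the previous entry's label.
    is_first = np.ones(orbitnum.size, dtype=bool)
    is_first[1:] = orbitnum[1:] != orbitnum[:-1]
    return orbitnum[~is_first]


# Synthetic example: 5 orbits with 44 entries each (43 spectra + 1 direct image).
orbitnum = np.repeat(np.arange(5), 44)
fixed = drop_first_exposure_per_orbit(orbitnum)
print(orbitnum.size, fixed.size)  # 220 215
```

The same boolean mask could in principle be applied to any other per-exposure array that still includes the direct images, so that all arrays stay aligned.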

Knowing which spectral exposures belong in each orbit and scan is important for calculating orbit- and scan-level systematics, such as the sinusoidal correction to the spatial scan offset I've been working on.
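As an illustration of why matching lengths matter, here is a rough sketch of combining orbit and scan-direction masks once the arrays line up. It is not the sinusoidal correction mentioned above; it only normalizes each orbit and scan-direction group by its own median. The flux values are synthetic, and orbitnum and scandir are stand-ins for the corrected meta.orbitnum and meta.scandir.

```python
import numpy as np

rng = np.random.default_rng(0)
n_orbits, n_spec = 5, 43
n_total = n_orbits * n_spec

# Synthetic stand-ins with matching lengths (not real Eureka! output).
orbitnum = np.repeat(np.arange(n_orbits), n_spec)
scandir = np.arange(n_total) % 2
flux = 1.0 + 1e-3 * rng.standard_normal(n_total)  # fake white light curve

# Normalize each (orbit, scan direction) group by its own median.
normalized = np.empty_like(flux)
for orbit in np.unique(orbitnum):
    for direction in np.unique(scandir):
        mask = (orbitnum == orbit) & (scandir == direction)
        normalized[mask] = flux[mask] / np.median(flux[mask])
```

The combined mask only works because both label arrays have one entry per spectral exposure; with the 220-element orbitnum, the element-wise & would raise a broadcasting error.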

If needed, we can talk more about this in the meeting Monday. Never mind, whenever we meet next.

taylorbell57 commented 1 year ago

Aha, okay thanks a lot for these - this helped a lot! Unless the bug is immediately obvious to me in the next ~30 minutes, I probably won't end up having time to fix this for a week or two. Let me know here if you beat me to it

jbrande commented 1 year ago

I have some ideas, we'll see if it happens over the weekend.

taylorbell57 commented 1 year ago

Pretty sure I just patched it with the commit referenced above, but please check that it actually works.

jbrande commented 1 year ago

Yes! That looks like it's fixed things!

[Figure: orbit_color_patched]

taylorbell57 commented 1 year ago

Wonderful! That patch will be merged soonish!