desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License
35 stars 24 forks source link

Promote dark errors #2162

Closed Waelthus closed 7 months ago

Waelthus commented 7 months ago

This mainly promotes the case of not finding a usable DARK in DESI_SPECTRO_DARK to a critical I/O error, thus crashing the code in that case. I guess we might want an option to use the previous DESI_SPECTRO_CALIB fallback instead when needed.

Waelthus commented 7 months ago

added additional arguments so that we could decide on this behaviour at runtime, currently does fall-back by default, but setting the additional argument during desi-preproc runs would make it fail instead. Not sure if it's easier to give the offline pipeline an additional arg for failing or nightwatch one for not-failing, but this could be easily adapted...

sbailey commented 7 months ago

@Waelthus apologies for the delay getting to this. Thanks for adding this as a configurable option.

@sybenzvi @schlafly @julienguy I suggest that we switch the default to fail if the $DESI_SPECTRO_DARK isn't found, and put the burden on Nightwatch to use the preproc(..., fail_on_dark_not_found=False) option. That would allow Nightwatch to be flexible about processing new data ASAP even if we don't have a new dark model, but it would require proactive human intervention to add a new dark to $DESI_SPECTRO_DARK after a CCD configuration change. If we really need daily ops processing ASAP before getting an updated dark, we can always update $DESI_SPECTRO_DARK with a previous dark, but that at least forces us to make a specific decision instead of defaulting to a potentially incorrect decision that we don't notice amidst N>>1 other log messages.

This would require updates to nightwatch.run.run_preproc (different PR), desispec.scripts.qproc (could be added to this PR), and possibly nightwatch.run.run_qproc depending upon what default we set in qproc.

Thoughts?

schlafly commented 7 months ago

That sounds like a good approach to me.

sybenzvi commented 7 months ago

I agree that specifying fail_on_dark_not_found=False is probably the best option in Nightwatch for now, as it's likely that we're going to have nights when $DESI_SPECTRO_DARK at KPNO is behind updates at KPNO. I'll make the PR in Nightwatch for preproc and will watch here for comments about qproc.

sbailey commented 7 months ago

@Waelthus please change the preproc and calibfinder default to fail if the dark is missing from $DESI_SPECTRO_DARK, and update desispec.qproc to override that default so that qproc won't fail if the dark is missing.

For the record, desihub/nightwatch#379 is the associated Nightwatch ticket, but we don't have PR there yet. We'll need to also have the nightwatch PR ready so that we can merge it at the same time as this PR.

sybenzvi commented 7 months ago

@Waelthus please change the preproc and calibfinder default to fail if the dark is missing from $DESI_SPECTRO_DARK, and update desispec.qproc to override that default so that qproc won't fail if the dark is missing.

For the record, desihub/nightwatch#379 is the associated Nightwatch ticket, but we don't have PR there yet. We'll need to also have the nightwatch PR ready so that we can merge it at the same time as this PR.

Actually, I think we don't need a Nightwatch PR after all. It seems run_preproc passes command line arguments to preproc rather than calling it internally (see lines 246-293 of run.py). Since--fail-on-dark-not-found must be specified to raise an exception in this update, not including the new argument means we default to the current non-failing behavior. Right?

Waelthus commented 7 months ago

Indeed this is the case things current behavior of this patch, but if I understood Stephen right, the plan would be to change the default to failing and then pass an arg to the nightwatch runs. The general alternative would be to pass the extra arg to the offline pipeline which I think would not require changes here or in nightwatch... I'm open to either approach and will anyway only be able to change the code tomorrow CET. If the consensus is to fail by default, I'd rename the flag to --fallback-on-darks-not-found (or any other suggestion that is made here) and invert behavior...

Am 24. Januar 2024 20:30:38 MEZ schrieb Segev BenZvi @.***>:

@Waelthus please change the preproc and calibfinder default to fail if the dark is missing from $DESI_SPECTRO_DARK, and update desispec.qproc to override that default so that qproc won't fail if the dark is missing.

For the record, desihub/nightwatch#379 is the associated Nightwatch ticket, but we don't have PR there yet. We'll need to also have the nightwatch PR ready so that we can merge it at the same time as this PR.

Actually, I think we don't need a Nightwatch PR after all. It seems run_preproc passes command line arguments to preproc rather than calling it internally (see lines 246-293 of run.py). Since--fail-on-dark-not-found must be specified to raise an exception in this update, not including the new argument means we default to the current non-failing behavior. Right?

-- Reply to this email directly or view it on GitHub: https://github.com/desihub/desispec/pull/2162#issuecomment-1908784487 You are receiving this because you were mentioned.

Message ID: @.***>

sybenzvi commented 7 months ago

Got it @Waelthus. OK, I'll follow your lead. If you change the default, I'll update Nightwatch tomorrow morning ET after you're done and then Stephen can merge the PR.

sbailey commented 7 months ago

Confirming, yes, I want to switch defaults so that Nightwatch is the less strict exception to the default, and the default for dailyops and reprocessing runs is the stricter "use a dark from DESI_SPECTRO_DARK or otherwise stop because you might be doing the wrong thing".

Waelthus commented 7 months ago

ok, added the arg to qproc and reversed the previous behaviour, failing by default and falling back if an arg --fallback-on-dark-not-found is set when running desi_preproc or desi_qproc.

sybenzvi commented 7 months ago

I just tested a branch of Nightwatch against the new options and it works, and I have a PR waiting to be merged. So you can merge this whenever you are ready.