innosat-mats / MATS-L1-processing

Python code for calibrating MATS images
MIT License

Processing fails because of wrong binning size #102

Closed skymandr closed 1 year ago

skymandr commented 1 year ago

🐛 Description

L1b-processing works now, at least in the operational sense: QA needs to be performed, but the pipeline works. A few files are unprocessable at the moment, for the following reason:

the size of the image and the binning size are incompatible

This is raised here: https://github.com/innosat-mats/MATS-L1-processing/blob/84d06517eab86ab84c1bcbe8b66d217793560641/src/mats_l1_processing/L1_calibration_functions.py#L625-L633

First, two comments on this:

Now the problem: The way processing is implemented, if an uncaught error occurs, the whole file is failed and no calibrated images will be stored. This is intentional, since we want to make sure that we don't store garbage, and we want to have the opportunity to catch and fix bugs. Failed files can be re-run once the issue is fixed. (Each file contains just 1 hour of data now, so hopefully this delay is acceptable.)

We (@innosat-mats/molflow) could make an exception here, and continue processing the file in the hope that some images are processable, but to me this looks like something that should be caught in the processing step, before it reaches our code, and an appropriate error-flag returned. My suggestion is therefore that you (@innosat-mats/scientific) implement error handling for this in the primary code-base.

🔱 To reproduce

Process any of these files:

ops-payload-level1a-v0.4/2022/12/22/19/MATS_OPS_Level0_VC1_APID100_20221222-133247_20221223-134152.parquet
ops-payload-level1a-v0.4/2022/12/23/1/MATS_OPS_Level0_VC1_APID100_20221222-133247_20221223-134152.parquet
ops-payload-level1a-v0.4/2022/12/23/2/MATS_OPS_Level0_VC1_APID100_20221222-133247_20221223-134152.parquet
ops-payload-level1a-v0.4/2023/1/3/4/MATS_OPS_Level0_VC1_APID100_20230102-150007_20230103-151028.parquet 
ops-payload-level1a-v0.4/2023/1/10/23/MATS_OPS_Level0_VC1_APID100_20230110-142705_20230111-081424.parquet
ops-payload-level1a-v0.4/2023/1/11/23/MATS_OPS_Level0_VC1_APID100_20230111-151822_20230112-082136.parquet
ops-payload-level1a-v0.4/2023/1/12/22/MATS_OPS_Level0_VC1_APID100_20230112-143446_20230113-145046.parquet
ops-payload-level1a-v0.4/2023/1/14/2/MATS_OPS_Level0_VC1_APID100_20230113-145046_20230114-132321.parquet
ops-payload-level1a-v0.4/2023/2/2/0/MATS_OPS_Level0_VC1_APID100_20230201-135801_20230202-141026.parquet 

✌ Expected behaviour

Error should be caught before reaching the batch-processor and appropriate action taken (e.g. return error-flag)
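As a minimal sketch of what "return error-flag" could look like, the check below reports a flag instead of raising. `CalibrationFlag` and `check_binning` are hypothetical names used for illustration only; they are not part of MATS-L1-processing, and the divisibility test merely mirrors the condition behind the error message quoted above.

```python
from enum import IntFlag


class CalibrationFlag(IntFlag):
    """Hypothetical error flags; not taken from the MATS code base."""
    OK = 0
    BINNING_MISMATCH = 1


def check_binning(shape, nbin_r, nbin_c):
    """Return a flag instead of raising when shape and bin sizes disagree."""
    rows, cols = shape
    # A non-positive bin size can never divide the image
    if nbin_r <= 0 or nbin_c <= 0:
        return CalibrationFlag.BINNING_MISMATCH
    # Flag the image if either dimension is not a whole number of bins
    if rows % nbin_r or cols % nbin_c:
        return CalibrationFlag.BINNING_MISMATCH
    return CalibrationFlag.OK


flag_ok = check_binning((162, 44), nbin_r=2, nbin_c=1)
flag_bad = check_binning((162, 44), nbin_r=2, nbin_c=40)
```

With a flag like this, the caller can store the image with its flag or drop it, without an exception ever reaching the batch processor.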

skymandr commented 1 year ago

@OleMartinChristensen Do you agree this is something that should be fixed on your end of the code?

OleMartinChristensen commented 1 year ago

Yes, this should only happen if the files are corrupt, and I don't think they are (as they would not pass through the L0 processing). So it's probably an index error somewhere. Let me check.

OleMartinChristensen commented 1 year ago

the problem is that temperature = NAN is not handled. I will fix this.

skymandr commented 1 year ago

the problem is that temperature = NAN is not handled. I will fix this.

Interesting thing to cause a binning problem! Not where I would have looked first – good catch!

skymandr commented 1 year ago

The problem persists for a number of files, see appended file:

binfail.txt

The problem, from our point of view, is that an Exception is explicitly thrown, but there is no code that catches it. Maybe you want to have a look at the files above, to see if you can fix the root cause, but I think the code should also handle this error.

If you like, I can refactor it a bit so that we can handle it (that is to say: skip offending images, but not the entire file) in the Lambda-code. Let me know what you think is appropriate.
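The "skip offending images, but not the entire file" option could look roughly like the sketch below. It assumes the binning check raises a dedicated exception type; `BinningError`, `calibrate_image` and `process_file` are hypothetical stand-ins, not the actual Lambda or calibration code.

```python
class BinningError(ValueError):
    """Hypothetical exception for incompatible image and binning sizes."""


def calibrate_image(image):
    """Stand-in for the real per-image calibration step."""
    rows, cols = image["shape"]
    if rows % image["nbin_r"] or cols % image["nbin_c"]:
        raise BinningError(
            "the size of the image and the binning size are incompatible"
        )
    return {"id": image["id"], "calibrated": True}


def process_file(images):
    """Calibrate every image, collecting known failures instead of failing the file."""
    calibrated, skipped = [], []
    for image in images:
        try:
            calibrated.append(calibrate_image(image))
        except BinningError as err:
            # Only the expected failure is skipped; any other error still propagates
            skipped.append((image["id"], str(err)))
    return calibrated, skipped


images = [
    {"id": 0, "shape": (162, 44), "nbin_r": 2, "nbin_c": 1},
    {"id": 1, "shape": (162, 44), "nbin_r": 2, "nbin_c": 40},
]
calibrated, skipped = process_file(images)
```

Catching only the specific exception keeps the original intent: unexpected bugs still fail the whole file, so no garbage is stored.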

donal-mur commented 1 year ago

I wonder whether this test is incorrectly formulated:

nchunks_r=int(image.shape[0])/nbin_r
nchunks_c=int(image.shape[1])/nbin_c

For binned and cropped images this does not have to hold.

{'EXP Date': Timestamp('2023-02-06 00:43:13.498855591+0000', tz='UTC'), 'File': 'MATS_OPS_Level0_VC1_APID100_20230205-125841_20230206-130603.rac', 'ProcessingTime': Timestamp('2023-02-06 13:59:31.928491590+0000', tz='UTC'), 'RamsesTime': Timestamp('2023-02-06 13:02:58.725000+0000', tz='UTC'), 'QualityIndicator': 0, 'LossFlag': 0, 'VCFrameCounter': 124, 'SPSequenceCount': 14913, 'TMHeaderTime': Timestamp('2023-02-06 00:43:18.979125977+0000', tz='UTC'), 'TMHeaderNanoseconds': 1359679416979125977, 'SID': '', 'RID': 'CCD5', 'CCDSEL': 5, 'EXP Nanoseconds': 1359679411498855591, 'WDW Mode': 'Automatic', 'WDW InputDataWindow': '11..0', 'WDWOV': 0, 'JPEGQ': 90, 'FRAME': 915, 'NROW': 162, 'NRBIN': 2, 'NRSKIP': 5, 'NCOL': 43, 'NCBIN FPGAColumns': 1, 'NCBIN CCDColumns': 40,

Image size is (162, 44). For rows this is fine: 162 is divisible by 2. For columns, however, 44 is not divisible by 40, which I assume is nbin_c.

NCOL is the binned image size (actually NCOL+1), i.e. 44 columns where 40 pixels have been binned before readout.

Nothing says there has to be an exact relationship between them. NCSKIP + (NCOL+1) * NCBIN CCDColumns should be < 2048, where the remainder is what is skipped at the upper end.

201 + 44 * 40 = 1961
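The arithmetic above can be checked with a short sketch. It compares the bound on the unbinned CCD columns with the divisibility test, using the header values from this comment. NCSKIP = 201 is read off the sum above, since the pasted header is cut off before that field; the function names are illustrative assumptions, not part of the code base.

```python
CCD_COLUMNS = 2048  # CCD width quoted in the comment above


def columns_read(ncskip, ncol, ncbin_ccd):
    """Unbinned CCD columns covered: NCSKIP + (NCOL + 1) * NCBIN CCDColumns."""
    return ncskip + (ncol + 1) * ncbin_ccd


def fits_on_ccd(ncskip, ncol, ncbin_ccd, width=CCD_COLUMNS):
    """True if the skipped plus binned columns stay within the CCD width."""
    return columns_read(ncskip, ncol, ncbin_ccd) < width


# Header values: NCOL = 43, NCBIN CCDColumns = 40; NCSKIP = 201 assumed from the sum
used = columns_read(ncskip=201, ncol=43, ncbin_ccd=40)
within_ccd = fits_on_ccd(ncskip=201, ncol=43, ncbin_ccd=40)

# The divisibility test applied to the already-binned image width
divisible = (43 + 1) % 40 == 0

print(used, within_ccd, divisible)  # 1961 True False
```

The readout geometry is valid (1961 < 2048) even though 44 is not a multiple of 40, which is the situation in which the divisibility test rejects a legitimate image.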