skymandr commented 1 year ago

🐛 Description

L1b-processing works now, at least in the operational sense: QA needs to be performed, but the pipeline works. A few files are unprocessable at the moment, for the following reason:

the size of the image and the binning size are incompatible

This is raised here: https://github.com/innosat-mats/MATS-L1-processing/blob/84d06517eab86ab84c1bcbe8b66d217793560641/src/mats_l1_processing/L1_calibration_functions.py#L625-L633

First, two comments on this:

It would be nice if the error messages were different, so that you can tell whether rows or cols are the problem.
Consider adding the sizes to the error message.
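To illustrate the two comments above, here is a minimal sketch of a check that reports rows and columns separately and includes the sizes. It is not the code at the linked lines: `BinningError`, `check_binning`, and the parameter names are hypothetical and only show the shape of the messages.

```python
class BinningError(ValueError):
    """Hypothetical exception type; not part of MATS-L1-processing."""


def check_binning(shape, nbin_r, nbin_c):
    """Raise a descriptive error if the image shape is not divisible by the bin sizes."""
    nrows, ncols = shape
    if nrows % nbin_r != 0:
        # Row-specific message, including both sizes
        raise BinningError(
            f"image rows ({nrows}) are not divisible by row bin size ({nbin_r})"
        )
    if ncols % nbin_c != 0:
        # Column-specific message, including both sizes
        raise BinningError(
            f"image columns ({ncols}) are not divisible by column bin size ({nbin_c})"
        )
```

With messages like these, the log alone would tell us which dimension failed and with which values.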

Now the problem: The way processing is implemented, if an uncaught error occurs, the whole file is failed and no calibrated images will be stored. This is intentional, since we want to make sure that we don't store garbage, and we want to have the opportunity to catch and fix bugs. Failed files can be re-run once the issue is fixed. (Each file contains just 1 hour of data now, so hopefully this delay is acceptable.)

We (@innosat-mats/molflow) could make an exception here, and continue processing the file in the hope that some images are processable, but to me this looks like something that should be caught in the processing step, before it reaches our code, and an appropriate error-flag returned. My suggestion is therefore that you (@innosat-mats/scientific) implement error handling for this in the primary code-base.

🔢 To reproduce

Process any of these files:

ops-payload-level1a-v0.4/2022/12/22/19/MATS_OPS_Level0_VC1_APID100_20221222-133247_20221223-134152.parquet
ops-payload-level1a-v0.4/2022/12/23/1/MATS_OPS_Level0_VC1_APID100_20221222-133247_20221223-134152.parquet
ops-payload-level1a-v0.4/2022/12/23/2/MATS_OPS_Level0_VC1_APID100_20221222-133247_20221223-134152.parquet
ops-payload-level1a-v0.4/2023/1/3/4/MATS_OPS_Level0_VC1_APID100_20230102-150007_20230103-151028.parquet 
ops-payload-level1a-v0.4/2023/1/10/23/MATS_OPS_Level0_VC1_APID100_20230110-142705_20230111-081424.parquet
ops-payload-level1a-v0.4/2023/1/11/23/MATS_OPS_Level0_VC1_APID100_20230111-151822_20230112-082136.parquet
ops-payload-level1a-v0.4/2023/1/12/22/MATS_OPS_Level0_VC1_APID100_20230112-143446_20230113-145046.parquet
ops-payload-level1a-v0.4/2023/1/14/2/MATS_OPS_Level0_VC1_APID100_20230113-145046_20230114-132321.parquet
ops-payload-level1a-v0.4/2023/2/2/0/MATS_OPS_Level0_VC1_APID100_20230201-135801_20230202-141026.parquet

✌🏽 Expected behaviour

Error should be caught before reaching the batch-processor and appropriate action taken (e.g. return error-flag)

skymandr commented 1 year ago

@OleMartinChristensen Do you agree this is something that should be fixed on your end of the code?

OleMartinChristensen commented 1 year ago

Yes, this should only happen if the files are corrupt, and I don't think they are (as they would not pass through the L0 processing). So it's probably an index error somewhere. Let me check.

OleMartinChristensen commented 1 year ago

The problem is that temperature = NaN is not handled. I will fix this.

skymandr commented 1 year ago

> The problem is that temperature = NaN is not handled. I will fix this.

Interesting thing to cause a binning problem! Not where I would have looked first – good catch!

skymandr commented 1 year ago

The problem persists for a number of files, see appended file:

binfail.txt

The problem, from our point of view, is that an Exception is explicitly thrown, but there is no code that catches it. Maybe you want to have a look at the files above, to see if you can fix the root cause, but I think the code should also handle this error.

If you like, I can refactor it a bit so that we can handle it (that is to say: skip offending images, but not the entire file) in the Lambda-code. Let me know what you think is appropriate.
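A rough sketch of what "skip offending images, but not the entire file" could look like is given here. The names `process_images` and `calibrate` are placeholders, not the actual Lambda code, and `ValueError` is only an assumption about which exception type the binning check raises.

```python
def process_images(images, calibrate):
    """Calibrate each image; collect failures instead of aborting the batch.

    `calibrate` is a stand-in for the real per-image L1b calibration call.
    """
    calibrated, failed = [], []
    for index, image in enumerate(images):
        try:
            calibrated.append(calibrate(image))
        except ValueError as err:  # assumed error type for the binning check
            # Record which image failed and why, then carry on with the rest
            failed.append((index, str(err)))
    return calibrated, failed
```

Only the one expected error type is caught, so any other uncaught error would still fail the whole file as intended.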

donal-mur commented 1 year ago

I wonder whether this test is incorrectly formulated:

nchunks_r=int(image.shape[0])/nbin_r
nchunks_c=int(image.shape[1])/nbin_c

For binned and cropped images this does not have to hold. Example record (output truncated):

{'EXP Date': Timestamp('2023-02-06 00:43:13.498855591+0000', tz='UTC'), 'File': 'MATS_OPS_Level0_VC1_APID100_20230205-125841_20230206-130603.rac', 'ProcessingTime': Timestamp('2023-02-06 13:59:31.928491590+0000', tz='UTC'), 'RamsesTime': Timestamp('2023-02-06 13:02:58.725000+0000', tz='UTC'), 'QualityIndicator': 0, 'LossFlag': 0, 'VCFrameCounter': 124, 'SPSequenceCount': 14913, 'TMHeaderTime': Timestamp('2023-02-06 00:43:18.979125977+0000', tz='UTC'), 'TMHeaderNanoseconds': 1359679416979125977, 'SID': '', 'RID': 'CCD5', 'CCDSEL': 5, 'EXP Nanoseconds': 1359679411498855591, 'WDW Mode': 'Automatic', 'WDW InputDataWindow': '11..0', 'WDWOV': 0, 'JPEGQ': 90, 'FRAME': 915, 'NROW': 162, 'NRBIN': 2, 'NRSKIP': 5, 'NCOL': 43, 'NCBIN FPGAColumns': 1, 'NCBIN CCDColumns': 40,

Image size is (162, 44). For rows this is OK: 162 is divisible by 2. For cols, however, 44 is not divisible by 40, which I assume is nbin_c.

NCOL is the binned image size (actually NCOL+1), i.e. 44 cols, where 40 pixels have been binned before readout.

Nothing says that there is an exact relationship between them. NCSKIP + (NCOL+1)*NCBIN CCDColumns should be < 2048, where the remainder is what is skipped at the upper end:

201 + 44 * 40 = 1961
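The arithmetic above can be checked with a short Python sketch. `columns_used` is a hypothetical helper; NCOL and the column bin size come from the record above, while the NCSKIP value of 201 is taken from the sum quoted above, since the record output is truncated before that field.

```python
CCD_COLUMNS = 2048  # upper bound quoted in the comment above


def columns_used(ncskip, ncol, ncbin_ccd):
    """Unbinned CCD columns covered: skipped columns plus the binned readout."""
    return ncskip + (ncol + 1) * ncbin_ccd


ncol, ncbin_ccd = 43, 40
used = columns_used(ncskip=201, ncol=ncol, ncbin_ccd=ncbin_ccd)
fits_on_ccd = used < CCD_COLUMNS               # the constraint that should hold
divisible = (ncol + 1) % ncbin_ccd == 0        # the condition the current test implies

print(used, fits_on_ccd, divisible)  # 1961 True False
```

So the readout fits on the CCD even though the binned width (44) is not a multiple of the bin size (40), which is why the divisibility test rejects a valid image.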

innosat-mats / MATS-L1-processing

Processing fails because of wrong binning size #102