COPRS / rs-issues

This repository contains all the issues of the COPRS project (Scrum tickets, ivv bugs, epics ...)

[BUG] [OPS] S3 OL1 NTC error /usr/local/components/S3IPF_OL1_06.13/bin/OL1.bin failed due to Frame fell outside current block range #1022

Closed suberti-ads closed 1 year ago

suberti-ads commented 1 year ago

Environment:
Delivery tag:
Platform: OPS Orange Cloud
Configuration:
processing common: 1.13-2-rc1
processing S3-OL1-NTC: 1.13-2-rc1

Traceability:

Current Behavior: Many errors found (14):

[code 290] [exitCode 134] [msg Task /usr/local/components/S3IPF_OL1_06.13/bin/OL1.bin failed]

Expected Behavior: S3-OL1-NTC execution completes successfully.

Steps To Reproduce: It seems (see below) that we encounter this issue when a product has been generated twice. So, to reproduce it, we should start OL1 NRT production twice.

Test execution artefacts (i.e. logs, screenshots…):
Complete log for the JobOrder.98204.xml execution: Explore-logs-2023-06-22 17_27_05.txt

Whenever possible, first analysis of the root cause:

Sample inputs for the failing job orders:

JobOrder.98183.xml ==> S3A_OL_0_EFR____20230603T215910_20230603T224335_20230605T063932_2665_099_286______LN3_O_NR_002.SEN3 - S3A_OL_0_EFR____20230603T215910_20230603T224335_20230604T012546_2665_099_286______LN3_O_NR_002.SEN3
JobOrder.98189.xml ==> S3A_OL_0_EFR____20230507T201951_20230507T210413_20230512T153916_2662_098_285______LN3_O_NR_002.SEN3 - S3A_OL_0_EFR____20230507T201951_20230507T210413_20230515T155926_2662_098_285______LN3_O_NR_002.SEN3
JobOrder.98191.xml ==> S3A_OL_0_EFR____20230514T185701_20230514T194124_20230515T154950_2663_098_384______LN3_O_NR_002.SEN3 - S3A_OL_0_EFR____20230514T185701_20230514T194124_20230514T215511_2663_098_384______LN3_O_NR_002.SEN3
JobOrder.98192.xml ==> S3A_OL_0_EFR____20230514T203759_20230514T212223_20230515T160027_2664_098_385______LN3_O_NR_002.SEN3 - S3A_OL_0_EFR____20230514T203759_20230514T212223_20230514T233235_2664_098_385______LN3_O_NR_002.SEN3

We saw two inputs for each failed execution. It seems we should update the tasktable configuration to ingest the latest product.
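The duplicate pattern in the inputs listed above can be checked mechanically. The following is a minimal sketch, assuming the usual Sentinel-3 product name layout (mission, 11-character product type, then sensing start, sensing stop and creation date). The helpers `parse_name` and `find_duplicates` are hypothetical and not part of COPRS.

```python
import re
from collections import defaultdict

# Assumed layout: MMM_<11-char type>_<start>_<stop>_<creation>_...
NAME_RE = re.compile(
    r"^(?P<mission>S3[AB_])_(?P<type>.{11})_"
    r"(?P<start>\d{8}T\d{6})_(?P<stop>\d{8}T\d{6})_(?P<created>\d{8}T\d{6})_"
)

def parse_name(name):
    """Return the name fields as a dict, or raise ValueError if the layout differs."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected product name: {name}")
    return m.groupdict()

def find_duplicates(names):
    """Group names by (mission, type, start, stop); keep groups with more than one entry."""
    groups = defaultdict(list)
    for name in names:
        f = parse_name(name)
        groups[(f["mission"], f["type"], f["start"], f["stop"])].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}

# The two inputs reported for JobOrder.98183.xml
inputs = [
    "S3A_OL_0_EFR____20230603T215910_20230603T224335_20230605T063932_2665_099_286______LN3_O_NR_002.SEN3",
    "S3A_OL_0_EFR____20230603T215910_20230603T224335_20230604T012546_2665_099_286______LN3_O_NR_002.SEN3",
]
duplicates = find_duplicates(inputs)
for key, members in duplicates.items():
    print(key, "->", len(members), "products with the same sensing interval")
```

Both names share the same sensing start and stop and differ only in the creation date, which is consistent with the product having been generated twice.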


LAQU156 commented 1 year ago

IVV_CCB_2023_w27 : Moved into "Accepted Werum" to update the configuration. @suberti-ads, please add the parameter to update. Priority: minor, to be fixed in phase 1.

suberti-ads commented 1 year ago

The proposed fix is to update the tasktable so that it retrieves the latest product.

Current configuration of the OL_0_EFR input in TaskTable_S3A_OL1_06_13.xml and TaskTable_S3B_OL1_06_13.xml:

                        <List_of_Inputs count="16">
                            <Input>
                                <Mode>ALWAYS</Mode>
                                <Mandatory>Yes</Mandatory>
                                <List_of_Alternatives count="1">
                                    <!-- OLCI L0 EFR product -->
                                    <Alternative>
                                        <Order>1</Order>
                                        <Origin>DB</Origin>
                                        <Retrieval_Mode>ValIntersect</Retrieval_Mode>
                                        <T0>8.8</T0>
                                        <T1>8.8</T1>
                                        <File_Type>OL_0_EFR___</File_Type>
                                        <File_Name_Type>Physical</File_Name_Type>
                                    </Alternative>
                                </List_of_Alternatives>
                            </Input>

The workaround is to replace ValIntersect with LatestValIntersect:

                        <List_of_Inputs count="16">
                            <Input>
                                <Mode>ALWAYS</Mode>
                                <Mandatory>Yes</Mandatory>
                                <List_of_Alternatives count="1">
                                    <!-- OLCI L0 EFR product -->
                                    <Alternative>
                                        <Order>1</Order>
                                        <Origin>DB</Origin>
                                        <Retrieval_Mode>LatestValIntersect</Retrieval_Mode>
                                        <T0>8.8</T0>
                                        <T1>8.8</T1>
                                        <File_Type>OL_0_EFR___</File_Type>
                                        <File_Name_Type>Physical</File_Name_Type>
                                    </Alternative>
                                </List_of_Alternatives>
                            </Input>

This workaround will be tested next week.
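To illustrate the intended effect of the change, here is a minimal sketch of the two retrieval modes. It assumes that `ValIntersect` returns every product whose validity interval intersects the query window widened by `T0`/`T1` (taken here to be seconds), and that `LatestValIntersect` returns only the most recently created product among those. The authoritative semantics are defined by the IPF interface specification and should be checked there. The `Product` type and both functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Product:  # hypothetical record, not a COPRS type
    start: datetime
    stop: datetime
    created: datetime

def val_intersect(products, t0, t1, dt0=8.8, dt1=8.8):
    """Assumed ValIntersect: all products intersecting [t0 - dt0, t1 + dt1]."""
    lo = t0 - timedelta(seconds=dt0)
    hi = t1 + timedelta(seconds=dt1)
    return [p for p in products if p.start <= hi and p.stop >= lo]

def latest_val_intersect(products, t0, t1, dt0=8.8, dt1=8.8):
    """Assumed LatestValIntersect: only the most recently created intersecting product."""
    hits = val_intersect(products, t0, t1, dt0, dt1)
    return [max(hits, key=lambda p: p.created)] if hits else []

# The two OL_0_EFR products reported for JobOrder.98183.xml
a = Product(datetime(2023, 6, 3, 21, 59, 10), datetime(2023, 6, 3, 22, 43, 35),
            datetime(2023, 6, 5, 6, 39, 32))
b = Product(datetime(2023, 6, 3, 21, 59, 10), datetime(2023, 6, 3, 22, 43, 35),
            datetime(2023, 6, 4, 1, 25, 46))

all_hits = val_intersect([a, b], a.start, a.stop)
latest_hit = latest_val_intersect([a, b], a.start, a.stop)
print(len(all_hits), len(latest_hit))
```

Under these assumptions the first mode hands both copies of the product to the processor, while the second hands over a single one.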

suberti-ads commented 1 year ago

https://github.com/COPRS/rs-config/pull/285 created

LAQU156 commented 1 year ago

Werum_CCB_2023_w27 : Waiting for OPS test feedback; stays in "New issues" and will be reviewed during the next CCB.

w-jka commented 1 year ago

@suberti-ads I don't think we should alter the selection policies of the tasktables in the default configuration. The modifications we have made to the tasktables so far were due to different XML flavours between the missions or to the change of behaviour regarding parallel execution in the SL chain.

If the workaround works, I would advise keeping it in the ops config but not transferring it into the default configuration. Changes to the selection policies should normally be performed by the IPF provider, as they are the most familiar with the internal functions of the processor.

LAQU156 commented 1 year ago

Werum_CCB_2023_w28 : Moved into "Product Backlog" to see whether a fix can be provided without changing the tasktable.

w-jka commented 1 year ago

@suberti-ads A change of the tasktable selection policy would result in a problem, as the interval defined by the tasktable might not be covered by only one product.
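A small sketch of this concern: if the requested interval spans two consecutive products, a policy that returns a single product leaves part of the interval uncovered. The `covers` helper and the numeric intervals (seconds on an arbitrary axis) are made up for illustration only.

```python
def covers(intervals, t0, t1):
    """True if the union of (start, stop) intervals covers [t0, t1] without gaps."""
    reached = t0
    for start, stop in sorted(intervals):
        if start > reached:
            return False  # gap before this interval begins
        reached = max(reached, stop)
        if reached >= t1:
            return True
    return reached >= t1

# Two consecutive products and a request that straddles their boundary
both = [(0, 60), (60, 120)]
only_latest = [(60, 120)]

print(covers(both, 30, 90))         # all intersecting products are selected
print(covers(only_latest, 30, 90))  # a single product is selected
```

With both products the request is fully covered; with only one of them the first half of the request is missing.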

One way I could see to help with this issue is to update the configuration of the ol1-ntc chain as follows:

Before:

app.preparation-worker.s3-type-adapter.mpc-search.S3A_OL1.product-types=TM_0_NAT___
app.preparation-worker.s3-type-adapter.mpc-search.S3A_OL1.gap-threshold=3.0
app.preparation-worker.s3-type-adapter.mpc-search.S3B_OL1.product-types=TM_0_NAT___
app.preparation-worker.s3-type-adapter.mpc-search.S3B_OL1.gap-threshold=3.0
app.housekeep.s3-type-adapter.mpc-search.S3A_OL1.product-types=TM_0_NAT___
app.housekeep.s3-type-adapter.mpc-search.S3A_OL1.gap-threshold=3.0
app.housekeep.s3-type-adapter.mpc-search.S3B_OL1.product-types=TM_0_NAT___
app.housekeep.s3-type-adapter.mpc-search.S3B_OL1.gap-threshold=3.0

After:

app.preparation-worker.s3-type-adapter.mpc-search.S3A_OL1.product-types=OL_0_EFR___
app.preparation-worker.s3-type-adapter.mpc-search.S3A_OL1.gap-threshold=3.0
app.preparation-worker.s3-type-adapter.mpc-search.S3B_OL1.product-types=OL_0_EFR___
app.preparation-worker.s3-type-adapter.mpc-search.S3B_OL1.gap-threshold=3.0
app.housekeep.s3-type-adapter.mpc-search.S3A_OL1.product-types=OL_0_EFR___
app.housekeep.s3-type-adapter.mpc-search.S3A_OL1.gap-threshold=3.0
app.housekeep.s3-type-adapter.mpc-search.S3B_OL1.product-types=OL_0_EFR___
app.housekeep.s3-type-adapter.mpc-search.S3B_OL1.gap-threshold=3.0

The mpc-search has an integrated filter that eliminates duplicated time intervals from the input list. You could test this configuration with the failed job to see whether it fixes the issue.
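For illustration only, a filter of this kind could behave roughly as sketched here. This is not the actual mpc-search implementation: the function `drop_duplicate_intervals` is hypothetical, and the choice to keep the most recently created product per interval is an assumption.

```python
def drop_duplicate_intervals(products):
    """Keep one (start, stop, created) entry per (start, stop) interval.

    Assumption: the entry with the latest creation date wins. Timestamps in
    the yyyymmddThhmmss format compare correctly as plain strings.
    """
    best = {}
    for start, stop, created in products:
        key = (start, stop)
        if key not in best or created > best[key]:
            best[key] = created
    return sorted((s, e, c) for (s, e), c in best.items())

# Sensing start, sensing stop and creation date taken from the inputs of
# JobOrder.98189.xml and JobOrder.98191.xml
products = [
    ("20230507T201951", "20230507T210413", "20230512T153916"),
    ("20230507T201951", "20230507T210413", "20230515T155926"),
    ("20230514T185701", "20230514T194124", "20230515T154950"),
    ("20230514T185701", "20230514T194124", "20230514T215511"),
]
filtered = drop_duplicate_intervals(products)
for entry in filtered:
    print(entry)
```

After filtering, each sensing interval appears once, so the processor would no longer receive two copies of the same frame range.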

pcuq-ads commented 1 year ago

SYS_CCB_w29 : Pull request https://github.com/COPRS/rs-config/pull/291 has been created to test the proposed configuration in OPS.

vgava-ads commented 1 year ago

SYS_CCB_2023_w30 : Epic of COPRS Phase I (RS SW V2.0). Closed.