ACCESS-NRI / accessdev-Trac-archive

Archive accessdev Trac contents as issues
Apache License 2.0

Integer overflow in ODB version 1.0.0 (Met Office) #289

Open penguian opened 8 years ago

penguian commented 8 years ago

keyword_ODB_ODB_API_integer_overflow_odb_migrator | by jtl548@nci.org.au


The version of ODB software we are using for APS3 global is ODB 1.0.0, which we received from UKMO (see #217 for its import into access-svn repos and #215 for its build). This version has a bug that causes certain integer variables used in calculating memory allocation to overflow. The problem was discovered while running odb_migrator. Other ODB and ODB API tools may suffer from the same bug.
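As a rough illustration of this class of bug (not the actual ODB code), the sketch below shows how a byte count computed in a signed 32-bit integer can wrap to a negative value. The 32-bit width, the helper names `wrap_int32` and `allocation_bytes`, and the row and column counts are all assumptions made for the example; the ticket does not say which variables or sizes are involved.

```python
def wrap_int32(value: int) -> int:
    """Return the value a signed 32-bit integer would hold after wraparound."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - 0x100000000 if value >= 0x80000000 else value


def allocation_bytes(n_rows: int, n_cols: int, bytes_per_value: int = 8):
    """Return (exact, wrapped) byte counts for a hypothetical n_rows x n_cols table."""
    exact = n_rows * n_cols * bytes_per_value  # Python ints never overflow
    return exact, wrap_int32(exact)            # what 32-bit arithmetic would store


# Hypothetical pool: 40 million rows of 10 eight-byte columns.
exact, wrapped = allocation_bytes(40_000_000, 10)
print(exact)    # 3200000000
print(wrapped)  # -1094967296
```

A negative or truncated size passed to an allocator would typically either fail outright or reserve too little memory, which is consistent with the failure only showing up when a pool is large.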

The ODB software we use comes from UKMO (who received it from ECMWF). We decided not to use a new build with the bug fixed, because maintaining compatibility with UKMO software would be harder with a modified version of ODB. Instead, we put in a workaround at the suite level. However, we anticipate corresponding with UKMO and ECMWF to get this bug fix included in their ODB software.


Issue migrated from trac:289 at 2024-01-31 18:26:03 +1100

penguian commented 8 years ago

@jin.lee@bom.gov.au commented


I put the bug fix in this branch: https://access-svn.nci.org.au/trac/odb/browser/branches/dev/jtl548/odb/r116_Odb-1.0.0-Source-meto_nci_integer_size

The code change required to fix the bug is in [119:121].

penguian commented 8 years ago

@jin.lee@bom.gov.au commented


Hi Milton,

Would you be able to review the ticket, please? Once that's done I'll close the ticket. Thanks.

Jin

penguian commented 8 years ago

@jin.lee@bom.gov.au commented


I used the following steps to build the new ODB, which includes the integer-overflow fix, and to test the build:

  1. I built the new ODB following the instructions in https://accessdev.nci.org.au/trac/ticket/215
  2. I also built a new ODB API following the instructions in https://accessdev.nci.org.au/trac/ticket/216
  3. Set-up for testing the new build:
    • I tried to reproduce the way the PS38 global suite runs the ODB-to-ODB2 task by creating a test directory, raijin3:/home/548/jtl548/da/ops/odb/test_odb_migrator/data/glu_ops_odb_to_odb2_atms.16_pools/atms.out.odb.5.0
    • I used the test script, raijin3:/home/548/jtl548/da/ops/odb/test_odb_migrator/scripts/test_odb_migrator.bash
  4. For the ATMS obsgroup split into 16 pools, each pool is still too big, so the integer overflow occurs when running odb_migrator with the standard build. To run this case,
    • Under the test directory mentioned above, I soft-linked the input ODB database, ECMA.atms.out.odb.5.0, to /home/548/jtl548/da/ops/odb/test_odb_migrator/data/glu_odb/atms/16_pools
    • I modified the script test_odb_migrator.bash so that it used the standard ODB and ODB API: module load odb/1.0.0 and module load odbapi/0.10.3
    • I made sure that the script failed
  5. To run the new build on this 16-pool ODB database,
    • I modified the script test_odb_migrator.bash so that it used the new ODB and ODB API builds, i.e. module unload odb and module unload odbapi. I then set up the ODB and ODB API environments by explicitly exporting the necessary environment variables. See raijin:~jtl548/odb/scripts/odb.ksh and raijin:~jtl548/odbapi/scripts/odbapi.ksh for examples of which environment variables need to be exported
    • The script succeeded and odb_migrator did not encounter the integer overflow problem
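
The fail-then-pass check in steps 4 and 5 could be recorded with a small wrapper along the following lines. This is only a sketch: `run_case` is a hypothetical helper, and the two commands are stand-ins that simply exit with a non-zero or zero status. In a real run they would be replaced by invocations of test_odb_migrator.bash configured for the standard and the new builds respectively.

```python
import subprocess
import sys


def run_case(label, command, expect_success):
    """Run one command and report whether its exit status matched the expectation."""
    result = subprocess.run(command, capture_output=True, text=True)
    succeeded = result.returncode == 0
    return {
        "label": label,
        "succeeded": succeeded,
        "as_expected": succeeded == expect_success,
    }


# Stand-in commands (assumptions): one exits with status 1, one with status 0.
failing = [sys.executable, "-c", "raise SystemExit(1)"]
passing = [sys.executable, "-c", "pass"]

cases = [
    run_case("standard build (odb/1.0.0, odbapi/0.10.3)", failing, expect_success=False),
    run_case("new build with integer-size fix", passing, expect_success=True),
]

for case in cases:
    print(case)
```

Treating the failure of the standard build as an expected outcome makes it explicit that the 16-pool ATMS case reproduces the bug before the new build is credited with fixing it.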