Unidata / netcdf-java

The Unidata netcdf-java library
https://docs.unidata.ucar.edu/netcdf-java/current/userguide/index.html
BSD 3-Clause "New" or "Revised" License
144 stars 70 forks source link

Multithreaded access with both file cache and dataset cache fails #1331

Closed aaime closed 5 months ago

aaime commented 5 months ago

Versions impacted by the bug

5.5.3

What went wrong?

I have some server side code that opens NetCDF files to build maps and the like. In this code I have been experimenting using both the NetCDF dataset cache, and the random access file cache. Each thread is using its own local dataset, while normally read work fine, applying enough pressure, that is, enough parallel threads, makes the code throw errors.

I have replicated reading a large NetCDF file, with many variables, using this stand-alone class:

package testPackage;

import java.io.File;
import java.sql.Date;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

import com.google.common.collect.ImmutableList;
import ucar.ma2.Array;
import ucar.ma2.Range;
import ucar.nc2.Dimension;
import ucar.nc2.Variable;
import ucar.nc2.dataset.DatasetUrl;
import ucar.nc2.dataset.NetcdfDataset;
import ucar.nc2.dataset.NetcdfDatasets;
import ucar.nc2.units.DateRange;
import ucar.nc2.util.cache.FileCache;
import ucar.unidata.io.RandomAccessFile;

public class MultithreadedTestReader {

  static final int LOOPS = 2000;

  static class VariableInfo {
    String name;
    ImmutableList<Range> ranges;
    ImmutableList<Dimension> dimensions;

    public VariableInfo(Variable v) {
      this.name = v.getFullName();
      this.ranges = v.getRanges();
      this.dimensions = v.getDimensions();
    }
  }

  public static void main(String[] args) throws Exception {
    File file =
        new File(
                "/home/aaime/devel/gisData/radarmeteo/HSF_ITA_DATA/HFS_ITA_4KM_1H_288H_2022041200.nc")
            .getCanonicalFile();

    // comment out this line and it won't fail
    NetcdfDatasets.initNetcdfFileCache(200, 300, 12 * 60);
    RandomAccessFile.enableDefaultGlobalFileCache();
    DatasetUrl datasetUrl = DatasetUrl.findDatasetUrl(file.getAbsolutePath());
    List<VariableInfo> variables;
    try (NetcdfDataset dataset = NetcdfDatasets.acquireDataset(datasetUrl, null)) {
      System.out.println(dataset);
      variables =
          dataset.getVariables().stream()
              .filter(v -> !v.isCoordinateVariable())
              .map(v -> new VariableInfo(v))
              .collect(Collectors.toList());
    }

    ExecutorService executor = Executors.newFixedThreadPool(16);
    Random random = new Random(0);
    for (int i = 0; i < LOOPS; i++) {
      for (VariableInfo vi : variables) {
        executor.submit(
            () -> {
              try (NetcdfDataset dataset = NetcdfDatasets.acquireDataset(datasetUrl, null)) {

                List<Range> readRanges = new ArrayList<>();
                for (int d = 0; d < vi.dimensions.size(); d++) {
                  Range range = vi.ranges.get(d);
                  int first = range.first();
                  int last = range.last();
                  int min = (int) (first + random.nextFloat() * (last - first));
                  int max = (int) (min + random.nextFloat() * (last - min));
                  readRanges.add(new Range(vi.dimensions.get(d).getName(), min, max));
                }

                Variable variable = dataset.findVariable(vi.name);
                Array data = variable.read(readRanges);
                System.out.println("Read " + vi.name + " " + readRanges);
              } catch (Exception e) {
                e.printStackTrace();
              }
            });
      }
    }
    executor.awaitTermination(1, java.util.concurrent.TimeUnit.HOURS);
    RandomAccessFile.shutdown();
    FileCache.shutdown();
  }
}

While initially it works, after a little while on my 8 core machine (16 threads) it starts throwing a variety of exceptions. I have verified that a simple cure is to disable the dataset cache.

Example output

> Task :cdm:cdm-core:MultithreadedTestReader.main()
netcdf /media/backup/Backup/radarmeteo/HSF_ITA_DATA/HFS_ITA_4KM_1H_288H_2022041200.nc {
  dimensions:
    latitude = 321;
    longitude = 433;
    time = UNLIMITED;   // (289 currently)
  variables:
    float PREC_HOURLY(time=289, latitude=321, longitude=433);
      :units = "mm";
      :description = "Hourly precipitation";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SNOW_HOURLY(time=289, latitude=321, longitude=433);
      :units = "mm";
      :description = "Hourly snowfall";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float PREC_PROB_HOURLY(time=289, latitude=321, longitude=433);
      :units = "%";
      :description = "Hourly precipitation probability";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SNOW_PROB_HOURLY(time=289, latitude=321, longitude=433);
      :units = "%";
      :description = "Hourly snowfall probability";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float TC2M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "C";
      :description = "Hourly temperature at 2m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float TDC2M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "C";
      :description = "Hourly dewpoint temperature at 2m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float RH2M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "%";
      :description = "Hourly relative humidity at 2m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float QV2M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "kgkg-1";
      :description = "Hourly water vapor mixing ratio";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float PREC_TYPE_HOURLY(time=289, latitude=321, longitude=433);
      :units = "0=NoPrec 1=SN 2=WS 3=RS 4=RA 5=FR";
      :description = "Hourly precipitation type";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float WSPD10M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "ms-1";
      :description = "Hourly windspeed at 10m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float WDIR10M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "degrees";
      :description = "Hourly wind direction at 10m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float WSPDX10M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "ms-1";
      :description = "Hourly maximum windspeed at 10m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float U10M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "ms-1";
      :description = "Hourly Earth-relative rotated wind u-component at 10m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float V10M_HOURLY(time=289, latitude=321, longitude=433);
      :units = "ms-1";
      :description = "Hourly Earth-relative rotated wind v-component at 10m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float CLCOVER_HOURLY(time=289, latitude=321, longitude=433);
      :units = "0…1";
      :description = "Hourly cloud cover";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SSWDIR_HOURLY(time=289, latitude=321, longitude=433);
      :units = "Wm-2";
      :description = "Hourly direct short-wave solar radiation at surface";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SSWDIF_HOURLY(time=289, latitude=321, longitude=433);
      :units = "Wm-2";
      :description = "Hourly diffuse short-wave solar radiation at surface";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SSWTOT_HOURLY(time=289, latitude=321, longitude=433);
      :units = "Wm-2";
      :description = "Hourly global short-wave solar radiation at surface";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SMOIS_HOURLY(time=289, latitude=321, longitude=433);
      :units = "m3m-3";
      :description = "Hourly soil moisture";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SKINTEMP_HOURLY(time=289, latitude=321, longitude=433);
      :units = "C";
      :description = "Hourly skin temperature";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float LEAFWET_HOURLY(time=289, latitude=321, longitude=433);
      :units = "0-1";
      :description = "Hourly leaf wetness";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float MUCAPE_HOURLY(time=289, latitude=321, longitude=433);
      :units = "JKg-1";
      :description = "Hourly most-unstable cape";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SHP_HOURLY(time=289, latitude=321, longitude=433);
      :units = "none";
      :description = "Hourly significant hail parameter";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SHEAR06KM_HOURLY(time=289, latitude=321, longitude=433);
      :units = "ms-1";
      :description = "Hourly 0-6km vertical bulk shear";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float KINDEX_HOURLY(time=289, latitude=321, longitude=433);
      :units = "none";
      :description = "Hourly K-index";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float HAILPROB_HOURLY(time=289, latitude=321, longitude=433);
      :units = "%";
      :description = "Hourly hail probability";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float APP_TEMP_HOURLY(time=289, latitude=321, longitude=433);
      :units = "C";
      :description = "Hourly apparent temperature at 2m";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float TMAX_VERT_HOURLY(time=289, latitude=321, longitude=433);
      :units = "C";
      :description = "Hourly maximum vertical temperature";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float TDIFF_03_HOURLY(time=289, latitude=321, longitude=433);
      :units = "C";
      :description = "Hourly temperature difference between eta levels 0 and 3";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float ICON_CODE_HOURLY(time=289, latitude=321, longitude=433);
      :units = "0-SG 1-SN 2-N 3-PNG 4-PNN 5-MNG 6-MNN 7-RA 8-RASG 9-RASN 10-T 11-TSG 12-TSN 13-SN 14-FR 15-FG 16-RS";
      :description = "Hourly icon code";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float VISIBILITY_HOURLY(time=289, latitude=321, longitude=433);
      :units = "m";
      :description = "Hourly horizontal visibility";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float POTEVP_ACC_HOURLY(time=289, latitude=321, longitude=433);
      :units = "mm";
      :description = "Accumulated potential evaporation since model init";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float FROST_PROB_HOURLY(time=289, latitude=321, longitude=433);
      :_FillValue = -999.0f; // float
      :description = "Hourly frost probability";
      :units = "%";
      :_ChunkSizes = 1U, 40U, 40U; // uint

    float SLP_HOURLY(time=289, latitude=321, longitude=433);
      :units = "hPa";
      :description = "Hourly Sea Level Pressure";
      :_FillValue = -999.0f; // float
      :_ChunkSizes = 1U, 40U, 40U; // uint
      :_CoordinateAxisType = "Pressure";

    float latitude(latitude=321);
      :long_name = "Latitude";
      :description = "latitude";
      :units = "degrees_north";
      :_ChunkSizes = 40U; // uint
      :_CoordinateAxisType = "Lat";

    float longitude(longitude=433);
      :long_name = "Longitude";
      :description = "longitude";
      :units = "degrees_east";
      :_ChunkSizes = 40U; // uint
      :_CoordinateAxisType = "Lon";

    int time(time=289);
      :calendar = "standard";
      :long_name = "Time";
      :units = "hours since 1800-01-01 00:00:00";
      :name = "time";
      :_ChunkSizes = 1U; // uint
      :_CoordinateAxisType = "Time";

  // global attributes:
  :resolution = "0.04";
  :init = "20220412_00";
  :description = "Radarmeteo Forecast System output";
  :_CoordSysBuilder = "ucar.nc2.internal.dataset.conv.DefaultConventions";
}

Read SNOW_HOURLY [210:274, 76:223, 275:323]
Read SSWDIF_HOURLY [158:173, 191:291, 143:216]
Read CLCOVER_HOURLY [235:235, 268:289, 227:375]
Read U10M_HOURLY [118:120, 248:318, 427:430]
Read TC2M_HOURLY [281:283, 231:290, 308:315]
Read SKINTEMP_HOURLY [286:286, 282:300, 73:124]
Read WSPDX10M_HOURLY [271:273, 87:159, 55:193]
Read LEAFWET_HOURLY [277:279, 231:311, 292:332]
Read SSWDIR_HOURLY [246:249, 3:134, 299:332]
Read TDC2M_HOURLY [176:181, 57:138, 93:161]
Read KINDEX_HOURLY [20:20, 9:44, 208:209]
Read SSWTOT_HOURLY [277:283, 33:105, 270:316]
Read HAILPROB_HOURLY [279:283, 316:317, 331:403]
Read PREC_HOURLY [225:249, 302:315, 6:269]
Read SHEAR06KM_HOURLY [279:284, 112:190, 20:344]
Read SHP_HOURLY [245:265, 160:274, 428:430]
Read RH2M_HOURLY [247:285, 310:317, 374:404]
Read MUCAPE_HOURLY [231:268, 141:198, 199:221]
Read TMAX_VERT_HOURLY [244:275, 16:73, 4:87]
Read VISIBILITY_HOURLY [264:271, 88:305, 377:419]
Read SNOW_PROB_HOURLY [41:129, 148:195, 1:288]
Read QV2M_HOURLY [259:287, 42:193, 35:108]
Read FROST_PROB_HOURLY [186:218, 158:191, 218:323]
Read SLP_HOURLY [149:176, 117:193, 206:330]
Read V10M_HOURLY [222:284, 228:291, 91:107]
Read WDIR10M_HOURLY [110:219, 315:319, 379:380]
Read QV2M_HOURLY [167:167, 163:169, 134:327]
Read PREC_HOURLY [202:220, 103:162, 5:413]
Read POTEVP_ACC_HOURLY [175:236, 290:294, 19:422]
Read TC2M_HOURLY [257:280, 95:144, 185:276]
Read PREC_TYPE_HOURLY [42:215, 7:7, 236:237]
Read SNOW_HOURLY [201:267, 238:298, 262:343]
Read PREC_PROB_HOURLY [140:268, 238:318, 316:382]
Read PREC_TYPE_HOURLY [155:236, 311:311, 265:334]
Read TDC2M_HOURLY [198:232, 130:259, 152:199]
Read SSWDIR_HOURLY [170:176, 300:305, 230:238]
Read SMOIS_HOURLY [113:220, 273:274, 339:425]
Read RH2M_HOURLY [193:225, 236:307, 386:424]
Read WDIR10M_HOURLY [235:286, 237:263, 272:338]
Read SSWTOT_HOURLY [211:235, 29:95, 114:256]
Read PREC_PROB_HOURLY [67:192, 124:142, 393:406]
Read APP_TEMP_HOURLY [144:236, 81:161, 133:355]
Read SMOIS_HOURLY [54:159, 95:112, 327:340]
Read SNOW_PROB_HOURLY [48:187, 1:190, 114:132]
Read ICON_CODE_HOURLY [1:213, 98:256, 229:290]
Read KINDEX_HOURLY [272:281, 69:157, 393:407]
Read V10M_HOURLY [89:206, 169:172, 77:267]
Read WSPDX10M_HOURLY [90:246, 240:241, 82:225]
Read SHP_HOURLY [100:274, 315:315, 248:255]
Read CLCOVER_HOURLY [32:109, 218:299, 163:383]
Read VISIBILITY_HOURLY [125:142, 262:295, 261:403]
Read SHEAR06KM_HOURLY [174:241, 87:142, 221:297]
Read SSWDIF_HOURLY [61:201, 106:217, 108:171]
Read ICON_CODE_HOURLY [137:174, 29:40, 306:421]
Read PREC_HOURLY [264:275, 11:73, 84:166]
Read TDIFF_03_HOURLY [87:181, 200:214, 15:109]
Read U10M_HOURLY [35:269, 228:248, 403:422]
Read POTEVP_ACC_HOURLY [193:226, 144:224, 148:355]
Read LEAFWET_HOURLY [153:280, 27:242, 229:376]
Read SNOW_PROB_HOURLY [192:220, 137:181, 207:305]
Read TC2M_HOURLY [238:247, 172:261, 252:409]
Read MUCAPE_HOURLY [21:67, 56:311, 381:425]
java.io.IOException: Errore di input/output
    at java.base/java.io.RandomAccessFile.readBytes(Native Method)
    at java.base/java.io.RandomAccessFile.read(RandomAccessFile.java:406)
    at ucar.unidata.io.RandomAccessFile.read_(RandomAccessFile.java:738)
    at ucar.unidata.io.RandomAccessFile.readBuffer(RandomAccessFile.java:503)
    at ucar.unidata.io.RandomAccessFile.seek(RandomAccessFile.java:491)
    at ucar.nc2.iosp.hdf5.DataBTree$Node.<init>(DataBTree.java:163)
    at ucar.nc2.iosp.hdf5.DataBTree$Node.first(DataBTree.java:246)
    at ucar.nc2.iosp.hdf5.DataBTree$DataChunkIterator.<init>(DataBTree.java:130)
    at ucar.nc2.iosp.hdf5.DataBTree.getDataChunkIteratorFilter(DataBTree.java:67)
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.<init>(H5tiledLayoutBB.java:97)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:222)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:200)
    at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
    at ucar.nc2.Variable.reallyRead(Variable.java:817)
    at ucar.nc2.Variable._read(Variable.java:768)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:473)
    at ucar.nc2.dataset.VariableDS._read(VariableDS.java:448)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.Variable.read(Variable.java:574)
    at geosolutions.MultithreadedTestReader.lambda$main$2(MultithreadedTestReader.java:81)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
java.lang.IllegalStateException: DataBTree doesnt start with TREE
    at ucar.nc2.iosp.hdf5.DataBTree$Node.<init>(DataBTree.java:168)
    at ucar.nc2.iosp.hdf5.DataBTree$DataChunkIterator.<init>(DataBTree.java:128)
    at ucar.nc2.iosp.hdf5.DataBTree.getDataChunkIteratorFilter(DataBTree.java:67)
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.<init>(H5tiledLayoutBB.java:97)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:222)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:200)
    at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
    at ucar.nc2.Variable.reallyRead(Variable.java:817)
    at ucar.nc2.Variable._read(Variable.java:768)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:473)
    at ucar.nc2.dataset.VariableDS._read(VariableDS.java:448)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.Variable.read(Variable.java:574)
    at geosolutions.MultithreadedTestReader.lambda$main$2(MultithreadedTestReader.java:81)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
java.lang.IllegalStateException: DataBTree doesnt start with TREE
    at ucar.nc2.iosp.hdf5.DataBTree$Node.<init>(DataBTree.java:168)
    at ucar.nc2.iosp.hdf5.DataBTree$Node.first(DataBTree.java:246)
    at ucar.nc2.iosp.hdf5.DataBTree$Node.first(DataBTree.java:249)
    at ucar.nc2.iosp.hdf5.DataBTree$DataChunkIterator.<init>(DataBTree.java:130)
    at ucar.nc2.iosp.hdf5.DataBTree.getDataChunkIteratorFilter(DataBTree.java:67)
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.<init>(H5tiledLayoutBB.java:97)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:222)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:200)
    at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
    at ucar.nc2.Variable.reallyRead(Variable.java:817)
    at ucar.nc2.Variable._read(Variable.java:768)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:473)
    at ucar.nc2.dataset.VariableDS._read(VariableDS.java:448)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.Variable.read(Variable.java:574)
    at geosolutions.MultithreadedTestReader.lambda$main$2(MultithreadedTestReader.java:81)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Read TDC2M_HOURLY [172:187, 119:259, 0:257]
java.lang.IllegalArgumentException: Filtered data chunk is 2349400064 bytes and we must load it all into memory. However the maximum length of a byte array in Java is 2147483639.
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB$DataChunk.<init>(H5tiledLayoutBB.java:190)
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB$DataChunkIterator.next(H5tiledLayoutBB.java:161)
    at ucar.nc2.iosp.LayoutBBTiled.hasNext(LayoutBBTiled.java:81)
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.hasNext(H5tiledLayoutBB.java:128)
    at ucar.nc2.iosp.IospHelper.readData(IospHelper.java:360)
    at ucar.nc2.iosp.IospHelper.readDataFill(IospHelper.java:288)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:226)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:200)
    at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
    at ucar.nc2.Variable.reallyRead(Variable.java:817)
    at ucar.nc2.Variable._read(Variable.java:768)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:473)
    at ucar.nc2.dataset.VariableDS._read(VariableDS.java:448)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.Variable.read(Variable.java:574)
    at geosolutions.MultithreadedTestReader.lambda$main$2(MultithreadedTestReader.java:81)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
java.lang.IllegalStateException: DataBTree doesnt start with TREE
    at ucar.nc2.iosp.hdf5.DataBTree$Node.<init>(DataBTree.java:168)
    at ucar.nc2.iosp.hdf5.DataBTree$Node.first(DataBTree.java:246)
    at ucar.nc2.iosp.hdf5.DataBTree$Node.first(DataBTree.java:249)
    at ucar.nc2.iosp.hdf5.DataBTree$DataChunkIterator.<init>(DataBTree.java:130)
    at ucar.nc2.iosp.hdf5.DataBTree.getDataChunkIteratorFilter(DataBTree.java:67)
    at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.<init>(H5tiledLayoutBB.java:97)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:222)
    at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:200)
    at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
    at ucar.nc2.Variable.reallyRead(Variable.java:817)
    at ucar.nc2.Variable._read(Variable.java:768)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.dataset.VariableDS.reallyRead(VariableDS.java:473)
    at ucar.nc2.dataset.VariableDS._read(VariableDS.java:448)
    at ucar.nc2.Variable.read(Variable.java:600)
    at ucar.nc2.Variable.read(Variable.java:574)
    at geosolutions.MultithreadedTestReader.lambda$main$2(MultithreadedTestReader.java:81)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Read SNOW_HOURLY [165:263, 308:309, 143:407]
Read FROST_PROB_HOURLY [77:194, 238:315, 89:96]
...

Relevant log messages

No response

If you have an example file that you can share, please attach it to this issue.

If so, may we include it in our test datasets to help ensure the bug does not return once fixed? Note: the test datasets are publicly accessible without restriction.

Yes

Code of Conduct

aaime commented 5 months ago

And now a bit of analysis of the issue. As far as I can tell, the issue is due to the following:

tdrwenski commented 5 months ago

Hi @aaime, thanks for all the info. Would you mind testing again with the 5.5.4-SNAPSHOT version to see if the issue is resolved? It sounds similar to a race condition issue we fixed since the 5.5.3 release. You can download the latest snapshot here

aaime commented 5 months ago

Hi @tdrwenski yes, moving to the main-5.x snapshot branch helps with the synthetic program I've included in this report. Monday I will try with the full application.

Wondering, when is 5.5.4 going to be released? Can I do anything to help?

tdrwenski commented 5 months ago

Great! We hope to release 5.5.4 soon, but one more fix needs to go in so we can't give a concrete timeline for it.

aaime commented 5 months ago

Confirmed it does not break with the full application either. Looking forward to the release!