apache / carbondata

High performance data store solution
carbondata.apache.org
Apache License 2.0

carbonReader read incomplete data #4275

Open Black-max12138 opened 2 years ago

Black-max12138 commented 2 years ago

Look at this line in hasNext(): boolean hasNext = currentReader.nextKeyValue(); If that call returns false and currentReader is not the last reader, hasNext() still returns false, so the iteration ends and the data in the remaining readers is never read. This happens whenever the reader that follows an exhausted one has no rows. How can this be solved?

  /**
   * Return true if has next row
   */
  public boolean hasNext() throws IOException, InterruptedException {
    if (0 == readers.size() || currentReader == null) {
      return false;
    }
    validateReader();
    if (currentReader.nextKeyValue()) {
      return true;
    } else {
      if (index == readers.size() - 1) {
        // no more readers
        return false;
      } else {
        // current reader is closed
        currentReader.close();
        // no need to keep a reference to CarbonVectorizedRecordReader,
        // until all the readers are processed.
        // If readers count is very high,
        // we get OOM as GC not happened for any of the content in CarbonVectorizedRecordReader
        readers.set(index, null);
        index++;
        currentReader = readers.get(index);
        boolean hasNext = currentReader.nextKeyValue();
        if (hasNext) {
          return true;
        }
      }
    }
    return false;
  }
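
One possible direction, as a minimal sketch rather than the project's actual fix: instead of probing only the single next reader, keep advancing in a loop until some reader returns a row or the last reader is exhausted. The types below (RowReader, ListRowReader, MultiReader, SkipEmptyReadersDemo) are hypothetical stand-ins written for this example so that it runs on its own; only the names nextKeyValue(), close(), readers, index and currentReader are taken from the snippet above, and validateReader() is left out.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the record reader type used in the snippet above;
// only the two calls that hasNext() relies on are modelled.
interface RowReader {
  boolean nextKeyValue() throws IOException;
  void close() throws IOException;
}

// Hypothetical in-memory reader, used only to exercise the loop.
class ListRowReader implements RowReader {
  private final Iterator<String> rows;

  ListRowReader(String... rows) {
    this.rows = Arrays.asList(rows).iterator();
  }

  @Override
  public boolean nextKeyValue() {
    if (rows.hasNext()) {
      rows.next();
      return true;
    }
    return false;
  }

  @Override
  public void close() {
    // nothing to release for an in-memory reader
  }
}

// Hypothetical holder that mirrors the fields used by the original hasNext().
class MultiReader {
  private final List<RowReader> readers;
  private int index = 0;
  private RowReader currentReader;

  MultiReader(List<? extends RowReader> readers) {
    this.readers = new ArrayList<RowReader>(readers);
    this.currentReader = this.readers.isEmpty() ? null : this.readers.get(0);
  }

  // Return true if any remaining reader still has a row.
  boolean hasNext() throws IOException {
    if (readers.isEmpty() || currentReader == null) {
      return false;
    }
    while (true) {
      if (currentReader.nextKeyValue()) {
        return true;
      }
      if (index == readers.size() - 1) {
        // no more readers
        return false;
      }
      // current reader is exhausted: close it, drop the reference so it can
      // be garbage collected, and try the next one instead of giving up
      currentReader.close();
      readers.set(index, null);
      index++;
      currentReader = readers.get(index);
    }
  }
}

class SkipEmptyReadersDemo {
  // As in the original, each successful hasNext() call advances by one row.
  static int countRows(MultiReader reader) throws IOException {
    int n = 0;
    while (reader.hasNext()) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    MultiReader reader = new MultiReader(Arrays.asList(
        new ListRowReader("a"),
        new ListRowReader(),          // empty reader in the middle
        new ListRowReader(),          // a second empty reader in a row
        new ListRowReader("b", "c")));
    System.out.println(countRows(reader)); // prints 3
  }
}
```

With the branch logic from the original snippet, the same input would stop after the first row, because the empty second reader makes the single nextKeyValue() probe return false. The loop keeps the existing close-and-null-out behaviour for each exhausted reader, so the memory consideration described in the original comments still holds.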