apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0
5.49k stars 1.28k forks source link

Avoid use of checking and casting to LLRealtimeSegmentDataManager in SegmentOfflineStateModelFactory #10049

Open jugomezv opened 1 year ago

jugomezv commented 1 year ago

As example take a look at this code in SegmentOnlineOfflineStateModelFactory.java::onBecomeOnlineFromConsuming()


  try {
        if (!(acquiredSegment instanceof LLRealtimeSegmentDataManager)) {
          // We found an LLC segment that is not consuming right now, must be that we already swapped it with a
          // segment that has been built. Nothing to do for this state transition.
          _logger
              .info("Segment {} not an instance of LLRealtimeSegmentDataManager. Reporting success for the transition",
                  acquiredSegment.getSegmentName());
          return;
        }
        LLRealtimeSegmentDataManager segmentDataManager = (LLRealtimeSegmentDataManager) acquiredSegment;
        SegmentZKMetadata segmentZKMetadata = ZKMetadataProvider
            .getSegmentZKMetadata(_instanceDataManager.getPropertyStore(), realtimeTableName, segmentNameStr);
        segmentDataManager.goOnlineFromConsuming(segmentZKMetadata);
      } catch (InterruptedException e) {
        String errorMessage = String.format("State transition interrupted for segment %s.", segmentNameStr);
        _logger.warn(errorMessage, e);
        tableDataManager
            .addSegmentError(segmentNameStr, new SegmentErrorInfo(System.currentTimeMillis(), errorMessage, e));
        throw new RuntimeException(e);
      } finally {
        tableDataManager.releaseSegment(acquiredSegment);
      }

The details of realtime segment data manger should be handled in that class on a method called when this transitions for all types of segment data managers. Default NOP implementation can be used for other types of segement data managers

Jackie-Jiang commented 1 year ago

This is for error handling purpose when Helix is re-deliver the message. IMO I don't want to introduce consuming state concept into the general SegmentDataManager because it doesn't apply to offline segment.

jugomezv commented 1 year ago

But wouldn't that be cleaner? I can draft the change and you can tell us what you think. Seems to me that doing instance of breaks encapsulation.