prometheusMetrics segmentsDone is less than segmentsTotal by 1 & no metric updated

I have two issues with the prometheusMetrics endpoint when running Cassandra 3.11.3 and cassandra-reaper 2.2.3.

segmentsDone is not updated for the last segment, so it ends up one less than segmentsTotal.

I have seen this since V1.4.1.
For example, a repair finished all 1029 segments, but the segmentsDone metric is stuck at 1028:

io_cassandrareaper_service_RepairRunner_segmentsTotal_qleartest_0f94b5c09fe611ea95c2f3a31e65ab00 1029.0
io_cassandrareaper_service_RepairRunner_segmentsTotal_qleartest_WafercloudData_0f94b5c09fe611ea95c2f3a31e65ab00 1029.0
io_cassandrareaper_service_RepairRunner_segmentsDone_qleartest_0f94b5c09fe611ea95c2f3a31e65ab00 1028.0
io_cassandrareaper_service_RepairRunner_segmentsDone_qleartest_WafercloudData_0f94b5c09fe611ea95c2f3a31e65ab00 1028.0
io_cassandrareaper_service_RepairRunner_repairProgress_qleartest_0f94b5c09fe611ea95c2f3a31e65ab00 0.999028205871582
io_cassandrareaper_service_RepairRunner_repairProgress_qleartest_WafercloudData_0f94b5c09fe611ea95c2f3a31e65ab00 0.999028205871582
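
To make the off-by-one concrete, here is a minimal Python sketch that parses three of the metric lines above and compares segmentsDone / segmentsTotal with the reported repairProgress. The names `SAMPLE`, `LINE` and `parse` are my own for illustration and are not part of Reaper.

```python
import re

# Three of the metric lines reported above, one per line.
SAMPLE = """\
io_cassandrareaper_service_RepairRunner_segmentsTotal_qleartest_0f94b5c09fe611ea95c2f3a31e65ab00 1029.0
io_cassandrareaper_service_RepairRunner_segmentsDone_qleartest_0f94b5c09fe611ea95c2f3a31e65ab00 1028.0
io_cassandrareaper_service_RepairRunner_repairProgress_qleartest_0f94b5c09fe611ea95c2f3a31e65ab00 0.999028205871582
"""

# Metric kind, then the rest of the metric name, then the value.
LINE = re.compile(
    r"^io_cassandrareaper_service_RepairRunner_"
    r"(segmentsTotal|segmentsDone|repairProgress)_(\S+)\s+(\S+)$"
)


def parse(text):
    """Group the three RepairRunner gauges by the suffix of the metric name."""
    grouped = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            kind, suffix, value = match.groups()
            grouped.setdefault(suffix, {})[kind] = float(value)
    return grouped


for suffix, gauges in parse(SAMPLE).items():
    missing = gauges["segmentsTotal"] - gauges["segmentsDone"]
    ratio = gauges["segmentsDone"] / gauges["segmentsTotal"]
    print(suffix, "missing segments:", missing)
    print(suffix, "done/total:", ratio, "reported progress:", gauges["repairProgress"])
```

The reported repairProgress matches 1028 / 1029 to within rounding, so the progress gauge appears to be derived from the same stale segmentsDone value.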

There are no new metrics for a new repair

I have seen this issue since V2.0.4.
The segmentsTotal, segmentsDone & repairProgress metrics for a new repair run show up only after I restart cassandra-reaper.
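
One way to check this without restarting is to compare which run IDs the endpoint exposes before and after starting a repair. The sketch below is only an illustration: `fetch_metrics` and `run_ids` are hypothetical helper names, and it assumes the trailing 32-hex-character token in each metric name is the repair run ID without dashes.

```python
import re
import urllib.request

# Assumption: the metric name ends with the repair run ID as 32 hex characters.
RUN_ID = re.compile(
    r"^io_cassandrareaper_service_RepairRunner_segmentsTotal_\S*?([0-9a-f]{32})\s",
    re.MULTILINE,
)


def fetch_metrics(url, timeout=5.0):
    """Download the metrics text from the given URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def run_ids(text):
    """Return the set of run IDs that have a segmentsTotal gauge."""
    return set(RUN_ID.findall(text))
```

With the configuration below, the endpoint would be on the admin port, so for example `run_ids(fetch_metrics("http://localhost:8081/prometheusMetrics"))`, where `localhost` is an assumption about where Reaper runs. Taking the set difference of two snapshots shows whether a newly started run has appeared.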

My environment

Cassandra: V3.11.3
cassandra-reaper: V2.2.3
- Cassandra backend DB
- Configuration file

# See a bit more complete example in:
# src/server/src/test/resources/cassandra-reaper.yaml
segmentCount: 200
repairParallelism: PARALLEL
repairIntensity: 0.9
scheduleDaysBetween: 7
repairRunThreadCount: 15
hangingRepairTimeoutMins: 60
storageType: cassandra
enableCrossOrigin: true
incrementalRepair: false
enableDynamicSeedList: true
repairManagerSchedulingIntervalSeconds: 10
activateQueryLogger: false
jmxConnectionTimeoutInSeconds: 5
useAddressTranslator: false

# datacenterAvailability has three possible values: ALL | LOCAL | EACH
# the correct value to use depends on whether jmx ports to C* nodes in remote datacenters are accessible
# If the reaper has access to all node jmx ports, across all datacenters, then configure to ALL.
# If jmx access is only available to nodes in the same datacenter as reaper is running in, then configure to LOCAL.
# If there's a reaper instance running in every datacenter, and it's important that nodes under duress are not involved in repairs,
#    then configure to EACH.
#
# The default is ALL
datacenterAvailability: ALL

#jmxPorts:

#jmxAuth:
#  username: myUsername
#  password: myPassword

logging:
  level: INFO
  loggers:
    com.datastax.driver.core.QueryLogger.NORMAL:
      level: DEBUG
      additive: false
      appenders:
        - type: file
          currentLogFilename: /var/log/cassandra-reaper/query-logger.log
          archivedLogFilenamePattern: /var/log/cassandra-reaper/query-logger-%d.log.gz
          archivedFileCount: 10
    io.dropwizard: WARN
    org.eclipse.jetty: WARN
  appenders:
    - type: console
      logFormat: "%-6level [%d] [%t] %logger{5} - %msg %n"
      threshold: WARN
    - type: file
      logFormat: "%-6level [%d] [%t] %logger{5} - %msg %n"
      currentLogFilename: /var/log/cassandra-reaper/reaper.log
      archivedLogFilenamePattern: /var/log/cassandra-reaper/reaper-%d.log.gz
      archivedFileCount: 30

server:
  type: default
  applicationConnectors:
    - type: http
      port: 8080
      bindHost: 0.0.0.0
  adminConnectors:
    - type: http
      port: 8081
      bindHost: 0.0.0.0
  requestLog:
    appenders: []

cassandra:
  clusterName: "QlearTest"
  contactPoints: ["db1","db2","db3"]
  keyspace: reaper_db
  loadBalancingPolicy:
    type: tokenAware
    shuffleReplicas: true
    subPolicy:
      type: dcAwareRoundRobin
      localDC:
      usedHostsPerRemoteDC: 0
      allowRemoteDCsForLocalConsistencyLevel: false

autoScheduling:
  enabled: false
  initialDelayPeriod: PT15S
  periodBetweenPolls: PT10M
  timeBeforeFirstSchedule: PT5M
  scheduleSpreadPeriod: PT6H
  excludedKeyspaces:
    - keyspace1
    - keyspace2

# Uncomment the following to enable dropwizard metrics
#  Configure to the reporter of your choice
#  Reaper also provides prometheus metrics on the admin port at /prometheusMetrics

#metrics:
#  frequency: 1 minute
#  reporters:
#    - type: log
#      logger: metrics

Repair schedule

Please advise. Maybe I missed something.


thelastpickle / cassandra-reaper

prometheusMetrics segmentsDone is less than segmentsTotal by 1 & no metric updated #1051