m3db / m3

M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform
https://m3db.io/
Apache License 2.0

Using m3query to query data that is not currently written to disk, the results are randomly missing some series #2057

Open · xiangyi-yang opened this issue 4 years ago

xiangyi-yang commented 4 years ago

Querying the latest data that has not yet been written to disk has a higher probability of missing series. If the query time span covers part of the current block and part of the data has already been written to disk, missing data rarely occurs.

calvin1978 commented 4 years ago

We hit the same problem. My block size is 6 hours, and data is flushed to disk at 2:00, 8:00, 14:00 and 20:00.

At 19:50, when we query 1 hour of data, series go missing randomly. When we query 6 hours of data the problem is still there, but when we query 7 hours of data everything is OK!

At 20:10 everything is OK too, even if we query only 1 hour of data.

This happens so often that it is blocking my project from being used in production.

Is this a common case, or just a configuration problem? Thank you.

arnikola commented 4 years ago

Would you mind sharing your namespace and coordinator configs (also whether you're using Graphite-style metrics or Prom-style)?

Does this only happen with m3query, or do you also see the issue when you use Prometheus remote read?

xiangyi-yang commented 4 years ago

Series are missing both when accessing via m3query and via Prometheus. The configuration is as follows:

namespace:

            "app_1m": {
                "bootstrapEnabled": true,
                "cleanupEnabled": true,
                "coldWritesEnabled": false,
                "flushEnabled": true,
                "indexOptions": {
                    "blockSizeNanos": "21600000000000",
                    "enabled": true
                },
                "repairEnabled": false,
                "retentionOptions": {
                    "blockDataExpiry": true,
                    "blockDataExpiryAfterNotAccessPeriodNanos": "300000000000",
                    "blockSizeNanos": "21600000000000",
                    "bufferFutureNanos": "600000000000",
                    "bufferPastNanos": "1200000000000",
                    "futureRetentionPeriodNanos": "0",
                    "retentionPeriodNanos": "64368000000000000"
                },
                "schemaOptions": null,
                "snapshotEnabled": true,
                "writesToCommitLog": true
            }
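
For anyone decoding the nanosecond fields above, here is a small Go sketch that converts them to human-readable durations (the values are copied verbatim from the JSON; nothing else is assumed). They work out to a 6h block size, a 10m future buffer, a 20m past buffer, and the 17880h retention that also appears in the coordinator config below.

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        // Values copied from the namespace config above. time.Duration is
        // measured in nanoseconds, so they print directly as durations.
        fmt.Println("blockSizeNanos:      ", time.Duration(21600000000000))    // 6h0m0s
        fmt.Println("bufferFutureNanos:   ", time.Duration(600000000000))      // 10m0s
        fmt.Println("bufferPastNanos:     ", time.Duration(1200000000000))     // 20m0s
        fmt.Println("retentionPeriodNanos:", time.Duration(64368000000000000)) // 17880h0m0s
    }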

coordinator:

listenAddress:
  type: "config"
  value: "0.0.0.0:17201"
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:17202
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
clusters:
   - namespaces:
       - namespace: app_1m
         retention: 17880h
         type: unaggregated
     client:
       config:
         service:
           env: default_env
           zone: embedded
           service: m3db
           cacheDir: /apps/dat/m3db/m3kv_coordinator
           etcdClusters:
             - zone: embedded
               endpoints:
                 - x.x.x.x:2379
                 - y.y.y.y:2379
                 - z.z.z.z:2379
       writeConsistencyLevel: majority
       readConsistencyLevel: unstrict_majority
       writeTimeout: 10s
       fetchTimeout: 15s
       connectTimeout: 20s
       writeRetry:
         initialBackoff: 500ms
         backoffFactor: 3
         maxRetries: 2
         jitter: true
       fetchRetry:
         initialBackoff: 500ms
         backoffFactor: 2
         maxRetries: 3
         jitter: true
       backgroundHealthCheckFailLimit: 4
       backgroundHealthCheckFailThrottleFactor: 0.5
tagOptions:
  idScheme: quoted

query:

listenAddress:
  type: "config"
  value: "0.0.0.0:17203"
metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:17204
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
tagOptions:
  idScheme: quoted

#limits:
#  perQuery:
#    maxFetchedSeries: 500

clusters:
  - namespaces:
      - namespace: app_1m
        type: unaggregated
        retention: 17880h
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /apps/dat/m3db/m3kv
          etcdClusters:
            - zone: embedded
              endpoints:
                 - x.x.x.x:2379
                 - y.y.y.y:2379
                 - z.z.z.z:2379
      writeConsistencyLevel: majority
      readConsistencyLevel: all
      writeTimeout: 10s
      fetchTimeout: 30s
      connectTimeout: 20s
      writeRetry:
        initialBackoff: 500ms
        backoffFactor: 3
        maxRetries: 2
        jitter: true
      fetchRetry:
        initialBackoff: 500ms
        backoffFactor: 2
        maxRetries: 3
        jitter: true
      backgroundHealthCheckFailLimit: 4
      backgroundHealthCheckFailThrottleFactor: 0.5

dbnode:

db:
  logging:
    level: info
  metrics:
    prometheus:
      handlerPath: /metrics
    sanitization: prometheus
    samplingRate: 1.0
    extended: detailed
  hostID:
    # resolver: environment
    # envVarName: M3DB_HOST_ID
    resolver: hostname
    envVarName: M3DB-DBNODE-A-002
# Fill-out the following and un-comment before using.
  config:
    service:
      env: default_env
      zone: embedded
      service: m3db
      cacheDir: /apps/dat/m3db/m3kv
      etcdClusters:
        - zone: embedded
          endpoints:
            - x.x.x.x:2379
            - y.y.y.y:2379
            - z.z.z.z:2379
  listenAddress: 0.0.0.0:9000
  clusterListenAddress: 0.0.0.0:9001
  httpNodeListenAddress: 0.0.0.0:9002
  httpClusterListenAddress: 0.0.0.0:9003
  debugListenAddress: 0.0.0.0:9004
  client:
    writeConsistencyLevel: majority
    readConsistencyLevel: majority
  gcPercentage: 80
  writeNewSeriesAsync: true
  writeNewSeriesLimitPerSecond: 1048576
  writeNewSeriesBackoffDuration: 2ms
  bootstrap:
    bootstrappers:
        - filesystem
        - peers
        - commitlog
        - uninitialized_topology
    fs:
      numProcessorsPerCPU: 0.125
    commitlog:
      returnUnfulfilledForCorruptCommitLogFiles: false
  cache:
    series:
      policy: lru
    postingsList:
      size: 262144
  commitlog:
    flushMaxBytes: 524288
    flushEvery: 5s
    queue:
      calculationType: fixed
      size: 2097152
  fs:
    filePathPrefix: /apps/dat/m3db

arnikola commented 4 years ago

You're probably hitting the limit on index reads when you query.

Check for the M3-Results-Limited response header. It's not super surprising considering the size of your unaggregated namespace, which we recommend keeping to at most 48 hours.

arnikola commented 4 years ago

You can also increase the limit, either in the configs or by adding &limit=XXXX to your query string, but this may cause your cluster to OOM.
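
One quick way to check for this (a minimal sketch: the /api/v1/query_range path is assumed to be the query service's Prometheus-compatible read endpoint, the port comes from the query config above, and the PromQL expression, time range and limit values are placeholders):

    package main

    import (
        "fmt"
        "io"
        "net/http"
        "net/url"
    )

    func main() {
        // Placeholder endpoint based on the query config in this thread
        // (listenAddress 0.0.0.0:17203); adjust host, port and path as needed.
        base := "http://localhost:17203/api/v1/query_range"

        params := url.Values{}
        params.Set("query", "up")                   // placeholder PromQL expression
        params.Set("start", "2020-01-01T19:00:00Z") // placeholder 1h range
        params.Set("end", "2020-01-01T20:00:00Z")
        params.Set("step", "60s")
        params.Set("limit", "10000") // raise the per-query limit, as suggested above

        resp, err := http.Get(base + "?" + params.Encode())
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        io.Copy(io.Discard, resp.Body) // the body is not needed for this check

        // A non-empty value here means the result set was truncated by a limit.
        fmt.Println("M3-Results-Limited:", resp.Header.Get("M3-Results-Limited"))
    }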

xiangyi-yang commented 4 years ago

The query returns about 100 series, the query time span is 1 hour, and the granularity is 1 minute, so the limit should not be reached. Data for the query time period is only missing when that data is still in the buffer; if the query time period also contains data that has been flushed to disk, no series are missing.

xiangyi-yang commented 4 years ago

We found that the cause of the missing data was that indexing of new data was too slow: the "record the end to end indexing latency" step slowed down indexing, and after commenting out this code everything returned to normal.

m3/src/dbnode/storage/index.go

    // record the end to end indexing latency
    now := i.nowFn()
    for idx := range pending {
        took := now.Sub(pending[idx].EnqueuedAt)
        i.metrics.InsertEndToEndLatency.Record(took)
    }

ning1875 commented 4 years ago

// record the end to end indexing latency

Is this just because a summary-type metric has a performance problem? Are observations expensive due to the streaming quantile calculation, the same as with a Prometheus summary? https://prometheus.io/docs/practices/histograms/
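
To make the cost being discussed here concrete, below is a self-contained sketch (not M3DB code: latencyRecorder is a deliberately simple stand-in for a timer/summary metric, and the entry counts are arbitrary). It only illustrates that emitting one observation per pending entry on the insert hot path scales with batch size, and that sampling observations is one possible way to cut that overhead; the workaround actually applied above was simply removing the recording loop.

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    // latencyRecorder stands in for i.metrics.InsertEndToEndLatency: each
    // observation takes a lock and stores a value, which is roughly the
    // shape of work a timer/summary metric does per Record call.
    type latencyRecorder struct {
        mu  sync.Mutex
        obs []time.Duration
    }

    func (r *latencyRecorder) Record(d time.Duration) {
        r.mu.Lock()
        r.obs = append(r.obs, d)
        r.mu.Unlock()
    }

    // pendingEntry stands in for the queued index inserts in index.go.
    type pendingEntry struct {
        enqueuedAt time.Time
    }

    func main() {
        const numPending = 1_000_000
        pending := make([]pendingEntry, numPending)
        for i := range pending {
            pending[i] = pendingEntry{enqueuedAt: time.Now()}
        }

        now := time.Now()

        // Original pattern: one observation per pending entry.
        all := &latencyRecorder{}
        start := time.Now()
        for idx := range pending {
            all.Record(now.Sub(pending[idx].enqueuedAt))
        }
        fmt.Printf("record every entry: %d observations in %v\n", len(all.obs), time.Since(start))

        // Sampled alternative: record only every Nth entry.
        const sampleEvery = 100
        sampled := &latencyRecorder{}
        start = time.Now()
        for idx := range pending {
            if idx%sampleEvery == 0 {
                sampled.Record(now.Sub(pending[idx].enqueuedAt))
            }
        }
        fmt.Printf("record 1 in %d:     %d observations in %v\n", sampleEvery, len(sampled.obs), time.Since(start))
    }

This is only a sketch of the shape of the overhead; the real cost in M3DB depends on the reporter backing the metric, which is what the Prometheus histogram/summary discussion linked above is about.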