filodb / FiloDB

Distributed Prometheus time series database
Apache License 2.0

fix(core): Don't index part keys with invalid schema #1870

Closed rfairfax closed 3 days ago

rfairfax commented 3 days ago

When bootstrapping the raw index, we skip tracking items with invalid schemas, which are signified by partId = -1. However, we currently still index them, which can cause query errors later on, such as the following:

java.lang.IllegalStateException: This shouldn't happen since every document should have a partIdDv
    at filodb.core.memstore.PartIdCollector.collect(PartKeyLuceneIndex.scala:963)
    at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:305)
    at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:247)
    at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:38)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:776)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:551)
    at filodb.core.memstore.PartKeyLuceneIndex.$anonfun$searchFromFilters$1(PartKeyLuceneIndex.scala:635)
    at filodb.core.memstore.PartKeyLuceneIndex.$anonfun$searchFromFilters$1$adapted(PartKeyLuceneIndex.scala:635)
    at filodb.core.memstore.PartKeyLuceneIndex.withNewSearcher(PartKeyLuceneIndex.scala:279)
    at filodb.core.memstore.PartKeyLuceneIndex.searchFromFilters(PartKeyLuceneIndex.scala:635)
    at filodb.core.memstore.PartKeyLuceneIndex.partIdsFromFilters(PartKeyLuceneIndex.scala:591)
    at filodb.core.memstore.TimeSeriesShard.labelValuesWithFilters(TimeSeriesShard.scala:1782)

This fix ensures that we don't index part keys that we skip during bootstrap, so that the in-memory shard and the index stay consistent with each other.
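
The intent of the fix can be illustrated with a minimal, self-contained sketch. It is not the actual FiloDB code: `PartKeyRecord`, `Shard`, `Index`, and `bootstrap` are hypothetical stand-ins, and the only detail taken from the description above is that an invalid schema is signalled by partId = -1 and that such part keys should not be indexed.

```scala
import scala.collection.mutable

object BootstrapSketch {
  // Hypothetical stand-in for a persisted part key; not FiloDB's real type.
  final case class PartKeyRecord(key: String, schemaValid: Boolean)

  // Sentinel described in this PR: partId = -1 means "invalid schema, not tracked".
  val InvalidPartId: Int = -1

  // Hypothetical in-memory shard: assigns a partId, or -1 for an invalid schema.
  final class Shard {
    private var nextId = 0
    val tracked = mutable.Map.empty[Int, String]

    def track(rec: PartKeyRecord): Int =
      if (!rec.schemaValid) InvalidPartId
      else {
        val id = nextId
        nextId += 1
        tracked(id) = rec.key
        id
      }
  }

  // Hypothetical index: maps partId to part key (stands in for the Lucene index).
  final class Index {
    val docs = mutable.Map.empty[Int, String]
    def add(partId: Int, key: String): Unit = { docs(partId) = key }
  }

  // Bootstraps shard and index together; returns how many records were skipped.
  def bootstrap(records: Seq[PartKeyRecord], shard: Shard, index: Index): Int = {
    var skipped = 0
    records.foreach { rec =>
      val partId = shard.track(rec)
      if (partId == InvalidPartId) skipped += 1 // skipped by the shard, so not indexed either
      else index.add(partId, rec.key)
    }
    skipped
  }
}
```

With this guard in place, every partId present in the sketch's index is also tracked by the shard, which is the consistency property this PR is after.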

Pull Request checklist

Current behavior : (link existing issues here : https://help.github.com/articles/basic-writing-and-formatting-syntax/#referencing-issues-and-pull-requests)

Part keys with invalid schemas end up in the index

New behavior :

Part keys with invalid schemas are skipped and not indexed