grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Bloom Filters fail when using sharded Chunk Store Buckets #13575

Open diranged opened 3 months ago

diranged commented 3 months ago

Describe the bug

We have a "staging" and a "production" Loki environment. The only fundamental difference between them is that production uses sharded S3 buckets (8 buckets) to stay under API rate limits. In staging, bloom filtering is working, judging by the responsiveness of the queries and by the metrics. In production, with the identical Loki config, the metrics show that bloom filters are not working for queries.

Staging Proof

image

Production Broken Though

image

In our "staging" environment, we can see the blooms directory being populated properly with blooms/blocks/XX/xx.gz and blooms/metas/xx.json files in the single bucket:

2024-07-16 07:01:10  156319814 bloom/tsdb_index_19918/dev/blocks/0007c9be19e064f3-814dbf6b8eb6bf71/1720913330527-1721003435604-337fdab2.tar.gz
2024-07-16 07:49:16  156509584 bloom/tsdb_index_19918/dev/blocks/0007c9be19e064f3-814dbf6b8eb6bf71/1720913330527-1721003435604-a3fa582.tar.gz
2024-07-16 07:58:18  161688835 bloom/tsdb_index_19918/dev/blocks/8158fb888043e969-ff5725e95dab9fcb/1720912945319-1721003397610-6248166b.tar.gz
2024-07-16 07:58:22     186813 bloom/tsdb_index_19918/dev/blocks/ff69a6398372717c-fff213b47582d502/1720914399175-1721002414177-6517586f.tar.gz
2024-07-16 07:58:22        722 bloom/tsdb_index_19918/dev/metas/0000000000000000-ffffffffffffffff-adf595b2.json
2024-07-16 08:11:25  137205752 bloom/tsdb_index_19918/staging/blocks/00021c1a82c1f156-10b8bd44a074fe0b/1720913391399-1721003458978-ebcbd4c9.tar.gz
2024-07-16 08:24:53  143998415 bloom/tsdb_index_19918/staging/blocks/10b93aa4bc386dc6-1e07cb941c9c091c/1720913317284-1721004216476-a198fa12.tar.gz
2024-07-16 08:35:23  141573240 bloom/tsdb_index_19918/staging/blocks/1e0b52a85c721c65-331a95045b099d44/1720913326044-1721004042676-9a0adc13.tar.gz
2024-07-16 08:45:41  143277046 bloom/tsdb_index_19918/staging/blocks/331f3703a07a4b0d-4ad636417013f3ea/1720913445231-1721003389231-7ce86bcc.tar.gz
...

However, in our production environment, where we have multiple buckets, the bloom files are spread across the buckets in a seemingly random pattern. Here is the distribution of files across the 8 buckets:

# Bucket 1
2024-07-16 12:29:41     256637 bloom/tsdb_index_19918/dev/blocks/ff69a6398372717c-fff213b47582d502/1720914399175-1721002414177-9d11b0c.tar.gz
2024-07-17 08:38:28  225074089 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-ae7c1c59.tar.gz
2024-07-17 16:58:58  226677377 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-c96f0464.tar.gz
2024-07-16 23:52:39  145837725 bloom/tsdb_index_19919/production/blocks/001f9d0fbc469812-003b324498b3869d/1720999396297-1721089878838-222d8163.tar.gz
2024-07-17 00:43:37  291619441 bloom/tsdb_index_19919/production/blocks/003b403bb3b5c51d-0040d5d05bd87b27/1721000405721-1721089548504-be44b08a.tar.gz
2024-07-18 00:16:17  160692074 bloom/tsdb_index_19920/dev/blocks/59fafb3edcee1577-b8b2dfff43f1f5e0/1721085818536-1721177014760-9df10f1a.tar.gz
2024-07-18 01:55:08  281747004 bloom/tsdb_index_19920/production/blocks/003e142198f26a29-0040d5d05bd87b27/1721087435920-1721175261219-2849c9ba.tar.gz
2024-07-18 02:57:31  256506400 bloom/tsdb_index_19920/production/blocks/0040e19b15342ec8-004dedaa5afedf77/1721085951946-1721176663547-8b1c8ebf.tar.gz
2024-07-18 03:10:03  135794114 bloom/tsdb_index_19920/production/blocks/004df8b8d047ecef-005b5b53c2b1b22a/1721086489640-1721176537358-510552b4.tar.gz
2024-07-18 03:30:24  172750187 bloom/tsdb_index_19920/production/blocks/0065da94ccb5f12d-007b03032175218b/1721085774247-1721175810489-9defa18d.tar.gz
2024-07-18 03:54:03  158342991 bloom/tsdb_index_19920/production/blocks/007b076a02ef13a6-00931430a623a0dd/1721085629882-1721176318687-840c3198.tar.gz
2024-07-17 19:33:08  157178898 bloom/tsdb_index_19920/staging/blocks/1a3d4f38a7be9420-270c7085c6f99cbd/1721086146120-1721176500588-955adaee.tar.gz
2024-07-17 20:09:42  155102265 bloom/tsdb_index_19920/staging/blocks/335da8c17afb468b-44cc59f8ad222a3e/1721085174156-1721177199646-85fcd10a.tar.gz
2024-07-17 20:21:28  149289194 bloom/tsdb_index_19920/staging/blocks/44ccba4486036351-4f93452f283ea022/1721085397532-1721176862211-c6386022.tar.gz
2024-07-17 21:21:27  153958466 bloom/tsdb_index_19920/staging/blocks/6d9af2c79d0ea97e-7afc881c7a1c0349/1721086501112-1721177282406-42475dc.tar.gz
2024-07-17 22:40:56  141447934 bloom/tsdb_index_19920/staging/blocks/b1b26859ffadf320-bc740a6a960d5a1d/1721086246407-1721176473724-1b9d4b22.tar.gz
2024-07-17 22:53:23  154703238 bloom/tsdb_index_19920/staging/blocks/bc75e7d61a75870b-c8bd6ed878ff4a65/1721086139177-1721177198937-d3399127.tar.gz

# Bucket 2
2024-07-17 16:18:37  225340160 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-3978b72e.tar.gz
2024-07-16 21:30:45  226376287 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-b5f0f14e.tar.gz
2024-07-17 03:20:47  152185350 bloom/tsdb_index_19919/production/blocks/0098609da8519676-00a9b6121b874ce1/1720999304120-1721089820700-5d3e37ce.tar.gz
2024-07-18 00:05:29  141914256 bloom/tsdb_index_19920/dev/blocks/000af8c3aa67cb6a-59f184991d29fe98/1721085658367-1721176850663-d41fd637.tar.gz
2024-07-18 00:49:09  162207087 bloom/tsdb_index_19920/production/blocks/000020abf5e6dcef-00227dca574034f7/1721085558091-1721176851718-8700c9df.tar.gz
2024-07-18 04:43:44  155670199 bloom/tsdb_index_19920/production/blocks/00931bf3707dde03-00995ad7abf8ca86/1721087397838-1721176321461-afeda900.tar.gz

# Bucket 3
2024-07-17 12:25:14  226229643 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-ae28ecec.tar.gz
2024-07-17 18:18:22  168282819 bloom/tsdb_index_19919/production/blocks/0014da0fee61de7a-001f950cf8be81bb/1721000761353-1721089869209-a18c95a1.tar.gz
2024-07-17 01:47:16  159949219 bloom/tsdb_index_19919/production/blocks/005b2ad17bc9ecd4-0078938ce6af3dad/1721000129416-1721089834022-e356a1de.tar.gz
2024-07-17 02:11:53  166107961 bloom/tsdb_index_19919/production/blocks/0078aba1a1b1660b-0098523478416d30/1720999410725-1721089119988-dfd67597.tar.gz
2024-07-17 04:01:55  122133125 bloom/tsdb_index_19919/production/blocks/00a9b84b3e014f48-00bd42138bec0c74/1721000057722-1721089254442-f20ad202.tar.gz
2024-07-18 08:05:46  171616822 bloom/tsdb_index_19920/production/blocks/00a942284731ce84-00b44b79bdc792ce/1721086805600-1721176295902-5b6d6282.tar.gz
2024-07-17 22:27:34  142155735 bloom/tsdb_index_19920/staging/blocks/a2bf393a01daa815-b1b257c4430b5b98/1721085903114-1721176866461-35195c11.tar.gz
2024-07-17 23:05:24  151477456 bloom/tsdb_index_19920/staging/blocks/c8c04c585df80048-d8988506e5be501f/1721086050059-1721176542670-cdbbc6d6.tar.gz
2024-07-17 23:19:24  147432616 bloom/tsdb_index_19920/staging/blocks/d899248ca9f30dc1-e6d24d77d731193b/1721085046098-1721176881902-87333e30.tar.gz

# Bucket 4
2024-07-16 12:29:38  162256334 bloom/tsdb_index_19918/dev/blocks/8158fb888043e969-ff5725e95dab9fcb/1720912729791-1721003397610-78890566.tar.gz
2024-07-17 14:03:53  228717499 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-cdd3e7f1.tar.gz
2024-07-16 22:09:30  229773406 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-d6290655.tar.gz
2024-07-17 15:04:26  168484660 bloom/tsdb_index_19919/production/blocks/0014da0fee61de7a-001f950cf8be81bb/1721000761353-1721089869209-c770e4c9.tar.gz
2024-07-18 00:25:32        721 bloom/tsdb_index_19920/dev/metas/0000000000000000-ffffffffffffffff-c2fafaa2.json
2024-07-18 03:19:18  152844150 bloom/tsdb_index_19920/production/blocks/005b70318b723628-0065d7014c3e4313/1721085824693-1721176296366-2ae62585.tar.gz
2024-07-17 23:52:22       3941 bloom/tsdb_index_19920/staging/metas/0000000000000000-ffffffffffffffff-cc224507.json

# Bucket 5
2024-07-16 12:29:41        721 bloom/tsdb_index_19918/dev/metas/0000000000000000-ffffffffffffffff-31db6c3.json
2024-07-18 08:46:46  154903668 bloom/tsdb_index_19920/production/blocks/00b4661cfc2abdb8-00bf0d136f913aee/1721085935235-1721176129571-93ee48d6.tar.gz
2024-07-17 18:46:31  147298508 bloom/tsdb_index_19920/staging/blocks/000004e7ca5b53be-0c39003271b5144c/1721086241000-1721177291603-739423c8.tar.gz
2024-07-17 21:50:20  168687798 bloom/tsdb_index_19920/staging/blocks/890c012f3e9d8ca0-9551f3e7e15405cc/1721085884838-1721177271466-42af4c2a.tar.gz

# Bucket 6
2024-07-16 19:33:13  227686902 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-9312390.tar.gz
2024-07-17 15:52:12  224520120 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-f3ec922e.tar.gz
2024-07-17 01:23:48  177009700 bloom/tsdb_index_19919/production/blocks/0040dbd43186fd44-005b24fe0ffea412/1720999771764-1721089361420-7aea8ce2.tar.gz
2024-07-18 00:25:32  103023904 bloom/tsdb_index_19920/dev/blocks/b8d30e8d927588ad-ffff73499274a244/1721085676424-1721177327645-67bf993.tar.gz
2024-07-17 19:06:55  158983920 bloom/tsdb_index_19920/staging/blocks/0c40b04012dee535-1a39f22db82d72d1/1721085019298-1721177162442-2a04273.tar.gz
2024-07-17 19:52:12  150405341 bloom/tsdb_index_19920/staging/blocks/270f18c9445934d1-335bc22e8c879cc6/1721085949568-1721177278600-39c69b46.tar.gz
2024-07-17 21:09:47  165328000 bloom/tsdb_index_19920/staging/blocks/5d4f2504b7b548b4-6d97639f74440c6a/1721085044825-1721176457541-98d0d10a.tar.gz
2024-07-17 22:07:37  154530680 bloom/tsdb_index_19920/staging/blocks/95522c7423f1f8b3-a2bdcdf7b4b07cad/1721085019298-1721176335892-7ec7dc74.tar.gz

# Bucket 7
2024-07-16 11:48:12  158610741 bloom/tsdb_index_19918/dev/blocks/0007c9be19e064f3-814dbf6b8eb6bf71/1720912910099-1721003435604-1ddf5f35.tar.gz
2024-07-16 11:29:33  158405329 bloom/tsdb_index_19918/dev/blocks/0007c9be19e064f3-814dbf6b8eb6bf71/1720912910099-1721003435604-55710749.tar.gz
2024-07-17 13:27:34  230979709 bloom/tsdb_index_19919/production/blocks/0000057fc36ec0f8-0014a3582a08c749/1720998824296-1721089666220-70bffa33.tar.gz
2024-07-18 07:13:31  296862801 bloom/tsdb_index_19920/production/blocks/00996ea0257a7464-00a92ffd089161ac/1721085573848-1721176202131-f4d6e419.tar.gz
2024-07-17 20:52:24  157047670 bloom/tsdb_index_19920/staging/blocks/4f96e19046641d29-5d46d9e29e1f1f51/1721085046609-1721176458453-102bdc9.tar.gz
2024-07-17 21:33:41  162541911 bloom/tsdb_index_19920/staging/blocks/7afff32ed5d1b2df-890b0a02a4717780/1721086150100-1721177306084-d81b0701.tar.gz

# Bucket 8
2024-07-16 12:18:33  157904003 bloom/tsdb_index_19918/dev/blocks/0007c9be19e064f3-814dbf6b8eb6bf71/1720912910099-1721003435604-ff30464f.tar.gz
2024-07-17 05:57:36  236242091 bloom/tsdb_index_19919/production/blocks/00bd5a6ccc3d5b18-00cac4feae5886b1/1721000203357-1721089855197-3b578d7f.tar.gz
2024-07-18 01:07:48  173716417 bloom/tsdb_index_19920/production/blocks/00228551611fec7a-003df85103f1f27a/1721085473902-1721175954872-644f2f42.tar.gz
2024-07-17 18:56:58  147039202 bloom/tsdb_index_19920/staging/blocks/000004e7ca5b53be-0c39003271b5144c/1721086241000-1721177291603-b4fb04d8.tar.gz
2024-07-17 23:34:10  178765514 bloom/tsdb_index_19920/staging/blocks/e6d2d7d10af69bc5-f5c0962900fe96ba/1721086115047-1721176858950-ce4adfec.tar.gz
2024-07-17 23:52:20  124075721 bloom/tsdb_index_19920/staging/blocks/f5c2a0a3c9052acb-ffff1e2c03010f54/1721085833198-1721177281091-86426bfa.tar.gz
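
To compare listings like the ones above across buckets, a small script can tally objects per tenant and object type. The sketch below is an illustration, not part of Loki. It assumes each line has the form `date time size key`, and that keys follow the `bloom/<table>/<tenant>/<blocks|metas>/...` layout visible in the listings. Reading the third path segment as the tenant is an inference from the listings, and the function names are made up for this example.

```python
from collections import Counter


def parse_listing_line(line):
    """Split one listing line into (size_bytes, key).

    Assumes four whitespace-separated fields: date, time, size, key.
    """
    _date, _time, size, key = line.split(None, 3)
    return int(size), key


def classify(key):
    """Return (tenant, kind) for a key shaped like
    bloom/<table>/<tenant>/<blocks|metas>/... (layout assumed from the listings).
    """
    parts = key.split("/")
    return parts[2], parts[3]


def summarize(lines):
    """Count objects per (tenant, kind), skipping blanks, '#' headers and '...'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or line == "...":
            continue
        _size, key = parse_listing_line(line)
        counts[classify(key)] += 1
    return counts
```

Running `summarize` once per bucket and once over all buckets combined gives a quick view of which tenants have `blocks` and `metas` objects, and where they ended up.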

Is it possible that the bloom filtering code doesn't know how to handle sharded buckets?

chaudum commented 3 months ago

Hi @diranged

Thanks for reporting. Wrt your question:

> Is it possible that the bloom filtering code doesn't know how to handle sharded buckets?

Very likely yes. We haven't tested with multiple buckets yet. Even though the bucket client the bloom gateway uses should support that, it's possible that it doesn't.

I will look into this.

vladst3f commented 2 months ago

> Hi @diranged
>
> Thanks for reporting. Wrt your question:
>
> > Is it possible that the bloom filtering code doesn't know how to handle sharded buckets?
>
> Very likely yes. We haven't tested with multiple buckets yet. Even though the bucket client the bloom gateway uses should support that, it's possible that it doesn't.
>
> I will look into this.

Hey @chaudum, did you get a chance to look into this? Cheers!

chaudum commented 2 months ago

@diranged Could you share your Loki configuration?

Do you see any errors, or is your assumption solely based on the filter metrics?


> However, in our production environment, where we have multiple buckets, the bloom files are spread across the buckets in a seemingly random pattern. Here is the distribution of files across the 8 buckets:

In a multi-bucket setup, the object keys are hashed and distributed across the available buckets using modulo.
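
As a rough illustration of that placement scheme, the Python sketch below hashes an object key and takes the result modulo the number of buckets. The choice of 32-bit FNV-1a, the `pick_bucket` name, and the bucket names are assumptions made for this example; the thread does not say which hash function Loki uses.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (assumed here; the real hash may differ)."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # keep the value within 32 bits
    return h


def pick_bucket(key: str, buckets: list[str]) -> str:
    """Map an object key to one bucket via hash(key) modulo bucket count."""
    if not buckets:
        raise ValueError("at least one bucket is required")
    return buckets[fnv1a_32(key.encode("utf-8")) % len(buckets)]


if __name__ == "__main__":
    # Hypothetical bucket names; the order of this list matters.
    buckets = [f"loki-chunks-{i}" for i in range(1, 9)]
    key = "bloom/tsdb_index_19920/dev/metas/0000000000000000-ffffffffffffffff-c2fafaa2.json"
    print(pick_bucket(key, buckets))
```

Under a scheme like this, objects that share a prefix land in different buckets, so a scattered listing is expected. A reader only finds an object if it hashes the same key string against the same bucket list in the same order as the writer did.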

diranged commented 2 months ago

> @diranged Could you share your Loki configuration?

Our configuration is quite large. Are there specific sections you'd like to see?

> Do you see any errors, or is your assumption solely based on the filter metrics?

No errors. The assumption is based solely on the metrics: we see no metrics reported in the sharded environment, but we do see them in our single-bucket test environment.