Open splitice opened 2 years ago
Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.
We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.
Stalebots are also emotionless and cruel and can close issues which are still very relevant.
If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.
We regularly sort for closed issues which have a `stale` label sorted by thumbs up.
We may also:
- Mark issues as `revivable` if we think it's a valid issue but isn't something we are likely to prioritize in the future (the issue will still remain closed).
- Add a `keepalive` label to silence the stalebot if the issue is very common/popular/important.
We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, our sincere apologies if you find yourself at the mercy of the stalebot.
stalebot persona non grata
`fake` is the name of the tenant when auth is not enabled; when auth is enabled, the folder name is whatever the tenant ID is.
It's an unfortunate naming choice, I'm afraid, as it confuses and annoys many folks. It's just hard to change without breaking existing installs, but we will likely change it in the next major release and follow the path Mimir took: change the default, then add a config option that people can set back to 'fake' to keep working with existing data.
The `v12` chunk schema does add more layers to the storage: the stream hash is now a folder.
This helps a lot with the per-prefix rate limits but doesn't help a human trying to do anything directly with the stored chunks.
I think it's a pretty reasonable request, and one we've talked about before, to include some date information in the path; it would enable at least some level of manual operation on the chunk data.
(I hijacked your title a little to steer discussions around doing another schema entry which has date information in it)
I'm trying to recover chunks from S3 DeepArchive and I'm finding it very hard. I have to search for chunks by modified date using the AWS CLI, and it usually takes 13 minutes with a 10-day logs database. Is there another way of doing it?
time aws s3api list-objects-v2 --bucket <bucket> --query 'Contents[?contains(LastModified, "2022-07-26")].Key' --prefix "fake/" --profile <aws_profile>
...
real 13m24,594s
user 2m8,620s
sys 0m3,483s
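For reference, the `--query` expression in the command above is a JMESPath filter that the CLI applies on the client side, so every object under `fake/` is still listed before anything is filtered out. A minimal Python sketch of the same date filter follows. The function name `keys_modified_on` is hypothetical, and the page shape is assumed to match what boto3's `list_objects_v2` paginator returns:

```python
from datetime import date, datetime, timezone
from typing import Iterable, Iterator


def keys_modified_on(pages: Iterable[dict], day: date) -> Iterator[str]:
    """Yield keys whose LastModified falls on `day` (UTC).

    `pages` is assumed to look like list_objects_v2 responses:
    {"Contents": [{"Key": str, "LastModified": datetime}, ...]}.
    """
    for page in pages:
        # An empty listing omits "Contents" entirely, hence the default.
        for obj in page.get("Contents", []):
            modified = obj["LastModified"].astimezone(timezone.utc)
            if modified.date() == day:
                yield obj["Key"]
```

With boto3, the pages could come from `boto3.client("s3").get_paginator("list_objects_v2").paginate(Bucket=..., Prefix="fake/")`. This still walks the whole prefix, so it should not be expected to beat the CLI timing; it only lets matches be handled as each page arrives.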
Is your feature request related to a problem? Please describe.
Currently, if something ever goes wrong with Loki, it's incredibly difficult to clear out old unreferenced chunks.
The folder `fake`, which contains the chunks, holds millions of entries and is nearly impossible to work with using the S3 APIs.
Describe the solution you'd like
I'd like the option to specify a folder (instead of `fake`), and for that path to be a date-compatible formatting string.
A date-compatible formatting string would allow for simple cleanup after a given retention period elapses.
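To illustrate the request, here is a rough sketch of how a date-compatible path template could be expanded and then used for retention cleanup. Nothing here is an existing Loki option: the template syntax (plain `strftime` codes), the `loki/` prefix, and the function names `chunk_prefix` and `expired_prefixes` are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def chunk_prefix(template: str, when: datetime) -> str:
    """Expand a strftime-style template into an object-store prefix (UTC)."""
    return when.astimezone(timezone.utc).strftime(template)


def expired_prefixes(template: str, now: datetime,
                     retention_days: int, lookback_days: int) -> List[str]:
    """Prefixes for the `lookback_days` days just before the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    # Start one day before the cutoff: the cutoff day itself may still
    # contain data that is inside the retention window.
    days = (cutoff - timedelta(days=n) for n in range(1, lookback_days + 1))
    # dict.fromkeys de-duplicates while keeping order, which matters for
    # coarse templates such as "%Y/%m/" where many days share one prefix.
    return list(dict.fromkeys(chunk_prefix(template, d) for d in days))
```

Each returned prefix could then be removed with a single prefix-scoped listing or a lifecycle rule, instead of scanning every object. Which chunk timestamp would drive the path (for example, for a chunk spanning midnight) is left open here.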
Describe alternatives you've considered
Listing all 10M+ files in the folder, deleting by last modification date.
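For comparison, the alternative above can be sketched as follows. This is only an illustration: the helper name `delete_batches` is hypothetical, and the input is assumed to be the object entries from an S3 listing, each with a `Key` and a timezone-aware `LastModified`. The batch size of 1000 matches the maximum number of keys S3 accepts in one DeleteObjects request.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator, List


def delete_batches(objects: Iterable[dict], cutoff: datetime,
                   batch_size: int = 1000) -> Iterator[List[dict]]:
    """Group keys last modified before `cutoff` into delete-sized batches."""
    batch: List[dict] = []
    for obj in objects:
        if obj["LastModified"] < cutoff:
            batch.append({"Key": obj["Key"]})
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Emit whatever is left over after the last full batch.
    if batch:
        yield batch
```

With boto3, each batch could be passed as `Delete={"Objects": batch}` to `delete_objects`. The full listing is still required, which is the cost this issue is about, and last-modified time is only a proxy for whether a chunk is still referenced.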