libeilin opened this issue 5 years ago
This will not work out of the box, as some other logic would need to change. We can leave this open to see whether it is popular or not.
OK, thanks for your reply. We have encountered some problems when compiling it ourselves, so we are looking for your help here.
If this feature is implemented, I hope you can release it as soon as possible. At present, one day's data volume is too large, and ES query speed cannot keep up with it.
@openzipkin/elasticsearch any interest on this?
You could have an alias 2019-01-01 pointing to 2019-01-01-00, and you switch that every hour. As long as the alias points to a single index you can write to it; pointing to multiple indices makes it read-only.

@xeraa do you know which version rollover indexes were added in? I agree the core issue here is size.
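The hourly alias switch described above can be sketched as a small helper that builds the `_aliases` request body (a minimal sketch; the `2019-01-01-HH` hourly index naming and posting the result to Elasticsearch's `_aliases` endpoint are assumptions, not something decided in this thread):

```python
from datetime import datetime, timezone

def hourly_alias_swap(day_alias: str, now: datetime) -> dict:
    """Build an `_aliases` body that moves the write alias (e.g.
    `2019-01-01`) from the previous hourly index to the current one,
    so the alias always points at exactly one writable index."""
    current = f"{day_alias}-{now:%H}"
    previous = f"{day_alias}-{(now.hour - 1) % 24:02d}"
    return {
        "actions": [
            {"remove": {"index": previous, "alias": day_alias}},
            {"add": {"index": current, "alias": day_alias}},
        ]
    }

# e.g. at 05:00, swap the alias from ...-04 to ...-05
body = hourly_alias_swap("2019-01-01", datetime(2019, 1, 1, 5, tzinfo=timezone.utc))
```

Both actions run in a single `_aliases` call, which Elasticsearch applies atomically, so writers never observe the alias pointing at zero or two indices.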
@adriancole 6.6 (the current version): https://www.elastic.co/guide/en/elasticsearch/reference/6.6/index-lifecycle-management.html
You can fully manage it through the Elasticsearch API, but Kibana also provides a UI for it. And as I said: not open source, but free to use (Basic license).
@libeilin before we experiment with a non-OSS feature, can you comment on whether rollover indexing is desirable? Maintaining features has a cost, especially so with non-OSS distributions (as it affects how we do testing), so we want to make sure there is user buy-in.
It is also possible for us to explore hourly indexes regardless
email related to this thread on our dev list https://lists.apache.org/thread.html/73c2efa69e3ff0a519c6b6c2f5e551159c34902c29df01b2703e9126@%3Cdev.zipkin.apache.org%3E
There's always Elastic Curator if you want to use rollover but are on OSS Elasticsearch (no Basic license). Curator itself is open source and requires no license.
@untergeek thanks for the pointer. I think you are pointing to this specifically right? https://www.elastic.co/guide/en/elasticsearch/client/curator/5.6/ex_rollover.html
To elaborate this approach, we'd need some more details about what it would take in practice: the Curator config vs. the index template config, any extra processes Curator needs to run, and what, if anything, the aliasing implies when we do reads or writes. I wonder if someone has this setup with a Zipkin site already (or anything that uses daily indexes and rollover with no client call changes needed).
We recently started using Zipkin for OpenTracing. Our company also has a requirement for monthly or weekly Zipkin indexes. It would be great if you could add this support.
Just as an idea: Maybe this is going a bit too deep down the rabbit hole for one datastore and it would make more sense to leave that part to Curator or ILM (by documenting the right configurations to be used)? There are various use cases about time based index patterns, rollover, deletion of data,... that are kind of solved externally already.
Yes, curator is how people handle this today, and many can't store months of trace data either :P We currently mention to use curator for index management, but possibly someone can come up with an example https://github.com/apache/incubator-zipkin/blob/8e4ada890c1b4f0f21babaf1a2315af128aeb4f4/zipkin-storage/elasticsearch/README.md#indexes
@singhabhinav03 could you elaborate on what you're trying to achieve that you cannot currently? The original request is to be able to have finer-grain indexes than daily because the data volume in one day is too large. Weekly or monthly indexes are only likely usable with relatively small amounts of tracing data.
I think this issue got stuck as we were worried about how to address varied granularity. @narayaruna opened #2767 which doesn't imply varied granularity.
If we limit this to hourly indexes, anyone can still use Curator or similar to rescale these to daily, weekly, monthly... correct? cc @openzipkin/elasticsearch
Not sure I'm reading this correctly, but combining hourly indices into a daily one (merging 24 indices) isn't easily possible; that would require a reindex (where you use a script to change the `_index` field).
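For reference, the reindex-with-script approach mentioned above could look like this (hypothetical request bodies for `POST _reindex`; the `zipkin-YYYY-MM-DD-HH` naming is an assumption):

```python
def merge_day(day: str) -> dict:
    """_reindex body folding one day's hourly indices (zipkin-DAY-*)
    into a single daily index (zipkin-DAY)."""
    return {
        "source": {"index": f"zipkin-{day}-*"},
        "dest": {"index": f"zipkin-{day}"},
    }

# Multi-day variant: rewrite each document's destination index in a
# painless script by stripping the trailing "-HH" (3 characters):
MERGE_ALL = {
    "source": {"index": "zipkin-*"},
    "dest": {"index": "unused"},  # overridden per document by the script
    "script": {
        "lang": "painless",
        "source": "ctx._index = ctx._index.substring(0, ctx._index.length() - 3)",
    },
}
```

Either way this is a full copy of the data, not a cheap metadata operation, which is why it "isn't easily possible" at trace volumes.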
My concern with hourly indices is that this will be a lot of shards. Just using 1 primary and 1 replica, you'll end up with 48 shards for a single day. Our recommendation is to have fewer than 20 shards per GB of heap, and each shard should be around 10 to 50GB in size. I can see how this works out for some heavy users, but it will be a bad choice for many others.
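To make the shard-count concern concrete, a bit of arithmetic using the numbers above (the 7-day retention is an arbitrary example, not from the thread):

```python
def shards_per_day(indices_per_day: int, primaries: int, replicas: int) -> int:
    """Total shard copies created per day: each index contributes its
    primaries plus replica copies of each primary."""
    return indices_per_day * primaries * (1 + replicas)

hourly = shards_per_day(24, 1, 1)  # 48, as noted above
daily = shards_per_day(1, 1, 1)    # 2

# With the rule of thumb of fewer than 20 shards per GB of heap, a
# hypothetical 7-day retention with hourly indices needs at least:
min_heap_gb = hourly * 7 / 20  # 16.8 GB of heap just for shard overhead
```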
IMO a combination of rollover and write index alias would be the more generic solution that gives users fewer chances for bad configurations.
Do you have like a sample app where I could add the right config to show how this works? Might be easier than discussing it.
@xeraa so I think the concern from @narayaruna is that with TB-scale indexes, search, even with our cherry-picked indexing, requires bumping read timeouts to 60s... so it is more about the query side than the write side, iiuc.
So the thinking is: for data sets that naturally fit the heap-per-shard guidance at hourly granularity or finer, putting that data in hourly indexes should make more sense than daily. The query side could be better optimized with this too, as instead of requesting a day index for a search, it could request an hourly one, without any special features...
Am I missing something? (PS thanks for mentioning where hourly does not make sense! Possibly we can add a check at discovery time to warn if the config doesn't make sense.)
Yes, if you are looking at a short timeframe (like 1h). I'm not sure what the common access pattern is to be honest.
On the other hand, if you have a filter on the timeframe and access it frequently enough, then that will be cached and should be pretty fast as well. I couldn't say how much of a win to expect (it depends on many factors, including the access pattern: timeframe and frequency).
Literally, the default lookback is 1 hour, and currently the query will grab a day's index, or possibly 2 if just past midnight. This is probably why Nara mentions this, as hourly indexes lower the blast radius of the default query to at most 2 hours of data if just past the hour.
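That difference can be illustrated by computing which indices a lookback window touches under each scheme (a sketch; the index-name patterns and truncation logic are illustrative, not Zipkin's actual code):

```python
from datetime import datetime, timedelta, timezone

def indices_touched(end: datetime, lookback: timedelta, hourly: bool) -> list[str]:
    """Names of the zipkin-* indices a [end - lookback, end] query
    spans, for daily vs hypothetical hourly naming."""
    fmt = "zipkin-%Y-%m-%d-%H" if hourly else "zipkin-%Y-%m-%d"
    step = timedelta(hours=1) if hourly else timedelta(days=1)
    # truncate the window start to the beginning of its bucket
    t = (end - lookback).replace(minute=0, second=0, microsecond=0)
    if not hourly:
        t = t.replace(hour=0)
    names = []
    while t <= end:
        names.append(t.strftime(fmt))
        t += step
    return sorted(set(names))
```

For a 1-hour lookback at 10:30, the daily scheme reads a whole day's index while the hourly scheme reads only the 09 and 10 indexes; just past midnight, both schemes touch two indexes, but the hourly ones cover 2 hours of data instead of 2 days.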
At any rate, we could put a branch up and see how it goes. If it isn't helpful we wouldn't do it, but for some sites this could be an easy-to-reason-about, low-tech option to speed some things up.
Ack on the reindexing thing if someone needs to re-scale data. We can put more notes in the readme with the knowledge gained here, regardless of whether the change is implemented.
PS I opened this because I think I was the one who came up with the hour search default :) https://github.com/openzipkin/zipkin/issues/2772
Sounds good on trying it out on a branch.
On the re-scaling: rather than reindexing indices together, you could have an index template with 3 primary shards (just as an example, for spreading the ingestion over 3 nodes), but once the index is read-only you could shrink it down to a single primary shard. That should be the better pattern: more parallelization at first, then reducing the number of shards later on. And this is just a question of the index template plus Elastic Curator / ILM / etc.; it would probably just need a little documentation on the Zipkin side.
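The shrink pattern described here maps to two Elasticsearch calls: block writes on the source, then shrink it into a new single-primary index. A sketch returning (method, path, body) tuples rather than performing HTTP calls (the index names are hypothetical):

```python
def shrink_steps(source: str, target: str) -> list[tuple[str, str, dict]]:
    """The two-step pattern above: mark the index read-only, then
    shrink it down to a single primary shard under a new name."""
    return [
        # 1) make the source index read-only
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # 2) shrink into the target with one primary shard
        ("POST", f"/{source}/_shrink/{target}",
         {"settings": {"index.number_of_shards": 1,
                       "index.number_of_replicas": 1}}),
    ]

steps = shrink_steps("zipkin-2019-01-01", "zipkin-2019-01-01-shrunk")
```

Unlike reindexing, shrink hard-links segment files where it can, so it is far cheaper than copying the data.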
We too are facing a similar issue. Our daily indices are growing to trillions of spans per day, resulting in slow queries. @libeilin @codefromthecrypt were you able to figure out any workaround for this? We are stuck here with our ES queries timing out.
Was anyone able to find a workaround for this? Are there any MRs that support hourly indices? Any help on the above would be really appreciated.
If Zipkin can write to an alias (without any date math), then you could set that up with ILM (https://www.elastic.co/guide/en/elasticsearch/reference/current/overview-index-lifecycle-management.html) in the background. That this was part of Zipkin is probably more for historic reasons, from when Elasticsearch lacked any such features, but things have luckily changed by now.
+1
I'm also very interested in such a feature! When storing lots of traces with Zipkin, due to the size of the index, using ILM features such as shrink or force merge becomes problematic. If we could write to an alias instead, we could automatically roll over the indices every 50GB, seriously reducing the batch size when shrinking and force merging.
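A minimal ILM policy along those lines might look like this (a sketch as a Python dict of the JSON policy; the 50gb rollover matches the comment above, while the 7d retention is an arbitrary example):

```python
def rollover_policy(max_size: str = "50gb", delete_after: str = "7d") -> dict:
    """An ILM policy that rolls the write index over at `max_size`
    and deletes indices `delete_after` after rollover."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {"rollover": {"max_size": max_size}},
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }

policy = rollover_policy()
```

With this in place, Zipkin would only need to write to a fixed alias; ILM handles creating, rolling, and deleting the backing indices.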