apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Improve memory protection in multi stage query engine #13436

Open gortiz opened 4 months ago

gortiz commented 4 months ago

This is a meta-issue to track memory protection mechanisms in the multi-stage query engine. Specifically, the following mechanisms are included in the single-stage query engine but are not present in the multi-stage one:

hpvd commented 4 months ago

I like this kind of parent issue - it's perfect for keeping an overview :-)

Just an idea: if we use a syntax like this, we get a nice tasklist with checkboxes and can see the progress without even opening the issue

- [ ] OOM protection, killing heavily allocating queries when running out of heap memory, see #13433
- [ ] Default limits, add a default limit if no limit is included, see #13434
- [ ] Approximate grouping algorithm, see #13435

Maybe someone from the team / @Jackie-Jiang could add a label "Parent-Issue" so we could even filter for this kind of issue...

inspired by https://github.com/apache/pinot/issues/13010

gortiz commented 4 months ago

I thought about using that syntax, but it looks repetitive. Once the referenced issue is closed, it will be shown in purple instead of green (as it is right now).

I've changed the order in each point so that the ticket is now listed first, which makes it easier to see whether the ticket is still open.

hpvd commented 4 months ago

Yep, understood. Thanks for the changes. Just FYI, since I'm lazy: you only have to type the first point plus the issue number; after that, every Return starts a new point...


- [ ] #13433
- [ ]

gortiz commented 4 months ago

I know, but someone will need to update each checkbox each time one of the referenced tickets is closed, right? AFAIK they are not going to be checked automatically.

If someone has to keep updating this ticket whenever a referenced one is closed, we may forget to do so and end up with something like:

So using a checklist introduces redundancy that is not very useful, at the cost of a possible future inconsistency, which is problematic.