robusta-dev / krr

Prometheus-based Kubernetes Resource Recommendations
MIT License

Can We Add Low(Prometheus) Resource Mode ? #332

Closed EsqerYasen closed 1 month ago

EsqerYasen commented 2 months ago

Is your feature request related to a problem? Please describe. I have many clusters. One of them has 1000 pods. Every time I run KRR, it crashes Prometheus. This happens in only that one cluster.

Describe the solution you'd like. The crash happens because the PromQL query covers all pods of a namespace in a single command. Can we split these queries, add a rate limit, or calculate one pod at a time?

Describe alternatives you've considered. There could be an option to run the calculation in a slow or low-resource mode.

Are you interested in contributing a PR for this?

Additional context

  File "/app/kubernetes/krr/robusta_krr/core/integrations/prometheus/metrics/base.py", line 133, in _query_prometheus_sync
    raise ValueError(f"Failed to run query: {data.query}") from e
ValueError: Failed to run query:
    count_over_time(
        max(
            container_cpu_usage_seconds_total{
                namespace="production_namespace",
                pod=~"there is have so many pods ",
                container="some container"
            }
        ) by (container, pod, job)
        [14d:75s]
    )

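The "split these queries" idea can be sketched as follows. This is only an illustration, not KRR code: `chunk_pods` and `build_batched_query` are hypothetical helpers, the query shape is copied from the traceback above, and the sketch assumes pod names contain no regex metacharacters that need escaping. Whether smaller batches actually lower the load on Prometheus would still have to be measured.

```python
from typing import Iterator, List


def chunk_pods(pods: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive batches of at most `batch_size` pod names."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(pods), batch_size):
        yield pods[start:start + batch_size]


def build_batched_query(namespace: str, container: str, pods: List[str]) -> str:
    """Build one query for a single batch of pods.

    The query shape mirrors the one in the traceback; the pod names are
    joined into a regex alternation (assumes no escaping is needed).
    """
    pod_regex = "|".join(pods)
    return (
        "count_over_time("
        "max("
        "container_cpu_usage_seconds_total{"
        f'namespace="{namespace}", pod=~"{pod_regex}", container="{container}"'
        "}"
        ") by (container, pod, job)"
        "[14d:75s]"
        ")"
    )


def batched_queries(
    namespace: str, container: str, pods: List[str], batch_size: int
) -> List[str]:
    """Return one query string per batch instead of one query for all pods."""
    return [
        build_batched_query(namespace, container, batch)
        for batch in chunk_pods(pods, batch_size)
    ]
```

For 1000 pods and a batch size of 50, this would produce 20 smaller queries that could be sent one after another.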
EsqerYasen commented 2 months ago

It works (no crash happens) when I add this:

            # regular query, lighter on performance
            try:
                response = self.prometheus.safe_custom_query(query=data.query)
                # pause for 10 seconds after each query to throttle the load on Prometheus
                # (requires `import time` at the top of base.py)
                time.sleep(10)
            except Exception as e:
                raise ValueError(f"Failed to run query: {data.query}") from e
            results = response["result"]
            # format the results to return the same format as custom_query_range
            for result in results:
                result["values"] = [result.pop("value")]
            return results

to base.py at line 133. Not elegant, but useful.
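A slightly less blunt variant of the fixed sleep would be to enforce a minimum interval between query starts, so that a query which already took a while is not delayed by the full 10 seconds again. The sketch below is an assumption-laden illustration, not part of KRR: `MinIntervalThrottle` is a hypothetical helper, and the clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_start: Optional[float] = None

    def wait(self) -> float:
        """Block until the next query may start; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_start is not None:
            elapsed = now - self._last_start
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        # Record when the upcoming query is allowed to start.
        self._last_start = now + delay
        return delay
```

In the workaround above, one would create a single throttle (for example with `min_interval=10`) and call `throttle.wait()` right before each query instead of calling `time.sleep(10)` after it.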

aantn commented 2 months ago

Hmm, thanks for the update. Does the --max-workers flag help here?

EsqerYasen commented 2 months ago

Hmm, thanks for the update. Does the --max-workers flag help here?

I will try it tomorrow and post an update. Thanks.

aantn commented 2 months ago

Thanks, let me know!



EsqerYasen commented 1 month ago

Hmm, thanks for the update. Does the --max-workers flag help here?

Hi, this works! Thanks. At first I didn't see this flag in the documentation. This is very helpful. Thank you for your great work!

EsqerYasen commented 1 month ago

krr simple --max-workers 2 -p http://127.0.0.1:9090/prometheus
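As a rough illustration of why a lower worker count can help, the sketch below caps how many tasks run at the same time with a thread pool. It does not show how KRR implements `--max-workers`; `run_with_limit` and `ConcurrencyProbe` are hypothetical names, and the probe stands in for a Prometheus query only to record peak concurrency.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def run_with_limit(tasks: Sequence[Callable[[], bool]], max_workers: int) -> List[bool]:
    """Run all tasks, with at most `max_workers` of them executing at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda task: task(), tasks))


class ConcurrencyProbe:
    """Stand-in for a query; records the highest number of simultaneous calls."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __call__(self) -> bool:
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        time.sleep(0.01)  # simulate the time a query takes
        with self._lock:
            self._active -= 1
        return True
```

With `max_workers=2`, as in the command above, no more than two of these simulated queries are in flight at any moment, no matter how many tasks are queued.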