kubernetes-sigs / kube-scheduler-simulator

The simulator for the Kubernetes scheduler
Apache License 2.0
791 stars 135 forks

Issues with the integration of custom plugins into the simulator #311

Closed MichBenedetti closed 7 months ago

MichBenedetti commented 1 year ago

Good morning, I have a problem adding a custom plugin to the simulator configuration. To integrate the plugin, I followed the instructions in the documentation at https://github.com/kubernetes-sigs/kube-scheduler-simulator/blob/master/simulator/docs/custom-plugin.md.

To determine whether the issue was with my custom plugin or something else, I tried integrating the sample plugin as instructed, but it still throws an error. Specifically, when I add the plugin's name to the YAML file and click Apply, a generic 500 error occurs. According to the server logs, the scheduler cannot find a plugin named "NodeNumberWrapped", as seen in the screenshot and log output below.

What could be the problem? Is there any additional configuration that needs to be added beyond what is mentioned in the guide?

[screenshot]

simulator-server    | I0717 15:14:33.621395       1 scheduler.go:190] shutdown scheduler...
simulator-server    | E0717 15:14:33.621558       1 scheduling_queue.go:1065] "Error while retrieving next pod from scheduling queue" err="scheduling queue is closed"
simulator-server    | I0717 15:14:33.621790       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginPrioritySort
simulator-server    | I0717 15:14:33.621854       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeUnschedulable
simulator-server    | I0717 15:14:33.621890       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeName
simulator-server    | I0717 15:14:33.621899       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginTaintToleration
simulator-server    | I0717 15:14:33.621917       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeAffinity
simulator-server    | I0717 15:14:33.621927       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodePorts
simulator-server    | I0717 15:14:33.621932       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeResourcesFit
simulator-server    | I0717 15:14:33.621940       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginVolumeRestrictions
simulator-server    | I0717 15:14:33.621948       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginEBSLimits
simulator-server    | I0717 15:14:33.621961       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginGCEPDLimits
simulator-server    | I0717 15:14:33.621997       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeVolumeLimits
simulator-server    | I0717 15:14:33.622004       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginAzureDiskLimits
simulator-server    | I0717 15:14:33.622008       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginVolumeBinding
simulator-server    | I0717 15:14:33.622036       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginVolumeZone
simulator-server    | I0717 15:14:33.622049       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginPodTopologySpread
simulator-server    | I0717 15:14:33.622059       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginInterPodAffinity
simulator-server    | I0717 15:14:33.622086       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginDefaultPreemption
simulator-server    | I0717 15:14:33.622095       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeResourcesBalancedAllocation
simulator-server    | I0717 15:14:33.622100       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginImageLocality
simulator-server    | I0717 15:14:33.622137       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginDefaultBinder
simulator-server    | I0717 15:14:33.622508       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="PrioritySort"
simulator-server    | I0717 15:14:33.622551       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeUnschedulable"
simulator-server    | I0717 15:14:33.622579       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeName"
simulator-server    | I0717 15:14:33.622605       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="TaintToleration"
simulator-server    | I0717 15:14:33.622615       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeAffinity"
simulator-server    | I0717 15:14:33.622619       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodePorts"
simulator-server    | I0717 15:14:33.622646       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeResourcesFit"
simulator-server    | I0717 15:14:33.622657       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="VolumeRestrictions"
simulator-server    | I0717 15:14:33.622688       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="EBSLimits"
simulator-server    | I0717 15:14:33.622722       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="GCEPDLimits"
simulator-server    | I0717 15:14:33.622756       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeVolumeLimits"
simulator-server    | I0717 15:14:33.622799       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="AzureDiskLimits"
simulator-server    | I0717 15:14:33.622835       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="VolumeBinding"
simulator-server    | I0717 15:14:33.622871       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="VolumeZone"
simulator-server    | I0717 15:14:33.622912       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="PodTopologySpread"
simulator-server    | I0717 15:14:33.622924       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="InterPodAffinity"
simulator-server    | I0717 15:14:33.622935       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="DefaultPreemption"
simulator-server    | I0717 15:14:33.622946       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeResourcesBalancedAllocation"
simulator-server    | I0717 15:14:33.622983       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="ImageLocality"
simulator-server    | I0717 15:14:33.623023       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="DefaultBinder"
simulator-server    | I0717 15:14:33.724296       1 scheduler.go:79] failed to start scheduler: create scheduler: initializing profiles: creating profile for scheduler name default-scheduler: PreFilterPlugin "NodeNumberWrapped" does not exist. restarting with old configuration
simulator-server    | I0717 15:14:33.724456       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginPrioritySort
simulator-server    | I0717 15:14:33.724486       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeUnschedulable
simulator-server    | I0717 15:14:33.724492       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeName
simulator-server    | I0717 15:14:33.724497       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginTaintToleration
simulator-server    | I0717 15:14:33.724501       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeAffinity
simulator-server    | I0717 15:14:33.724506       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodePorts
simulator-server    | I0717 15:14:33.724510       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeResourcesFit
simulator-server    | I0717 15:14:33.724548       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginVolumeRestrictions
simulator-server    | I0717 15:14:33.724578       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginEBSLimits
simulator-server    | I0717 15:14:33.724585       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginGCEPDLimits
simulator-server    | I0717 15:14:33.724589       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeVolumeLimits
simulator-server    | I0717 15:14:33.724592       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginAzureDiskLimits
simulator-server    | I0717 15:14:33.724595       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginVolumeBinding
simulator-server    | I0717 15:14:33.724598       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginVolumeZone
simulator-server    | I0717 15:14:33.724602       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginPodTopologySpread
simulator-server    | I0717 15:14:33.724606       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginInterPodAffinity
simulator-server    | I0717 15:14:33.724609       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginDefaultPreemption
simulator-server    | I0717 15:14:33.724617       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginNodeResourcesBalancedAllocation
simulator-server    | I0717 15:14:33.724622       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginImageLocality
simulator-server    | I0717 15:14:33.724626       1 plugins.go:266] Default plugin is explicitly re-configured; overridingpluginDefaultBinder
simulator-server    | I0717 15:14:33.724782       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="PrioritySort"
simulator-server    | I0717 15:14:33.724813       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeUnschedulable"
simulator-server    | I0717 15:14:33.724821       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeName"
simulator-server    | I0717 15:14:33.724826       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="TaintToleration"
simulator-server    | I0717 15:14:33.724830       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeAffinity"
simulator-server    | I0717 15:14:33.724838       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodePorts"
simulator-server    | I0717 15:14:33.724843       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeResourcesFit"
simulator-server    | I0717 15:14:33.724847       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="VolumeRestrictions"
simulator-server    | I0717 15:14:33.724852       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="EBSLimits"
simulator-server    | I0717 15:14:33.724856       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="GCEPDLimits"
simulator-server    | I0717 15:14:33.724860       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeVolumeLimits"
simulator-server    | I0717 15:14:33.724869       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="AzureDiskLimits"
simulator-server    | I0717 15:14:33.724896       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="VolumeBinding"
simulator-server    | I0717 15:14:33.724903       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="VolumeZone"
simulator-server    | I0717 15:14:33.724930       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="PodTopologySpread"
simulator-server    | I0717 15:14:33.724937       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="InterPodAffinity"
simulator-server    | I0717 15:14:33.724941       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="DefaultPreemption"
simulator-server    | I0717 15:14:33.724946       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="NodeResourcesBalancedAllocation"
simulator-server    | I0717 15:14:33.724955       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="ImageLocality"
simulator-server    | I0717 15:14:33.724960       1 default_plugins.go:144] "Default plugin is explicitly re-configured; overriding" plugin="DefaultBinder"
simulator-server    | E0717 15:14:33.927962       1 schedulerconfig.go:55] failed to restart scheduler: start scheduler:
simulator-server    |     sigs.k8s.io/kube-scheduler-simulator/simulator/scheduler.(*Service).RestartScheduler
simulator-server    |         /go/src/simulator/scheduler/scheduler.go:84
simulator-server    |   - create scheduler:
simulator-server    |     sigs.k8s.io/kube-scheduler-simulator/simulator/scheduler.(*Service).StartScheduler
simulator-server    |         /go/src/simulator/scheduler/scheduler.go:171
simulator-server    |   - initializing profiles: creating profile for scheduler name default-scheduler: PreFilterPlugin "NodeNumberWrapped" does not exist
simulator-server    | {"time":"2023-07-17T15:14:33.928107228Z","id":"","remote_ip":"172.20.0.1","host":"localhost:1212","method":"POST","uri":"/api/v1/schedulerconfiguration","user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36","status":500,"error":"code=500, message=Internal Server Error","latency":306893144,"latency_human":"306.893144ms","bytes_in":2595,"bytes_out":36}
simulator-server    | W0717 15:14:52.314923       1 warnings.go:70] flowcontrol.apiserver.k8s.io/v1beta3 PriorityLevelConfiguration is deprecated in v1.29+, unavailable in v1.32+
simulator-etcd      | 2023-07-17 15:15:00.484036 I | mvcc: store.index: compact 14191
simulator-etcd      | 2023-07-17 15:15:00.484502 I | mvcc: finished scheduled compaction at 14191 (took 202.827µs)
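
Most of the log above is repeated "Default plugin is explicitly re-configured" noise. The relevant lines are the ones ending in `does not exist`. As a minimal sketch (the helper name `findMissingPlugins` and the regular expression are illustrative assumptions, not part of the simulator), the failing extension point and plugin name could be pulled out of such a log like this:

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// missingPlugin matches scheduler errors of the form seen in the log above, e.g.
//   PreFilterPlugin "NodeNumberWrapped" does not exist
// Capture 1 is the extension point, capture 2 is the plugin name.
var missingPlugin = regexp.MustCompile(`(\w+)Plugin "([^"]+)" does not exist`)

// findMissingPlugins scans log text line by line and returns each distinct
// "extensionPoint/pluginName" pair that the scheduler reported as missing.
// This is a hypothetical helper written for this thread.
func findMissingPlugins(logs string) []string {
	seen := map[string]bool{}
	var out []string
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		m := missingPlugin.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // noise line, e.g. "Default plugin is explicitly re-configured"
		}
		key := m[1] + "/" + m[2]
		if !seen[key] {
			seen[key] = true
			out = append(out, key)
		}
	}
	return out
}

func main() {
	logs := strings.Join([]string{
		`plugins.go:266] Default plugin is explicitly re-configured; overridingpluginPrioritySort`,
		`scheduler.go:79] failed to start scheduler: create scheduler: initializing profiles: creating profile for scheduler name default-scheduler: PreFilterPlugin "NodeNumberWrapped" does not exist. restarting with old configuration`,
	}, "\n")
	fmt.Println(findMissingPlugins(logs))
}
```

The error is reported twice in the log (once on start, once on restart), which is why the sketch de-duplicates the matches.
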
sanposhiho commented 1 year ago

/kind bug

sanposhiho commented 1 year ago

/area simulator /assign

k8s-triage-robot commented 9 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

tmishina commented 9 months ago

I faced the same issue and am now unable to test my custom plugin. Is there any plan to fix it, or any hint or workaround to avoid it? Thanks!

sanposhiho commented 9 months ago

/remove-lifecycle stale /priority next-release

Sorry for leaving this open for so long. I'll take time to investigate it before our next patch release.

sanposhiho commented 9 months ago

https://github.com/kubernetes-sigs/kube-scheduler-simulator/pull/332 should fix this issue. As the PR description says, the core problem is that we didn't document one required step involving registeredOutOfTreeMultiPointName. The PR will eliminate that undocumented step. While waiting for the PR to get merged, you can work around the issue by registering your custom plugin's name in registeredOutOfTreeMultiPointName, like this:

    outOfTreeRegistries = runtime.Registry{
        nodenumber.Name: nodenumber.New,
    }

    registeredOutOfTreeMultiPointName = []string{
        nodenumber.Name,
    }
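
To illustrate why both variables matter, here is a minimal, self-contained sketch. It does not use the simulator's real types: `pluginFactory`, `wrappedName`, and `missingFromMultiPoint` are hypothetical names, and the "Wrapped" suffix is only inferred from the `"NodeNumberWrapped" does not exist` error in the log above. The idea is simply to check that every plugin in the out-of-tree registry also appears in the multi-point name list:

```go
package main

import (
	"fmt"
	"sort"
)

// pluginFactory is a stand-in for the scheduler framework's plugin factory
// type; it exists only so this sketch compiles on its own.
type pluginFactory func() string

// wrappedSuffix is an assumption inferred from the error message
// `PreFilterPlugin "NodeNumberWrapped" does not exist`.
const wrappedSuffix = "Wrapped"

// wrappedName returns the name the scheduler appears to look up for a plugin.
func wrappedName(name string) string { return name + wrappedSuffix }

// missingFromMultiPoint returns, in sorted order, the names that are present
// in the registry but absent from the multi-point name list.
func missingFromMultiPoint(registry map[string]pluginFactory, multiPoint []string) []string {
	listed := make(map[string]bool, len(multiPoint))
	for _, n := range multiPoint {
		listed[n] = true
	}
	var missing []string
	for n := range registry {
		if !listed[n] {
			missing = append(missing, n)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	registry := map[string]pluginFactory{
		"NodeNumber": func() string { return "NodeNumber" },
	}
	// Forgetting the second registration reproduces the situation in this issue.
	for _, n := range missingFromMultiPoint(registry, nil) {
		fmt.Printf("%s is registered but not listed; lookups of %q may fail\n", n, wrappedName(n))
	}
}
```

A check like this could run in a unit test next to plugin.go until the PR removes the need for the second registration.
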
tmishina commented 9 months ago

@sanposhiho Thank you so much for your comment, now NodeNumber plugin works as expected. I will try again with my own custom plugin.

yz2001zzx commented 8 months ago

@tmishina How was this issue fixed? For the sample NodeNumber plugin, I edited /kube-scheduler-simulator/simulator/scheduler/config/plugin.go as suggested above, like this:

var (
    outOfTreeRegistries = runtime.Registry{
        nodenumber.Name: nodenumber.New,
    }

    registeredOutOfTreeMultiPointName = []string{
        nodenumber.Name,
    }
)

I still got the same status code 500 error after I added the NodeNumber plugin in the KubeSchedulerConfiguration via the GUI.

[screenshot]

Do we have to move /kube-scheduler-simulator/simulator/docs/sample/nodenumber somewhere else?

If possible, please give some guidance with more details on how to integrate our custom plugin.

When I was trying to integrate my own custom plugin, I got the same error, so I tried the sample NodeNumber plugin but the result was the same even after adding nodenumber.Name inside registeredOutOfTreeMultiPointName = []string{}.

What I did for the NodeNumber plugin:

  1. Modified /kube-scheduler-simulator/simulator/scheduler/config/plugin.go as suggested above;

  2. Added the NodeNumber plugin to the KubeSchedulerConfiguration via the GUI, following the instructions in https://github.com/kubernetes-sigs/kube-scheduler-simulator/blob/master/simulator/docs/custom-plugin.md

Any other tricks or missing step(s)? @sanposhiho

Much appreciated.

tmishina commented 8 months ago

@yz2001zzx Have you specified the scheduler config in simulator/config.yaml?

# The path to a KubeSchedulerConfiguration file.
# If passed, the simulator will start the scheduler
# with that configuration. Or, if you use web UI,
# you can change the configuration from the web UI as well.
kubeSchedulerConfigPath: "docs/sample/debuggable-scheduler/scheduler.yaml"

kubeSchedulerConfigPath should be either an absolute path or a path relative to simulator/config.yaml.

a-c-dream commented 7 months ago

@yz2001zzx Did you solve your problem? I have the same problem. I also modified /kube-scheduler-simulator/simulator/scheduler/config/plugin.go and simulator/config.yaml as @tmishina said, but when I add NodeNumber in the web UI I still get error code 500. Is it possible that I have to rebuild after modifying these files? When I rebuild, I get an error (screenshot attached). @sanposhiho Can you offer some solutions?

sanposhiho commented 7 months ago

@a-c-dream Could you open another issue with the repro steps?