Closed · sladey closed this 8 months ago
I have simplified the setup by returning to the naive load forecaster and two small deferrable loads. After 10-20 runs of the optimizer it seems to stop returning (I assume from the solver), and then queuing starts and CPU usage increases.
We need to understand why you are getting infeasible runs. That just means there are too many constraints for a solution to exist. Do you have a battery? What are your settings for treat_def_as_semi_cont and set_def_constant?
Hey David, here is my config and the MPC call with its runtime params.
```yaml
hass_url: empty
long_lived_token: empty
costfun: self-consumption
logging_level: DEBUG
optimization_time_step: 30
historic_days_to_retrieve: 5
method_ts_round: first
set_total_pv_sell: false
lp_solver: COIN_CMD
lp_solver_path: /usr/bin/cbc
set_nocharge_from_grid: false
set_nodischarge_to_grid: false
set_battery_dynamic: false
battery_dynamic_max: 0.9
battery_dynamic_min: -0.9
load_forecast_method: naive
sensor_power_photovoltaics: sensor.powerwall_solar_now_watts
sensor_power_load_no_var_loads: sensor.load_with_no_var_loads
number_of_deferrable_loads: 2
list_nominal_power_of_deferrable_loads:
```
Here is the MPC call. The load price forecast is just fixed values that follow my TOU tariff.
```
{
  "load_cost_forecast": {{(state_attr('input_text.load_cost_tariff', 'rates'))}},
  "pv_power_forecast": {{ pv | tojson }},
  "prediction_horizon": 48,
  "alpha": {{states('sensor.emhass_dynamic_alpha')}},
  "beta": {{states('sensor.emhass_dynamic_beta')}},
  "soc_init": {{states('sensor.powerwall_charge')|float(0)/100}},
  "soc_final": 0.15,
  "num_def_loads": 2,
  "P_deferrable_nom": [6000, 4500],
  "treat_def_as_semi_cont": ["false","true"],
  "set_def_constant": ["false","false"],
  "def_total_hours": [ {{states('sensor.def_total_hours_ev')}}, {{states('sensor.def_total_hours_pool_filter')}} ]
}
```
the PV forecast is from Solcast. the alpha and beta values are as follows:
When battery SOC < 98%: high beta, low alpha. When SOC > 98%: low beta, high alpha.
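That switching rule could be sketched like this (a hypothetical helper for illustration only; my actual setup does this with HA template sensors, and the exact weight values are illustrative):

```python
def dynamic_weights(soc):
    """Return the (alpha, beta) MPC runtime weights for a given battery
    state of charge in [0, 1]. The semantics of alpha/beta belong to
    EMHASS; this only encodes the switching rule described above."""
    if soc < 0.98:
        return 0.2, 0.8  # battery still filling: high beta, low alpha
    return 0.8, 0.2      # battery (nearly) full: high alpha, low beta
```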
def_total_hours for the EV is typically < 4. def_total_hours for the pool pump is 10 at the start of the day and reduces according to the actual time the pool pump has run during the day; it resets to 10 after midnight each day.
The behaviour I see is:
The other thing I sometimes see is "It was not possible to find a valid solver for Pulp package", followed by additional Python errors because many variables have the value None.
> Here is the MPC call. The load price forecast is just fixed values that follow my TOU tariff.
Can you expand this MPC call inside the developer tools so we can see the expanded variables?
Here is some debug logging from the rest command:
Url: http://localhost:5000/action/naive-mpc-optim. Status code: 500. Payload:

```json
{ "load_cost_forecast": [0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.24915, 0.24915, 0.24915, 0.24915, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.24915, 0.24915], "pv_power_forecast": [4918, 9760, 12225, 13392, 13985, 13321, 12737, 12665, 13219, 13585, 13849, 13968, 13611, 13012, 12101, 10823, 9486, 8035, 6354, 4366, 2271, 583, 73, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 208, 937, 2893, 4746, 6423, 7611, 8852, 10018, 10860, 11482, 12271, 12907, 13264, 13475, 13526, 13416, 12506, 10914, 9266, 7682, 6137, 4889, 3752, 2299, 826, 208, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0], "prediction_horizon": 48, "alpha": 0.2, "beta": 0.8, "soc_init": 0.15, "soc_final": 0.15, "num_def_loads": 2, "P_deferrable_nom": [6000, 4500], "treat_def_as_semi_cont": ["false","true"], "set_def_constant": ["false","false"], "def_total_hours": [ 1, 6] }
```
You can see that this call returned a 500 from the action. This is the other error I see (besides "status infeasible") in the EMHASS logs. I think there is something to learn from the details of the exception thrown when the solver is called; what the code does today is swallow it. I assume the exceptions and the subsequent Python errors are using up resources in some way.
```
2023-11-03 08:00:00,230 - web_server - INFO - Setting up needed data
2023-11-03 08:00:00,232 - web_server - INFO - Retrieve hass get data method initiated...
2023-11-03 08:00:00,455 - web_server - INFO - Retrieving weather forecast data using method = list
2023-11-03 08:00:00,456 - web_server - INFO - Retrieving data from hass for load forecast using method = naive
2023-11-03 08:00:00,456 - web_server - INFO - Retrieve hass get data method initiated...
2023-11-03 08:00:00,735 - web_server - INFO - >> Performing naive MPC optimization...
2023-11-03 08:00:00,735 - web_server - INFO - Performing naive MPC optimization
2023-11-03 08:00:00,739 - web_server - INFO - Perform an iteration of a naive MPC controller
2023-11-03 08:00:02,144 - web_server - ERROR - It was not possible to find a valid solver for Pulp package
2023-11-03 08:00:02,144 - web_server - INFO - Status: Not Solved
2023-11-03 08:00:02,145 - web_server - WARNING - Cost function cannot be evaluated, probably None
2023-11-03 08:00:02,147 - web_server - ERROR - Exception on /action/naive-mpc-optim [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.9/dist-packages/flask/app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.9/dist-packages/emhass/web_server.py", line 204, in action_call
    opt_res = naive_mpc_optim(input_data_dict, app.logger)
  File "/usr/local/lib/python3.9/dist-packages/emhass/command_line.py", line 275, in naive_mpc_optim
    opt_res_naive_mpc = input_data_dict['opt'].perform_naive_mpc_optim(
  File "/usr/local/lib/python3.9/dist-packages/emhass/optimization.py", line 575, in perform_naive_mpc_optim
    self.opt_res = self.perform_optimization(df_input_data, P_PV.values.ravel(), P_load.values.ravel(),
  File "/usr/local/lib/python3.9/dist-packages/emhass/optimization.py", line 418, in perform_optimization
    opt_tp["P_grid"] = [P_grid_pos[i].varValue + P_grid_neg[i].varValue for i in set_I]
  File "/usr/local/lib/python3.9/dist-packages/emhass/optimization.py", line 418, in <listcomp>
    opt_tp["P_grid"] = [P_grid_pos[i].varValue + P_grid_neg[i].varValue for i in set_I]
TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'
```
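The `None + None` failure is a direct consequence of an unsolved model: PuLP leaves every variable's `.varValue` as `None` when the solve does not complete. A small illustration with a stand-in variable class (a hypothetical guard for demonstration, not EMHASS code):

```python
# Stand-in for a PuLP decision variable: after a failed or infeasible
# solve, PuLP leaves .varValue as None, which is what makes the list
# comprehension on optimization.py line 418 raise TypeError.
class FakeVar:
    def __init__(self, value=None):
        self.varValue = value

def build_p_grid(p_grid_pos, p_grid_neg, set_I):
    """Guarded version of the failing list comprehension: return None
    instead of crashing when the solver produced no values."""
    if any(p_grid_pos[i].varValue is None or p_grid_neg[i].varValue is None
           for i in set_I):
        return None
    return [p_grid_pos[i].varValue + p_grid_neg[i].varValue for i in set_I]
```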
I have reduced to one load; here is the call to the optimiser, and it results in an infeasible status.
Url: http://localhost:5000/action/naive-mpc-optim. Status code: 201. Payload:

```json
{ "load_cost_forecast": [0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.46178, 0.24915, 0.24915, 0.24915, 0.24915, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.24915, 0.24915, 0.24915, 0.24915], "pv_power_forecast": [3270, 13392, 13985, 13321, 12737, 12665, 13219, 13585, 13849, 13968, 13611, 13012, 12101, 10823, 9486, 8035, 6354, 4366, 2271, 583, 73, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 208, 937, 2893, 4746, 6423, 7611, 8852, 10018, 10860, 11482, 12271, 12907, 13264, 13475, 13526, 13416, 12506, 10914, 9266, 7682, 6137, 4889, 3752, 2299, 826, 208, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0], "prediction_horizon": 48, "alpha": 0.2, "beta": 0.8, "soc_init": 0.15, "soc_final": 0.15, "num_def_loads": 1, "P_deferrable_nom": [6000], "treat_def_as_semi_cont": ["false"], "set_def_constant": ["false"], "def_total_hours": [ 1] }
```
With the above set of def loads I am getting an optimal result. So I guess the main issues here are: why does the solver sometimes time out, what exception does the solver sometimes throw, and does that lead to resource leakage? If you want to send a build with more debug info I am happy to install it. I can build an image locally from a dev branch if required.
Here is my latest run with two def loads and low def_total_hours:
Success. Url: http://localhost:5000/action/naive-mpc-optim. Status code: 201. Payload:

```json
{ "load_cost_forecast": [0.46178, 0.46178, 0.46178, 0.46178, 0.24915, 0.24915, 0.24915, 0.24915, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.19239, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915, 0.24915], "pv_power_forecast": [475, 393, 58, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 193, 901, 2812, 4591, 6245, 7639, 9116, 10439, 11362, 11894, 12670, 13400, 13876, 14144, 14151, 13879, 13175, 12127, 10955, 9656, 8010, 6414, 4835, 2948, 1177, 264, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0], "prediction_horizon": 48, "alpha": 0.2, "beta": 0.8, "soc_init": 0.75, "soc_final": 0.15, "num_def_loads": 2, "P_deferrable_nom": [4000,4000], "treat_def_as_semi_cont": ["false","false"], "set_def_constant": ["false","false"], "def_total_hours": [ 2, 1] }
```
The EMHASS log file shows that it cannot find a feasible solution. Yet if you look at the graph of the published forecasts, it looks very feasible to me.
You can see 4x30min blocks for the EV and 2x30min blocks for the pool pump.
@davidusb-geek how can I help you investigate this resource usage issue? With what seems like very feasible scenarios (in terms of PV forecast, un-controllable loads, and def_hours for each controllable load) I keep getting infeasible results and I need to restart the add-on every couple of days because it starts using 99% CPU and eventually calls to the MPC end point do not return.
Hi. As said before, this is a very strange issue. One thing you can do is carefully inspect and analyze the results thrown by EMHASS when you obtain infeasible situations. Copy the result table contents from the webui into an Excel spreadsheet and analyze row by row, looking for any data condition that leads to something that is just not numerically possible. A second thing to try would be a different solver. You can go into the docker container of the add-on and just install a new solver, then provide the correct path to the new solver in the add-on configuration.
Hey David, I have tried several solvers since they are installed in the docker image already. The behaviour is the same. I'll do some Excel number crunching.
@davidusb-geek when I put the optimizer results in excel here is what I see consistently:
`P_PV + P_GRID + P_BAT - P_LOAD = (P_DEF0 + P_DEF1) + 1`
Is this what you expect? It seems correct to me. It shows that generation plus grid plus storage is just over the controllable and uncontrollable loads.
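For anyone repeating this check, the spreadsheet test can be sketched in plain Python; the column names here are assumptions based on the EMHASS webui table and may differ in your export:

```python
def check_power_balance(rows, tol=1.5):
    """Verify P_PV + P_grid + P_batt - P_load - sum(deferrables) ~ 0
    for every optimizer output row (each row a dict of column -> watts).
    Return the indexes of rows where the balance is violated."""
    bad = []
    for i, r in enumerate(rows):
        deferrables = r["P_deferrable0"] + r["P_deferrable1"]
        residual = (r["P_PV"] + r["P_grid"] + r["P_batt"]
                    - r["P_load"] - deferrables)
        if abs(residual) > tol:
            bad.append(i)
    return bad
```

A tolerance is used because, as noted above, the balance is only satisfied to within about a watt of rounding.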
Yes, that seems correct. But your CPU problem may be related to some situations that are forcing an infeasible condition. When that condition happens, that Excel check should fail; the hope is to find out why. Maybe some unexpected values in the input data? Are you using Node-RED? I have seen in the past that it can be a source of problems...
When I collected those stats in Excel, the solver was returning "infeasible". I am not using Node-RED. I am trying to keep things as simple as possible with the naive load forecast method and Solcast providing the PV forecast. Do you think it is worth me running outside of HA, i.e. a Docker-based setup on my laptop?
Ok, I understand. We need to inspect what is causing the infeasible condition. Yes, you can test all that locally; it will give you the freedom to directly inspect the problem. You can print the LP problem to check the constraints, then add relaxation to see what it takes to obtain a feasible status. Here is an example of how to do this: https://stackoverflow.com/questions/76158127/linear-relaxation-infeasible-in-python-using-pulp-but-still-get-an-objective-va
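That workflow, sketched on a toy model (a generic PuLP example, not EMHASS code; the constraint name is made up):

```python
import pulp

# A deliberately infeasible LP: x is bounded above by 5 but a
# constraint demands x >= 10.
prob = pulp.LpProblem("debug_demo", pulp.LpMinimize)
x = pulp.LpVariable("x", lowBound=0, upBound=5)
prob += x                       # objective
prob += x >= 10, "impossible"   # conflicts with upBound=5

prob.writeLP("debug_demo.lp")   # dump the model to inspect the constraints
prob.solve(pulp.PULP_CBC_CMD(msg=0))
print(pulp.LpStatus[prob.status])   # "Infeasible"

# Relax the conflicting constraint and re-solve to confirm the culprit.
prob.constraints["impossible"].changeRHS(3)
prob.solve(pulp.PULP_CBC_CMD(msg=0))
print(pulp.LpStatus[prob.status])   # "Optimal"
```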
You can install EMHASS using the legacy method rather than the docker option; it will give you full control to inspect and modify the code while debugging. Follow the instructions in the Readme: you will only need a PC with Python, and you will need to create a virtual environment with the needed libraries. It's all in the Readme.
I have got things running in legacy mode. As expected the results are "optimal". I have tried to get the setup in legacy mode as close to the HA mode as possible. More work needed :)
Ok, yes try to get the same configuration. You said you tried different solvers on your HA instance. One guess is some incompatibility between solvers and your hardware. What hardware architecture is your HA instance using?
David, some success. I am now able to get reliable Optimal results on my HA install. I stripped back the configuration:
I am using the COIN / cbc solver on both installs.
Oddly there are still differences between running the same inputs and the result on HA (which happens to be optimal) and on my mac (which is infeasible). Both using the same solver. Would you expect this?
> Oddly there are still differences between running the same inputs and the result on HA (which happens to be optimal) and on my mac (which is infeasible). Both using the same solver. Would you expect this?
I don't understand this bit. Your results on HA are using the add-on on a RPi4 and the status is optimal, while on your mac you installed with the legacy method and the result is infeasible. Is that it?
Yes that is the situation. I'll do more tests today but I am seeing different results for very very similar inputs.
The good thing is that by reducing the complexity of the inputs (I think soc_init and soc_final may be the important changes) I am back to getting optimal solutions.
David now that I have a local dev environment I might add some more error logging to help HA users debug problems with input data. Is that ok?
Sure any improvements are welcomed ;-)
Hi, did you manage to overcome these issues?
Feel free to reopen if needed.
I recently added new loads into my home that I am gradually bringing under the control of EMHASS. In the meantime EMHASS sees them as uncontrolled loads (you can see these loads in the ML load forecast), which leaves reduced "space" to accommodate my existing controlled loads (EV). As a result I am getting lots of "infeasible" runs of the solver, and this seems to cause some resource leakage: I am seeing the CPU and memory usage of the EMHASS container increase over time, and the same for the host machine (see screenshot below). This eventually leads to waitress task queuing, and when the queue reaches a certain limit the CPU for the container gets pegged at 99%. While this is happening the solver is not being run, so no HASS sensors change.
I am also getting an exception from this code that runs the solver, whose details I cannot see (because the code eats the exception). I think this points to something else going on, i.e. something different from just getting an infeasible result from the solver.
```python
try:
    if self.lp_solver == 'PULP_CBC_CMD':
        opt_model.solve(PULP_CBC_CMD(msg=0))
    elif self.lp_solver == 'GLPK_CMD':
        opt_model.solve(GLPK_CMD(msg=0))
    elif self.lp_solver == 'COIN_CMD':
        opt_model.solve(COIN_CMD(msg=0, path=self.lp_solver_path))
    else:
        self.logger.error("Invalid solver name passed")
except Exception:
    self.logger.error("It was not possible to find a valid solver for Pulp package")
```
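A minimal sketch of how that `except` branch could preserve the detail instead of discarding it (this is a proposal with made-up names, not current EMHASS code; `solve_fn` stands in for the `opt_model.solve(...)` call):

```python
import logging

logger = logging.getLogger("emhass_debug")

def run_solver(solve_fn):
    """Call the solver; on failure, log the full traceback instead of a
    generic message, and report whether the call succeeded."""
    try:
        solve_fn()
        return True
    except Exception:
        # logger.exception records the message *and* the traceback,
        # which is exactly the detail the bare logger.error(...) drops.
        logger.exception("Solver call raised; traceback shows the real cause")
        return False
```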
To Reproduce
Repeatedly get infeasible results from the solver.
Expected behavior
Managed CPU and memory usage.
Screenshots
Home Assistant installation type
Your hardware
EMHASS installation type