Questions about the paper.

fremk commented 6 months ago

Hello, Thank you for your great work!

I was wondering if the calibration and optimization approaches as well as your LSTM model used in the study are available somewhere. This repo is for the transformer used for time series but I am still curious about your approach and the calibration and optimization steps that you conducted for the real buildings. Any insight would be appreciated.

Regards, Karim

maxjcohen commented 6 months ago

Hi,

The calibration and optimization approaches are out of the scope of this repo, and unfortunately cannot be released publicly. However, they are direct applications of the CMA-ES (from the cma library) and the NSGA-II (from the pygmo library). If you are planning on using both, I recommend using a more general optimization framework, such as pymoo or optuna.

The LSTM model can be found in the benchmark module, along with the other models analyzed in the paper.

fremk commented 3 months ago

Hello @maxjcohen, I hope you're doing well! I was playing around with pycma for the calibration process and I found it quite hard to manipulate. I just have a couple of questions regarding the calibration process:

Did you use CMA in a discrete or a continuous variable domain? Your parameters (picture below) could be treated either way: continuous between 0 and 1, as normalized inputs to the metamodel, or discrete, for example nb of occupants ∈ [1000, 1200, 1400, ..., 2000] (normalized, of course, before being passed to the metamodel). In the calibration paragraph of your paper you write, "In our experiments, the variables we adjust for fitting are constrained by the same ranges defined in the data sampling section." I couldn't tell whether that means:
1. the variables stick to the same ranges only meaning 1000 ≤ nb of occupants ≤ 2000 which would be continuous
2. or the variables domains are identical to the ones used in generating the data meaning that nb of occupants ∈ [1000,1200,1400,1600,1800,2000] which would be discrete
Do you know any efficient genetic algorithms that solve optimization problems for discrete variables or a mix of continuous and discrete variables? I found CMA not to be properly adapted to optimization problems with discrete/integer variables.

maxjcohen commented 3 months ago

Hi @fremk ,

The CMA-ES algorithm is designed for continuous variables. Although there have been attempts to adapt it to discrete variables, most of them were either unclear or ineffective.

In our approach, we use the CMA-ES for calibration by treating every variable as continuous, which is far from the most efficient way, but it worked. For the ranges displayed in Table 4, the discrete step values only apply to the dataset generation, not the calibration.
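
A minimal sketch of this continuous relaxation, assuming each variable is normalized to [0, 1] before it is passed to the optimizer and the metamodel. The helper names (`denormalize`, `snap_to_grid`) are hypothetical, and the occupant range of 1000 to 2000 in steps of 200 is taken from the example in the question above, not from the code used in the paper:

```python
def denormalize(u: float, low: float, high: float) -> float:
    """Map a normalized value u in [0, 1] back to the physical range [low, high]."""
    u = min(max(u, 0.0), 1.0)  # clip, in case the optimizer steps outside the bounds
    return low + u * (high - low)


def snap_to_grid(value: float, low: float, high: float, step: float) -> float:
    """Round a physical value to the nearest point of the grid low, low + step, ..., high."""
    n_steps = round((value - low) / step)
    max_steps = round((high - low) / step)
    return low + min(max(n_steps, 0), max_steps) * step


# Hypothetical calibration candidate proposed by a continuous optimizer.
candidate = 0.37
occupants = denormalize(candidate, 1000, 2000)      # continuous value used for calibration
on_grid = snap_to_grid(occupants, 1000, 2000, 200)  # only if a dataset-grid value is required
print(occupants, on_grid)
```

Under this reading, calibration only needs `denormalize`; snapping is an optional last step if the calibrated value has to coincide with one of the values used for dataset generation.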

> Do you know any efficient genetic algorithms that solve optimization problems for discrete variables or a mix of continuous and discrete variables? I found CMA not to be properly adapted to optimization problems with discrete/integer variables.

The best options I have found and experimented with so far are the NSGA-II and NSGA-III algorithms. The former was used for the optimization process presented in the paper.
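
Genetic algorithms adapt to mixed search spaces fairly naturally because mutation can be defined per variable type. The following is a generic sketch of that idea only, not the NSGA-II implementation of pygmo or pymoo. The `mutate` helper, the `spec` format, and the `setpoint` variable are hypothetical; the occupant values come from the question above:

```python
import random


def mutate(individual, spec, rng, rate=0.5):
    """Return a mutated copy of `individual`, a dict of variable name -> value.

    `spec` maps each name to ("float", low, high) or ("choice", [allowed values]).
    """
    child = dict(individual)
    for name, kind in spec.items():
        if rng.random() >= rate:
            continue  # leave this variable unchanged
        if kind[0] == "float":
            _, low, high = kind
            # Gaussian step scaled to 10% of the range, clipped to the bounds.
            step = rng.gauss(0.0, 0.1 * (high - low))
            child[name] = min(max(child[name] + step, low), high)
        else:
            # Discrete variable: resample among the allowed values only.
            child[name] = rng.choice(kind[1])
    return child


# Hypothetical search space: one continuous and one discrete variable.
spec = {
    "setpoint": ("float", 0.0, 1.0),
    "nb_occupants": ("choice", [1000, 1200, 1400, 1600, 1800, 2000]),
}
rng = random.Random(0)
parent = {"setpoint": 0.5, "nb_occupants": 1400}
child = mutate(parent, spec, rng)
print(child)
```

Because the discrete variable is only ever resampled from its allowed values, every child stays feasible without any rounding step.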

> I was playing around with pycma for the calibration process and I found it quite hard to manipulate.

In this case, I can recommend generic optimization libraries, such as pymoo or optuna. They implement both the NSGA variants and the CMA-ES algorithm.

Hope this helps, Max

maxjcohen / transformer

Questions about the paper. #64