ecmwf / ecpoint-calibrate

Interactive GUI (developed in Python) for calibration and conditional verification of numerical weather prediction model outputs.
GNU General Public License v3.0

Be able to specify the lead times to be used in the creation of the ascii or parquet file #130

Open EstiGascon opened 3 years ago

EstiGascon commented 3 years ago

When we create an ASCII or Parquet table, we have to fill in the "Model Data - General Parameters" section. There we cannot specify the lead times (steps) that we want to analyse, only the "Interval between forecast's validity times". This is a problem when we have only one step (00 UTC, for example): you cannot write "0" in that field because it is not allowed, so you have to decide what to write there even though in reality there is only one validity time. I tried writing "1", but during the computation the software looks for validity times 1, 2, 3... and does not find them, so it keeps writing warnings to the log. Then I tried writing "24" and it worked, but that is not intuitive for users.

Also, I think that if we could specify the number of validity times that we want to use, we could choose to process fewer times than the ones available in the predictor folders. This would be useful for testing different database sizes without having to remove files from the folders.

FatimaPillosu commented 3 years ago

@EstiGascon, could you please explain the problem in more detail?

EstiGascon commented 3 years ago

Yes, the problem is this. Suppose we have only one lead time per file, for example lead time = 0h. What do you write in the field "Interval between forecast's validity times"? I would write 0, because it is the only time that we are using in the calibration. However, the software does not allow you to enter 0. I tested other values and the software accepted "24", which works, but I am not sure it is intuitive for users.

Also, I would like to be able to choose the lead times that I use for the calibration. For example, imagine that I have 39 lead times in the parameter folders, but I only want to test hourly data up to 24h. The software does not allow this at the moment, so you have to either use all the data available in the folder (from 0 to 39h) or remove the times beyond 24h from the folder.

FatimaPillosu commented 3 years ago

Hi @EstiGascon, yes, what you say is correct. If you want to test only up to 24h, you will need to delete the values beyond 24h, because the software was designed to use all the data that you have in the folder.

@onyb, can we let the user specify, in the steps section, the first and the last step to use, in addition to the discretization that we already have? That would solve both of Esti's problems: if you want to test only one lead time, say t+0, you would enter 0 as both the first and the last step. When the first and last step are the same, the box for the discretization could be disabled, as there would be no need to specify that value.

Cheers, Fatima
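To make the proposal concrete, here is a minimal Python sketch of how the list of steps could be derived from a first step, a last step and a discretization. The function name `select_steps` and its parameters are hypothetical and are not taken from the ecpoint-calibrate code base; it only illustrates the rule described above, where the discretization is ignored when the first and last step coincide.

```python
from typing import List, Optional


def select_steps(first_step: int, last_step: int,
                 interval: Optional[int] = None) -> List[int]:
    """Return the lead times (in hours) from first_step to last_step, inclusive.

    Hypothetical helper for illustration only; not part of ecpoint-calibrate.
    """
    if first_step < 0 or last_step < first_step:
        raise ValueError("require 0 <= first_step <= last_step")

    if first_step == last_step:
        # Single lead time (e.g. t+0): the discretization is not needed,
        # which is why the GUI box could be disabled in this case.
        return [first_step]

    if interval is None or interval <= 0:
        raise ValueError("interval must be a positive integer "
                         "when first_step differs from last_step")

    # range() excludes its stop value, so add 1 to include last_step
    # whenever it falls exactly on the discretization grid.
    return list(range(first_step, last_step + 1, interval))


if __name__ == "__main__":
    print(select_steps(0, 0))      # single lead time, no interval required
    print(select_steps(0, 24, 6))  # subset of the steps available in a folder
```

With this rule, the single-step case no longer needs a placeholder interval such as "24", and restricting hourly data to the first 24h becomes `select_steps(0, 24, 1)` without removing any files from the folders.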