Hey @acocac, thanks for raising this and for the MWE. So you have two target sets for the two lead times, and you want to compute unnormalised RMSE in Kelvin for the first lead time. The model.predict interface is the intended way to get unnormalised predictions for computing unnormalised metrics. I've recently improved DeepSensor's forecasting functionality in deepsensor v0.4, which fixes model.predict forecast outputs; see https://github.com/alan-turing-institute/deepsensor/issues/130 and https://github.com/alan-turing-institute/deepsensor/pull/132.
However, the MWE doesn't use the data_processor in the right way: .map_array is intended for a single array, not a list, so I suggest we keep the interface as-is. As a workaround that keeps your current approach:
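import numpy as np
# Don't do this (model.mean(task) is a list with one array per target set):
# mean = data_processor.map_array(model.mean(task), target_var_ID, unnorm=True)
# true = data_processor.map_array(task["Y_t"][0], target_var_ID, unnorm=True)
# Do this:
lead_time_idx = 0  # first lead time
mean = model.mean(task)[lead_time_idx]
true = task["Y_t"][lead_time_idx]
error = np.abs(mean - true)  # absolute error, still in normalised units
# An error is a difference, so only rescale it; don't add the offset back
error_unnormalised = data_processor.map_array(error, target_var_ID, unnorm=True, add_offset=False)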
But I'd suggest updating DeepSensor and using model.predict :-)
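For anyone wondering why the workaround above passes add_offset=False, here is a minimal NumPy sketch. It assumes simple mean/std normalisation; the statistics mu and sigma and the helpers normalise/unnormalise are made up for illustration and are not DeepSensor's own code. The offset cancels when two normalised values are subtracted, so an error only needs rescaling to get back to Kelvin.

```python
import numpy as np

# Hypothetical normalisation statistics in Kelvin (illustrative values only)
mu, sigma = 280.0, 10.0

def normalise(x):
    # Mean/std normalisation: subtract the offset, divide by the scale
    return (x - mu) / sigma

def unnormalise(x, add_offset=True):
    # Rescale; add the mean back only for absolute values, not for differences
    return x * sigma + (mu if add_offset else 0.0)

true_K = np.array([275.0, 282.0, 290.0])
pred_K = np.array([277.0, 281.0, 286.0])

# Absolute error computed in normalised units, as in the workaround
error_norm = np.abs(normalise(pred_K) - normalise(true_K))

# Rescale only: the offset cancelled in the subtraction above
error_K = unnormalise(error_norm, add_offset=False)

# RMSE in Kelvin from the unnormalised errors
rmse_K = float(np.sqrt(np.mean(error_K ** 2)))
```

Calling unnormalise(error_norm) with the default add_offset=True would shift every error by mu and give values around 280 K, which is clearly not an error magnitude.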
I've started experimenting with a forecasting setup in DeepSensor (see the MWE in Colab). The example below shows how I define a TaskLoader for predicting air temperature over the next two days (lead times):
Then I reuse the training procedure suggested in the DeepSensor tutorials. However, training stops with an error when computing RMSE for the validation tasks.
My guess is that some changes are required in map_array when considering multiple targets. I suggest checking the object type of data at the line linked below. If it's a list, then perform the multiplication per element, each of which is an np.array in this case. https://github.com/alan-turing-institute/deepsensor/blob/6de4ddb566db64d35a93fa5e8e1ab4f4327bb8e4/deepsensor/data/processor.py#L518
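The per-element idea can be sketched outside the library as a small wrapper, which leaves the single-array function untouched. This is only an illustration: map_each and scale_only are hypothetical names, and scale_only stands in for whatever single-array unnormalisation is being applied (sigma is a made-up value).

```python
import numpy as np

def map_each(fn, data):
    """Apply fn to data, or to each element when data is a list or tuple."""
    if isinstance(data, (list, tuple)):
        # One array per target set: process each separately, return a list
        return [fn(np.asarray(d)) for d in data]
    return fn(np.asarray(data))

# Stand-in for a single-array unnormalisation (assumed scale of 10.0)
sigma = 10.0

def scale_only(arr):
    return arr * sigma

# Two target sets, e.g. one per lead time
preds = [np.array([[0.1, 0.2]]), np.array([[0.3, 0.4]])]
out = map_each(scale_only, preds)
```

With DeepSensor, the same wrapper could presumably be used as map_each(lambda a: data_processor.map_array(a, target_var_ID, unnorm=True), model.mean(task)), though I haven't checked that against the library.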