Questions about masking and mask dependencies during train/val phases

Hello everyone,

I am currently using TorchSpatiotemporal to conduct experiments for my Master's thesis in Data Science and Engineering under the supervision of Professor Paolo Garza.

The dataset I am working with is the SDPWF dataset, which was the main subject of the Baidu KDD competition in 2022. This dataset comprises data from over 100 sensors (wind turbines), recording approximately 10 different channels every ten minutes for 245 days. My task involves performing forecasting on this data. The objective is to compare various spatial-temporal deep learning architectures to understand how incorporating spatial information can improve prediction accuracy.

I have set up the necessary features and initialized the SpatioTemporalDataset and SpatioTemporalDataModule classes. Additionally, I have configured the Predictor and Trainer environment (see my Colab notebook here). I successfully trained a GraphWaveNetModel on this data by creating an SDPWFDataset class extending DatetimeDataset. The input dataframes are formatted correctly, with a datetime Pandas index representing the temporal dimension and a multi-column index mapping each wind turbine to its recorded channels. I also generate a dataset mask, a boolean dataframe indicating data availability for specific timeslots and wind turbines.

I am seeking clarification on the dataset mask, as I couldn't find much information in the documentation or GitHub repository. My specific questions are:

1. Can you explain more clearly what the purpose of the mask is?
2. To include the mask as an input to my neural network (referred to as x), should I move it to the covariates, or is it automatically appended to x by the DataModule class?
3. How should I use this mask to filter out some ground-truth values, together with the corresponding predictions, and adjust the output loss accordingly for each training pass? My dataset contains some missing target values, and I need to mask them out to keep the loss evaluation consistent. (Refer to Section 4.1 of the SDPWF paper: "In some cases, the wind turbines are stopped for reasons such as renovation or to avoid overloading the grid. In these instances, the actual generated power is unknown and should not be used for model evaluation.")
4. As I understand it, you have some metrics called MaskedSomething, but they seem to mask out only NaN values, and I don't see how to use them at training time to mask out missing ground-truth values based on the mask. This is both because I don't know how to retrieve the mask (GraphWaveNetModel only accepts x, u, edges as input) and because the masked metrics do not seem to allow it.

I have summarized the issue here, but please feel free to ask for additional details if needed. Also feel free to correct any misunderstanding on my part. Thank you for your support!

Best regards,

Hi @donofiva

1. In forecasting models, by default, the mask is simply used to avoid computing the loss on data points that are missing.
2. Adding the mask as a covariate is the safest option if you just want it to be concatenated to the input. Note that a covariate called mask would be overridden by the default mask of the dataset (which is aligned with the forecasting horizon). You can use a different name to avoid this (e.g., input_mask).
3. The loss is masked automatically as long as you include a mask attribute in your dataset and feed it to the appropriate modules.
4. You can define the mask to filter out any value you want. As already mentioned, any value for which the mask is set to 0 is excluded when computing the loss. If you want to do more complex operations, you would have to extend each model and add the mask as an input.
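
To make the points above concrete, here is a minimal NumPy sketch of what masking the loss and concatenating an input mask amount to numerically. It does not use tsl's own classes or metrics: `masked_mae` and `concat_mask` are hypothetical helper names, the toy arrays are made up, and the (time steps, nodes, channels) layout is an assumption for illustration.

```python
import numpy as np


def masked_mae(y_hat, y, mask):
    """Mean absolute error over the entries where mask is True (hypothetical helper)."""
    mask = mask.astype(bool)
    # Zero out invalid targets before subtracting so NaNs cannot leak into the sum.
    err = np.abs(y_hat - np.where(mask, y, 0.0))
    # Entries with mask == 0 contribute nothing to the loss.
    err = np.where(mask, err, 0.0)
    # Normalise by the number of valid entries, guarding against an empty mask.
    return err.sum() / max(mask.sum(), 1)


def concat_mask(x, input_mask):
    """Append the availability mask to x as one extra feature channel (hypothetical helper)."""
    return np.concatenate([x, input_mask.astype(x.dtype)], axis=-1)


# Toy targets over the forecasting horizon, shaped (time steps, nodes, channels).
y = np.array([[[1.0], [2.0]], [[np.nan], [4.0]]])
y_hat = np.array([[[1.5], [2.0]], [[9.0], [3.0]]])
mask = ~np.isnan(y)  # False where the ground truth is unknown

# Toy inputs over the look-back window, with their own mask (assumed name: input_mask).
x = np.array([[[0.3], [0.7]], [[0.0], [0.9]]])
input_mask = np.array([[[True], [True]], [[False], [True]]])

loss = masked_mae(y_hat, y, mask)          # the masked entry (prediction 9.0) is ignored
x_with_mask = concat_mask(x, input_mask)   # shape (2, 2, 2): value channel + mask channel
```

Note that the two masks play different roles: the one used for the loss covers the forecasting horizon, while the one concatenated to the input covers the look-back window, which is why a separate name such as input_mask is needed.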

I hope this helps.

Andrea

TorchSpatiotemporal / tsl

Questions about masking and mask dependencies during train/val phases #38