Closed antonroman closed 3 years ago
Another interesting read on the topic: https://ieeexplore.ieee.org/document/7858189?reload=true
I've created this script to search for NA values and, apparently, I haven't found any. That is good news, but I will still show you the script in our next meeting so you can check that it's actually correct. Meanwhile, I won't close the issue, at least until we verify together that the script is doing its job.
The script looks fine. The result makes sense, since the data is obtained from the Deicom APIs, which may already process the data at some earlier point.
In any case it is good news. If we get any NaN values in the load data, we could use a function like this to fill each one with the load value from the same time on the previous day (this would be for S02):
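# fill missing values with the value at the same time one day earlier
from numpy import isnan

def fill_missing(values):
    one_day = 24  # assumes hourly samples, i.e. 24 rows per day
    for row in range(values.shape[0]):
        for col in range(values.shape[1]):
            # skip the first day: row - one_day would be negative and wrap around to the end of the array
            if isnan(values[row, col]) and row >= one_day:
                values[row, col] = values[row - one_day, col]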
Once you have checked this for both the S02 and S05 files, feel free to close the issue. Good job! :-)
I did, so I'm closing the issue.
We should check the integrity of the provided data. It is useful for two reasons:
To complete the N/A values there are different strategies:
First, please check whether there are many N/A and missing values, and then we'll decide what to do.
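As a rough illustration of that first check, here is a minimal sketch that counts NaN values and looks for gaps in the timestamps. It is not the script discussed in this thread. The function name `integrity_report`, the list-of-pairs input format, the one-hour sampling step and the toy data are all assumptions made for the example.

```python
import math
from datetime import datetime, timedelta

def integrity_report(readings, step=timedelta(hours=1)):
    """Count NaN/None values and missing timestamps in (timestamp, value) pairs.

    `readings` must be sorted by timestamp; `step` is the assumed sampling interval.
    """
    nan_count = sum(
        1 for _, v in readings
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    missing = []
    # Walk over consecutive pairs and record every expected timestamp that is absent.
    for (prev_ts, _), (next_ts, _) in zip(readings, readings[1:]):
        expected = prev_ts + step
        while expected < next_ts:
            missing.append(expected)
            expected += step
    return {"rows": len(readings), "nan_values": nan_count, "missing_timestamps": missing}

# Toy data (made up for illustration): the 03:00 row is absent and 01:00 holds a NaN.
start = datetime(2020, 1, 1)
sample = [
    (start, 1.2),
    (start + timedelta(hours=1), float("nan")),
    (start + timedelta(hours=2), 1.4),
    (start + timedelta(hours=4), 1.1),
]
report = integrity_report(sample)
print(report["nan_values"], report["missing_timestamps"])
```

Checking for absent timestamps separately matters because a row that was never recorded does not show up as a NaN at all.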
We could even compare different approaches to filling the missing data if there is a relevant number of corrupt rows. The best approach would be the one that gives the best forecast performance from the model.
On the other hand, it is worth checking this paper (https://www.sciencedirect.com/science/article/pii/S2352467720303003), as it seems to explain how to deal with this problem. I'll try to read it as well before our meeting.
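One cheap way to pre-screen fill strategies before retraining the model could look like the sketch below. Note that it swaps in a proxy metric: it hides values that are actually known, fills them back in, and measures the reconstruction error, which is not the same as forecast performance. The helper names, the two strategies and the synthetic load curve are assumptions for illustration only.

```python
import math

def fill_previous_day(series, one_day=24):
    """Replace NaN with the value one day earlier, when that row exists."""
    out = list(series)
    for i, v in enumerate(out):
        if math.isnan(v) and i >= one_day:
            out[i] = out[i - one_day]
    return out

def fill_forward(series):
    """Replace NaN with the most recent earlier value (already filled if needed)."""
    out = list(series)
    for i in range(1, len(out)):
        if math.isnan(out[i]):
            out[i] = out[i - 1]
    return out

def masked_mae(series, hidden, fill):
    """Hide the values at the `hidden` indices, fill them, return the mean absolute error."""
    masked = [float("nan") if i in hidden else v for i, v in enumerate(series)]
    filled = fill(masked)
    return sum(abs(filled[i] - series[i]) for i in hidden) / len(hidden)

# Synthetic hourly load (made up): a daily sine pattern plus a slow drift, 3 days long.
load = [10 + 5 * math.sin(2 * math.pi * (h % 24) / 24) + 0.01 * h for h in range(72)]
hidden = {30, 31, 50}  # indices pretended to be missing
scores = {
    "previous_day": masked_mae(load, hidden, fill_previous_day),
    "forward_fill": masked_mae(load, hidden, fill_forward),
}
best = min(scores, key=scores.get)
print(scores, best)
```

A strategy that does well on this proxy would still need to be confirmed by training the model on the filled data and comparing forecast errors.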
Thanks, great job!!