Closed anebz closed 2 years ago
I'll take this one, if you're okay with it :)
aaand I already have the first question: would it be possible to get some sample data (or ideally the whole dataset at some specific timepoint)?
yeah I already texted you in Telegram :)
pull request sent #27
While I was running experiments, I missed some scrapes, so data is missing for certain time ranges. When plotting the data for that day, the points come every 15 minutes until 15:30 and then jump to 17:00, because no data was gathered in between. I would like the values between those two times to be filled in so that the x-axis always shows 15-minute intervals.
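A minimal sketch of the first half of the problem, finding which 15-minute slots have no reading. The function names, the date and the sample occupancy numbers are all made up for illustration; the real scraper output may be shaped differently.

```python
from datetime import datetime, timedelta


def quarter_hour_grid(start, end, step_minutes=15):
    """Return every timestamp from start to end (inclusive) in fixed steps."""
    step = timedelta(minutes=step_minutes)
    slots = []
    current = start
    while current <= end:
        slots.append(current)
        current += step
    return slots


def align_to_grid(readings, start, end):
    """Pair each slot with its reading, or None when nothing was scraped."""
    return [(slot, readings.get(slot)) for slot in quarter_hour_grid(start, end)]


# Hypothetical readings: nothing was scraped between 15:30 and 17:00.
day = datetime(2021, 1, 1)
readings = {
    day.replace(hour=15, minute=15): 40,
    day.replace(hour=15, minute=30): 42,
    day.replace(hour=17, minute=0): 35,
}

aligned = align_to_grid(
    readings,
    day.replace(hour=15, minute=15),
    day.replace(hour=17, minute=0),
)
for slot, value in aligned:
    print(slot.strftime("%H:%M"), value)
```

Slots that come back as None are the ones that still need a value. This assumes the scraped timestamps land exactly on the quarter hour; if they drift by a few seconds they would have to be rounded first.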
Simplified, given a list such as [2, 3, 4, 0, 2, 3, 0, 0, 1, 0], there are four 0s, i.e. four missing values. I would like those 0s to be filled with the average of the previous value and the next value. For the first 0, the previous value is 4 and the next is 2, so the average is (4+2)/2=3. That 0 gets replaced with 3. Next, there are two 0s in succession. The value before the 0s is 3 and the one after is 1, so both 0s are replaced by (3+1)/2=2. The last 0 is at the end of the list, so we could just average it with 0. That 0 would be replaced by (1+0)/2=0.5. After the algorithm, the list would look like this: [2, 3, 4, 3, 2, 3, 2, 2, 1, 0.5].
In our situation, we would have entries every 15 minutes, and sometimes there would be no entry. A missing entry corresponds to a 0 in the previous example, and this entry is the one that should be filled. Sometimes just one entry is missing, sometimes several in succession. The missing entry may also be the first or last entry of the day, at 7:00 or at 23:00. I put all three scenarios in the previous example. We already know the times that need to be filled: all the 15-minute intervals. The occupancy, waiting, temperature and weather status should be filled with the average of the previous and next entries. All are numbers and easy to average except the weather status 😉 I look forward to your idea of how to average categorical data.
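The averaging rule from the list example could be sketched like this. The function name and the missing parameter are hypothetical, and treating 0 as "missing" only mirrors the simplified example; with real data a marker such as None would be safer, since an occupancy of 0 can be a genuine reading.

```python
def fill_gaps(values, missing=0):
    """Replace each run of missing entries with the mean of its two neighbours."""
    filled = list(values)
    n = len(values)
    i = 0
    while i < n:
        if values[i] != missing:
            i += 1
            continue
        # Find where this run of missing entries ends.
        j = i
        while j < n and values[j] == missing:
            j += 1
        # A run touching the start or end of the list is averaged with 0.
        before = values[i - 1] if i > 0 else 0
        after = values[j] if j < n else 0
        average = (before + after) / 2
        for k in range(i, j):
            filled[k] = average
        i = j
    return filled


print(fill_gaps([2, 3, 4, 0, 2, 3, 0, 0, 1, 0]))
# [2, 3, 4, 3.0, 2, 3, 2.0, 2.0, 1, 0.5]
```

Every entry in a run gets the same value, so longer gaps show up as a flat step in the plot rather than a slope.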
Usually I write a step-by-step guide on how to solve things, but this time I'm giving you a higher-level overview of the task 😄 Feel free to ask me if you have questions.
Note: use scipy.interpolate.interp1d.
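A rough sketch of what the SciPy route might look like on the same example list, assuming NumPy and SciPy are installed and that 0 still marks a missing entry. Note that this is linear interpolation, not the averaging rule described earlier, so the results differ for runs of several missing entries and at the end of the list.

```python
import numpy as np
from scipy.interpolate import interp1d

values = np.array([2, 3, 4, 0, 2, 3, 0, 0, 1, 0], dtype=float)
positions = np.arange(len(values))
known = values != 0  # assumption: 0 marks a missing entry

# Fit on the known points only; outside their range, reuse the
# first/last known value instead of raising an error.
interpolate = interp1d(
    positions[known],
    values[known],
    kind="linear",
    bounds_error=False,
    fill_value=(values[known][0], values[known][-1]),
)

filled = interpolate(positions)
print(filled)
```

Here the single gap still becomes 3, but the two consecutive gaps become roughly 2.33 and 1.67 (a straight line from 3 down to 1) instead of 2 and 2, and the trailing gap becomes 1 rather than 0.5. Whether that is preferable to the plain average is a judgment call for the plots.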