Open myyong opened 3 years ago
Update: This is no longer relevant as we are re-populating entire table
The first set of readings (when I first saw them) are as follows:
id | timestamp | electricity_consumption | time_created | time_updated | sensor_id |
---|---|---|---|---|---|
17714 | 2020-10-25 01:00:00 | 29.34 | 2020-10-25 09:00:55.089717 | 17 | |
17763 | 2020-10-25 01:30:00 | 30.09 | 2020-10-26 06:01:23.337154 | 17 |
I have changed them to take the second set, which matches the UTC readings. The scripts I used were
UPDATE energy_data
SET electricity_consumption = 29.77
where id=17714
UPDATE energy_data
SET electricity_consumption = 29.56
where id=17763
As a result, running
select * from energy_data
WHERE timestamp>='2020-10-25 00:00:00' AND sensor_id=17
ORDER BY timestamp ASC
LIMIT 8;
now returns | id | timestamp | electricity_consumption | time_created | time_updated | sensor_id |
---|---|---|---|---|---|---|
17714 | 2020-10-25 01:00:00 | 29.77 | 2020-10-25 09:00:55.089717 | 17 | ||
17763 | 2020-10-25 01:30:00 | 29.56 | 2020-10-26 06:01:23.337154 | 17 |
Update: This is no longer relevant as we are re-populating entire table
In 2021, there is no data for 28th March 2021 01:00:00 and 28th March 2021 01:30:00. The readings resume at 28th March 2021 02:00:00.
id | timestamp | electricity_consumption | time_created | time_updated | sensor_id |
---|---|---|---|---|---|
17712 | 2020-10-25 00:00:00 | 29.31 | 2020-10-25 09:00:55.089717 | 17 | |
17713 | 2020-10-25 00:30:00 | 29.49 | 2020-10-25 09:00:55.089717 | 17 | |
17764 | 2020-10-25 02:00:00 | 29.6 | 2020-10-26 06:01:23.337154 | 17 |
Instead of incrementing timestamps, let's create a backup table, and reload the data from the 27/04/2020 00:30 till today.
create table utc_energy_data as (select * from energy_data) with no data;
\d utc_energy_data;
alter table utc_energy_data add column id2 serial primary key;
alter table utc_energy_data drop column id;
alter table utc_energy_data rename column id2 to id;
returns:
Table "public.utc_energy_data" | Column | Type | Collation | Nullable | Default |
---|---|---|---|---|---|
id | integer | ||||
timestamp | timestamp without time zone | ||||
electricity_consumption | double precision | ||||
time_created | timestamp without time zone | ||||
time_updated | timestamp without time zone | ||||
sensor_id | integer |
Insert data using \COPY utc_energy_data (timestamp, electricity_consumption) FROM '/Users/myong/Downloads/energy.csv' WITH CSV HEADER;
The format of the timestamps need to be in YYYY-MM-DD hh-mm-ss format. I switched the formats using Excel.
Check data using:
select * from utc_energy_data
WHERE timestamp>='2021-05-05 01:00:00'
AND timestamp<='2021-05-05 05:00:00'
AND sensor_id=17
ORDER BY timestamp ASC
LIMIT 4;
shows that this data matches UTC or UTC +00:00:00 NO DST:
timestamp | electricity_consumption | time_created | time_updated | sensor_id | id |
---|---|---|---|---|---|
2021-05-05 01:00:00 | 27.32 | 2021-05-06 21:00:00 | 17 | 17906 | |
2021-05-05 01:30:00 | 27.25 | 2021-05-06 21:00:00 | 17 | 17907 | |
2021-05-05 02:00:00 | 27.32 | 2021-05-06 21:00:00 | 17 | 17908 | |
2021-05-05 02:30:00 | 27.16 | 2021-05-06 21:00:00 | 17 | 17909 |
Example of matching data from Stark interface
Hi @myyong
I don't seem to be able to access data from either iweather or utc_energy_data.
I cannot access it directly for pgAdmin
error message: ERROR: permission denied for table utc_energy_data SQL state: 42501
Or from a direct query using psycopg2
InsufficientPrivilege: permission denied for table iweather
Accessing all other data is fine (i.e. 30Mhz, energy_data, tinytag_data)
Hi @meljsingh
I don't seem to be able to access data from either iweather or utc_energy_data.
I've changed access now, would you try again?
grant select on utc_energy_data to cropreader;
grant select on utc_energy_data to cropreader;
\z utc_energy_data
returns
Schema | Name | Type | Access privileges | Column privileges | Policies |
---|---|---|---|---|---|
public | utc_energy_data | table | cropdbadmin=arwdDxt/cropdbadmin+ cropreader=r/cropdbadmin |
Lessons learnt:
\du;
returns list of roles:
Role name | Attributes | Member of
-----------------|------------------------------------------------------------|------------------ azure_pg_admin | Cannot login, Replication | {} azure_superuser | Superuser, Create role, Create DB, Replication, Bypass RLS | {} cropdbadmin | Create role, Create DB, Replication | {azure_pg_admin} cropreader | | {}
Working now!
Checking that the data from the new table energy_data_utc is now correctly in UTC.
Two figures show that the new data grabbing algorithm currently is correct.
For clockchange in October 2020:
For clockchange in March 2021
Noting down here from Mel:
In general yes please put the sensor_id =16 in, but just to let you know that the names on Stark are actually the wrong way around. So data from sensor_id=16, named Carpenter’s Place in Stark is actually Clapham Common, and vice versa for sensor_id =17.
and
Furthermore, I’ve noticed that even with using the data from the energy_utc table, we’re getting an hour delay between temperature and energy readings, see picture attached. The blue line is the sensor temperature, while the red line is energy readings. The green line aligns with the red line, which makes sense as the green line represents the lights schedule calculated from energy readings. This hour delay affects the accuracy of the temperature forecasts which depend on the lighting schedule.
I'm looking at the energy consumption data, I can see that:
As @meljsingh says, the offset is because of the Daylight Savings. I can also see that Stark has changed their approach in these two years.
In 2020, they had 2 readings, at 01:00:00 to accommodate DST. Our web scraping took in the first set, which do not matched in the UTC readings. In 2021, there is no data for 28th March 2021 01:00:00 and 28th March 2021 01:30:00. The readings resume at 28th March 2021 02:00:00.
So there are 4 items that needs doing.