themotte / rDrama

This code runs https://www.themotte.org. Forked from https://github.com/Aevann1/rDrama
GNU Affero General Public License v3.0
25 stars 31 forks source link

Convert DateTime values to timezone-aware. #601

Closed zorbathut closed 1 year ago

zorbathut commented 1 year ago

See https://github.com/themotte/rDrama/pull/597 .

VietThan commented 1 year ago

I queried for all columns that are date/time columns, looks like there are 4 columns that are still timezone unaware:

|table_schema|table_name           |column_id|column_name               |data_type                  |datetime_precision|
|------------|---------------------|---------|--------------------------|---------------------------|------------------|
|public      |comments             |34       |state_user_deleted_utc    |timestamp with time zone   |6                 |
|public      |submissions          |37       |state_user_deleted_utc    |timestamp with time zone   |6                 |
|public      |tasks_repeatable     |6        |run_time_last             |timestamp without time zone|6                 |
|public      |tasks_repeatable     |8        |time_of_day_utc           |time without time zone     |6                 |
|public      |tasks_repeatable_runs|5        |completed_utc             |timestamp without time zone|6                 |
|public      |users                |91       |volunteer_last_started_utc|timestamp without time zone|6                 |
|public      |volunteer_janitor    |6        |recorded_utc              |timestamp without time zone|6                 |

Which columns should be converted if any? @zorbathut

zorbathut commented 1 year ago

do 'em all

(but make sure they're converted properly)

That said, there's also more that aren't tagged as datetime but really should be. Look for anything with the _utc suffix - edited_utc, is_pinned_utc, created_utc, stickied_utc, there's probably more.

This isn't going to be hard work, but I'm afraid there's a lot of it. If you find yourself getting bogged down midway through I would happily accept a half-finished pull request, and would much prefer it over none at all :)

VietThan commented 1 year ago

Cool, 28 columns. I'll probably do these one table at a time. I'm new to open source contribution in general.

|table_name           |column_name               |data_type                  |datetime_precision|
|---------------------|--------------------------|---------------------------|------------------|
|commentflags         |created_utc               |integer                    |                  |
|comments             |created_utc               |integer                    |                  |
|comments             |edited_utc                |integer                    |                  |
|comments             |is_pinned_utc             |integer                    |                  |
|comments             |state_user_deleted_utc    |timestamp with time zone   |6                 |
|commentvotes         |created_utc               |integer                    |                  |
|flags                |created_utc               |integer                    |                  |
|follows              |created_utc               |integer                    |                  |
|modactions           |created_utc               |integer                    |                  |
|notifications        |created_utc               |integer                    |                  |
|submissions          |created_utc               |integer                    |                  |
|submissions          |edited_utc                |integer                    |                  |
|submissions          |stickied_utc              |integer                    |                  |
|submissions          |bump_utc                  |integer                    |                  |
|submissions          |state_user_deleted_utc    |timestamp with time zone   |6                 |
|tasks_repeatable     |run_time_last             |timestamp without time zone|6                 |
|tasks_repeatable     |time_of_day_utc           |time without time zone     |6                 |
|tasks_repeatable     |created_utc               |integer                    |                  |
|tasks_repeatable_runs|completed_utc             |timestamp without time zone|6                 |
|tasks_repeatable_runs|created_utc               |integer                    |                  |
|usernotes            |created_utc               |integer                    |                  |
|users                |created_utc               |integer                    |                  |
|users                |unban_utc                 |integer                    |                  |
|users                |patron_utc                |integer                    |                  |
|users                |volunteer_last_started_utc|timestamp without time zone|6                 |
|viewers              |last_view_utc             |integer                    |                  |
|volunteer_janitor    |recorded_utc              |timestamp without time zone|6                 |
|votes                |created_utc               |integer                    |                  |

Here's my query for the future:

select 
    -- col.table_schema,
    col.table_name,
    --col.ordinal_position as column_id,
    col.column_name,
    col.data_type,
    col.datetime_precision
from
    information_schema.columns col
join information_schema.tables tab on
    tab.table_schema = col.table_schema
    and tab.table_name = col.table_name
    and tab.table_type = 'BASE TABLE'
where
    (col.data_type in ('timestamp without time zone', 
                        'timestamp with time zone',
                        'time with time zone',
                        'time without time zone',
                        'interval', 'date')
     or col.column_name ilike '%_utc'      
        )
    and col.table_schema not in ('information_schema', 'pg_catalog')
order by
    col.table_schema,
    col.table_name,
    col.ordinal_position;
VietThan commented 1 year ago

Keeping track of tables:

VietThan commented 1 year ago

uhhhh, @zorbathut , this issue is probably not ready to close yet :D

zorbathut commented 1 year ago

Oops, yeah, autoclosed as part of the submission :)

VietThan commented 1 year ago

Made a few more PRs @zorbathut @justcool393

zorbathut commented 1 year ago

PRs checked in! Keep it up, and thank you again!

VietThan commented 1 year ago

@zorbathut @justcool393 , I'm starting to tackle table volunteer_janitor. I see it's actually a relatively new column. I was thinking there are possibly two things that need to be done here but would like to get your thumbs up before proceeding as I am not familiar with the history of the project.

  1. Convert recorded_utc from timestamp without timezone to timestamp with timezone
    • I am assuming recorded_utc records timestamps AT TIMEZONE 'UTC' (so I can just rely on this SO)
  2. Rename column recorded_utc to created_datetimez
    • For consistency purposes.
zorbathut commented 1 year ago

(1) sounds good. For (2), I'd prefer "recorded"; it's maybe the time the row was created, but the important part is when the result is recorded. (This is an append-only table, essentially.) recorded_datetimez would be fine though.

zorbathut commented 1 year ago

If you wanted to add a note somewhere that this is an append-only table that would probably not be a bad idea :V Put it where you would've seen it!

(I don't think this is something we can get sqlalchemy to enforce, unfortunately)