jdangerx closed 9 months ago
Nooo the published_parsed field is in EDT and the runners run on UTC.
Someday we need to update the archiver to put published_parsed in UTC, but until then we will explicitly use UTC-4.
Nooo the published_parsed field is actually in Eastern Time, which may or may not be Eastern Daylight Time. Updating to force America/New_York.
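A minimal sketch of what forcing America/New_York could look like, assuming published_parsed arrives as a time.struct_time holding Eastern wall-clock time with no zone attached. The helper name published_to_utc is hypothetical and not taken from the archiver code.

```python
import time
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Named zone rather than a fixed UTC-4 offset, so EST/EDT is picked per date.
EASTERN = ZoneInfo("America/New_York")


def published_to_utc(published_parsed: time.struct_time) -> datetime:
    """Interpret a naive struct_time as Eastern Time and convert it to UTC.

    Hypothetical helper: assumes the first six fields are
    (year, month, day, hour, minute, second) in Eastern wall-clock time.
    """
    naive = datetime(*published_parsed[:6])
    return naive.replace(tzinfo=EASTERN).astimezone(timezone.utc)


# Noon in July falls in daylight time (UTC-4); noon in January does not (UTC-5).
summer = published_to_utc(time.struct_time((2023, 7, 1, 12, 0, 0, 5, 182, -1)))
winter = published_to_utc(time.struct_time((2023, 1, 15, 12, 0, 0, 6, 15, -1)))
```

With a hard-coded UTC-4 offset, the January timestamp would come out an hour early, which is the problem the named zone avoids.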
For each table, we'll now have a row per context per filing, which includes all the relevant facts reported in that context.
In 0.8.3 and below, we relied on the caller to only request one filing per entity.
In 1.1.1 and below, we read in all the filings for an entity, but then deduplicated the table so that each fact would only be reported once. We did this by sorting by the report date, and then picking the last reported value for each fact.
Unfortunately, since report date is not granular enough, this led to an ambiguous sort order. That in turn caused problems matching data between tables, since the surviving facts in different tables could be associated with different filing names.
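The ambiguity can be illustrated with a small sketch. It assumes two filings that share a report date; the field names, fact names, and the last_reported helper are all hypothetical and are not the project's actual schema or code.

```python
from datetime import date


def last_reported(rows):
    """Keep one row per fact: the last one after a stable sort on report_date.

    Hypothetical stand-in for the old deduplication step.
    """
    latest = {}
    for row in sorted(rows, key=lambda r: r["report_date"]):
        latest[row["fact"]] = row
    return latest


# Two filings with the same report date, so the sort key cannot separate them.
filing_a = {"filing_name": "filing_a", "report_date": date(2021, 12, 31)}
filing_b = {"filing_name": "filing_b", "report_date": date(2021, 12, 31)}

# The same two filings happen to be read in a different order for each table.
table_1 = [
    {**filing_a, "fact": "plant_cost", "value": 100},
    {**filing_b, "fact": "plant_cost", "value": 110},
]
table_2 = [
    {**filing_b, "fact": "fuel_cost", "value": 55},
    {**filing_a, "fact": "fuel_cost", "value": 50},
]

winner_1 = last_reported(table_1)["plant_cost"]["filing_name"]
winner_2 = last_reported(table_2)["fuel_cost"]["filing_name"]
```

Because ties on report_date fall back to input order, winner_1 and winner_2 name different filings, so rows from the two tables no longer line up on filing name.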
We will stop trying to tinker with the data here, beyond bringing it into SQLite form, and do the deduplication within PUDL. That means we will include duplicate facts whenever there are multiple filings reporting the same fact.