Closed horatiorosa closed 6 months ago
@pratishta please do this one. I started and had conflicts along with the weird single vs double quote VS Code linting madness.
@horatiorosa Is there a reason we want to do 2024-02-13 and not 2024-02-15? I realize this update is from yesterday but want to check in anyway.
Ah, Finn mentioned he found issues with the prior data and was going to re-run the pipelines. We should use the latest date in DO for ACS and Decennial. At this point, it looks like main/decennial/2020 and main/acs/2022 were both updated past what's referenced in this issue. Great catch. 🕵🏽♀️
Updated issue/16067-pff with these changes in commit https://github.com/NYCPlanning/labs-factfinder-api/commit/7b182c5ad2422b36b588d6d8b5b7b864ee3522a0
I'm unable to pull the data locally using the 2024-02-15 folders. I can run the migration successfully but I'll have an empty database.
I switched the ACS version constants to 2024-02-13 and was successful with ACS data but not decennial. I'm not sure exactly why though. At initial glance there's a size difference in metadata.json between 2024-02-15 (297.07kb) and 2024-02-13 (418.41kb).
I'm not sure if this is what's causing the migration errors for @horatiorosa, but it's something to look into.
I have empty tables for ACS 2010 and 2020, and Decennial 2010 and 2020. Error log as follows:
ERROR: invalid input syntax for type double precision: "c"
CONTEXT: COPY tmp, line 453741, column c: "c"
STATEMENT: COPY tmp FROM STDIN WITH DELIMITER ',' CSV HEADER;
ERROR: invalid input syntax for type double precision: "c"
CONTEXT: COPY tmp, line 418564, column c: "c"
STATEMENT: COPY tmp FROM STDIN WITH DELIMITER ',' CSV HEADER;
ERROR: invalid input syntax for type double precision: "value"
CONTEXT: COPY tmp, line 15085642, column value: "value"
STATEMENT: COPY tmp FROM STDIN WITH DELIMITER ',' CSV HEADER;
ERROR: invalid input syntax for type double precision: "value"
CONTEXT: COPY tmp, line 15085642, column value: "value"
STATEMENT: COPY tmp FROM STDIN WITH DELIMITER ',' CSV HEADER;
A further note on my error above:
When running the migration on the develop branch, I do get data for ACS 2010 and 2021, with 2022 empty, and the Decennial 2010 and 2020 tables both contain data.
Turns out there was some misformatting in the CSVs that data engineering fixed for us, and the new version folders with correct data and formatting are under 2024-02-20. I changed the constants to reflect that in this commit: a37cb65
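For anyone hitting the same COPY errors, a quick pre-check of the CSV can help narrow things down. The following is only a minimal sketch, assuming the failures come from non-numeric text (for example a repeated header row such as "value") in a column that Postgres loads as double precision. The function name and the sample data are hypothetical and not part of the repo.

```python
import csv
import io


def find_non_numeric_rows(lines, column):
    """Return (line_number, raw_value) pairs where `column` is not a float.

    Line numbers are 1-based and count the header row as line 1.
    """
    reader = csv.DictReader(lines)
    bad = []
    for row in reader:
        raw = row.get(column)
        if raw is None or raw == "":
            continue  # missing or empty field; not the error we are hunting
        try:
            float(raw)
        except ValueError:
            bad.append((reader.line_num, raw))
    return bad


# Hypothetical sample: a header row repeated in the middle of the data.
sample = io.StringIO("geoid,value\n1,10.5\n2,value\n3,7\n")
print(find_non_numeric_rows(sample, "value"))
```

Python's float() is not identical to Postgres's double precision parser, so treat any hits as leads to inspect rather than a definitive list.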
Ticket
Issue 16067 Workflow branch strategy
Description
We need to update the constants to point to the current version of ACS data in order to run migrations and ensure PFF uses the 2022 ACS data, using the following directions: Data Updates and Migrations
- DECENNIAL_LATEST_TABLE_NAME remains unchanged
- DECENNIAL_LATEST_VERSION to 2024-02-15 *
- DECENNIAL_EARLIEST_VERSION to 2024-02-15 *
- ACS_LATEST_TABLE_NAME to 2022 (same year as the most recent folder in DO spaces storage)
- ACS_LATEST_VERSION to 2024-02-15 *
- ACS_EARLIEST_VERSION to 2024-02-15 *

\* or most recent version available when we drill down into the most recent year in DO for ACS and Decennial
note: pr for reference
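The "most recent version available" rule for the version constants can be sketched as picking the latest date-named folder from a listing. This is only an illustration: the folder list is typed in by hand (it does not query DO Spaces), and the function name is hypothetical.

```python
from datetime import date


def latest_version(folder_names):
    """Return the most recent YYYY-MM-DD folder name, ignoring other names."""
    dated = []
    for name in folder_names:
        try:
            dated.append((date.fromisoformat(name), name))
        except ValueError:
            continue  # skip folders that are not named as ISO dates
    if not dated:
        raise ValueError("no date-named version folders found")
    return max(dated)[1]


# Hypothetical listing of version folders under one year in DO.
print(latest_version(["2024-02-13", "2024-02-15", "2024-02-20", "latest"]))
```

The most recent folder is not necessarily a good one, as the 2024-02-15 data showed, so the result still needs a migration run to confirm it.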