NYCPlanning / db-developments

🏠 🏘️ 🏗️ Developments Database
https://nycplanning.github.io/db-developments

add file for qaqc field distribution reports #524

Closed Oysters1874 closed 2 years ago

Oysters1874 commented 2 years ago

As we want to capture the # of new records of each possible type created since the previous version, I am thinking of using date_filed to measure the change. But I am not sure whether we should also look at date permitted or other date-related fields for this change. Also, I fetch all job status/job type info from the final_devdb table. I'd like to hear what others think about this issue.
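A minimal sketch of that count, assuming a `final_devdb` table with `job_type`, `job_status`, and `date_filed` columns where dates are stored as ISO strings. The sample rows, the status labels, the cutoff date, and the helper name `count_new_records` are all made up for illustration, and SQLite stands in here for the project's actual database:

```python
import sqlite3


def count_new_records(conn, cutoff):
    """Count records filed after `cutoff`, grouped by job type and job status."""
    query = """
        SELECT job_type, job_status, COUNT(*) AS n
        FROM final_devdb
        WHERE date_filed > ?
        GROUP BY job_type, job_status
        ORDER BY job_type, job_status
    """
    return conn.execute(query, (cutoff,)).fetchall()


# Hypothetical in-memory stand-in for final_devdb.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE final_devdb "
    "(job_number TEXT, job_type TEXT, job_status TEXT, date_filed TEXT)"
)
conn.executemany(
    "INSERT INTO final_devdb VALUES (?, ?, ?, ?)",
    [
        ("1", "New Building", "Filed", "2021-03-01"),
        ("2", "New Building", "Filed", "2021-04-15"),
        ("3", "Alteration", "Permitted", "2021-02-10"),
        ("4", "Demolition", "Completed", "2020-11-30"),  # before the cutoff
    ],
)

# ISO date strings sort chronologically, so a plain string comparison works.
new_counts = count_new_records(conn, "2021-01-01")
```

Swapping `date_filed` for another date column in the `WHERE` clause would be the only change needed to compare the alternatives raised here.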

SashaWeinstein commented 2 years ago

What is CAPTURE_DATE_PREV in version.env? Is this the date the previous version was run, or the date the data for the previous version was pulled, or something like that?

Oysters1874 commented 2 years ago

> What is CAPTURE_DATE_PREV in version.env? Is this the date the previous version was run, or the date the data for the previous version was pulled, or something like that?

I am not sure about this. If that date is not the date on which a particular version's data was pulled (that is, data as of that date), I wonder whether we need to pull down the previous version of devdb in order to really track the records added since the previous version.

abrieff commented 2 years ago

I feel like records filed since the last version might not give us the full range of job statuses we're hoping for (the job status field is related to the date_filed and date_permittd fields).

abrieff commented 2 years ago

Could Date_LastUpdt be what we want instead?

> Date_LastUpdt - The date of the last update to the DOB record for the job filing. This field is mapped from the dob_jobapplications field latestactiondate.

abrieff commented 2 years ago

Maybe @td928 would know

td928 commented 2 years ago

> What is CAPTURE_DATE_PREV in version.env? Is this the date the previous version was run, or the date the data for the previous version was pulled, or something like that?

> I am not sure about this. If that date is not the date on which a particular version's data was pulled (that is, data as of that date), I wonder whether we need to pull down the previous version of devdb in order to really track the records added since the previous version.

CAPTURE_DATE_PREV and CAPTURE_DATE represent the dates the DOB data were pulled, since the open data is updated on a daily basis.

Oysters1874 commented 2 years ago

> CAPTURE_DATE_PREV and CAPTURE_DATE represent the dates the DOB data were pulled, since the open data is updated on a daily basis.

Then I guess we can use CAPTURE_DATE_PREV to keep track of newly added changes in the current version?
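A rough sketch of how that could look, assuming version.env holds plain `KEY=VALUE` lines with ISO dates. The file contents, the sample records, and the helper name `parse_env` are hypothetical and only illustrate the idea of using CAPTURE_DATE_PREV as the cutoff:

```python
from datetime import date


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


# Hypothetical contents of version.env (the real file may differ).
env_text = """
CAPTURE_DATE=2021-12-31
CAPTURE_DATE_PREV=2021-06-30
"""

env = parse_env(env_text)
cutoff = date.fromisoformat(env["CAPTURE_DATE_PREV"])

# Made-up records; in practice these would come from final_devdb.
records = [
    {"job_number": "A", "date_filed": date(2021, 5, 1)},
    {"job_number": "B", "date_filed": date(2021, 8, 15)},
    {"job_number": "C", "date_filed": date(2021, 11, 2)},
]

# Keep only records filed after the previous capture date.
new_records = [r for r in records if r["date_filed"] > cutoff]
new_job_numbers = [r["job_number"] for r in new_records]
```

Filtering on date_filed only catches jobs filed after the previous pull. Older jobs whose status changed in the meantime would need a different column, such as the Date_LastUpdt field mentioned earlier in this thread.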

td928 commented 2 years ago

@Oysters1874, as a flag: would you mind opening a new PR to merge into dev instead, so that we don't have to do a main and dev comparison later?

SashaWeinstein commented 2 years ago

You can edit a PR to point to a different base branch, so you actually don't have to open a new one.

Oysters1874 commented 2 years ago

okay, will do that

Oysters1874 commented 2 years ago

lgtm!

I just modified the table names to make them all look consistent, but it shouldn't affect any other code.

Oysters1874 commented 2 years ago

> looks good to me! Great job again, I love this jsonb_agg implementation, we will be re-using this pattern long after you've moved on to a real job. The one note I have is that you may want to wait until PR #526 is merged so your bash script goes in the right place. Or we could merge this first and change PR #526 afterwards. As long as we are all on the same page.

sure, either works for me.

td928 commented 2 years ago

> looks good to me! Great job again, I love this jsonb_agg implementation, we will be re-using this pattern long after you've moved on to a real job. The one note I have is that you may want to wait until PR #526 is merged so your bash script goes in the right place. Or we could merge this first and change PR #526 afterwards. As long as we are all on the same page.

I am for letting Jingyi merge this first. The number of files changed in the other PR is much greater, so it is probably easier to make the change on that side.

Oysters1874 commented 2 years ago

> I am for letting Jingyi merge this first. The number of files changed in the other PR is much greater, so it is probably easier to make the change on that side.

okay, I will merge it in.