NYCPlanning / data-engineering-qaqc

streamlit app for data engineering
https://edm-data-engineering.nycplanningdigital.com
1 stars 0 forks source link

fix pluto qaqc for 23v1 #254

Closed damonmcc closed 1 year ago

damonmcc commented 1 year ago

issues

changes

notes [in-progress]

fvankrieken commented 1 year ago

Seems good, the one thing that I would comment here is that given that we're comparing two different major versions here, it might be good to add some customization to what older diff is being used - as you mentioned to me, we wouldn't expect a big diff between 22v3 and 22v3.1 (and it's the first... patch? Extra minor? release we've done), so it might be more helpful to compare diff 23v1 - 22v3.1 to 22v3.1 - 22v2 (don't know if that's possible, do we pre-process these diffs?). But the just 22v3 - 22v2), or 22v1 - 21v3, etc.

That might be out of scope of this PR though, more a general improvement that would be good for the qc process.

damonmcc commented 1 year ago

@fvankrieken

Seems good, the one thing that I would comment here is that given that we're comparing two different major versions here, it might be good to add some customization to what older diff is being used - as you mentioned to me, we wouldn't expect a big diff between 22v3 and 22v3.1 (and it's the first... patch? Extra minor? release we've done), so it might be more helpful to compare diff 23v1 - 22v3.1 to 22v3.1 - 22v2 (don't know if that's possible, do we pre-process these diffs?). But the just 22v3 - 22v2), or 22v1 - 21v3, etc.

definitely agree it isn't ideal to see two different types of comparisons (major-minor and minor-major)

and yea since the diffs are pre-processed (generated during a build and appended to to files like qaqc_mismatch.csv) the attempt I just made to do "Use these 3 drop down boxes to pick any 3 versions you want" failed because the pair string 22v3.1 - 22v2 isn't in any qaqc files

fvankrieken commented 1 year ago

@fvankrieken

Seems good, the one thing that I would comment here is that given that we're comparing two different major versions here, it might be good to add some customization to what older diff is being used - as you mentioned to me, we wouldn't expect a big diff between 22v3 and 22v3.1 (and it's the first... patch? Extra minor? release we've done), so it might be more helpful to compare diff 23v1 - 22v3.1 to 22v3.1 - 22v2 (don't know if that's possible, do we pre-process these diffs?). But the just 22v3 - 22v2), or 22v1 - 21v3, etc.

definitely agree it isn't ideal to see two different types of comparisons (major-minor and minor-major)

and yea since the diffs are pre-processed (generated during a build and appended to to files like qaqc_mismatch.csv) the attempt I just made to do "Use these 3 drop down boxes to pick any 3 versions you want" failed because the pair string 22v3.1 - 22v2 isn't in any qaqc files

Simplest solution might just be to have another dropdown for the "Diff for comparison"

damonmcc commented 1 year ago

Simplest solution might just be to have another dropdown for the "Diff for comparison"

I like that. that'd still be limited to those diffs that have been generated but is more flexible than the current

[
  v1,
  v1 - 1, 
  v1 - 2,
]
damonmcc commented 1 year ago

seems fair to consider what @fvankrieken is suggesting as a worthy enhancement, but out of scope for this "bug" fix

per the urgency of wanting to build PLUTO on main, allowing Lynn to use the QAQC app to inspect 23v1, and Finn being in onboarding all day: @AmandaDoyle or @mbh329 wanna review the changes here?

mbh329 commented 1 year ago

@damonmcc and @fvankrieken I think the proposed changes to the qaqc make sense to me. It would be nice to have the dropdown where we can compare versions more dynamically but is outside the scope of this PR as yall mentioned. I tested the changes locally and they look good to me

damonmcc commented 1 year ago

deploy to dokku run after having merged this PR