NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team
14 stars 0 forks source link

test dcpy and add PostgresClient #234

Closed damonmcc closed 8 months ago

damonmcc commented 9 months ago

related to #192 and #17 resolves #137

refactors and tests some dcpy utils in preparation for building products in a shared persistent postgres DB and expanding the Template DB end-to-end testing

motivation

changes

alexrichey commented 9 months ago

Apologies, Still reading. Will give this a better look tomorrow morning!

alexrichey commented 9 months ago

@damonmcc What's the rationale for moving dcpy_tests into dcpy? I can think of a few reasons against, the primary one being that the tests shouldn't be part of the API for dcpy, and we wouldn't want our pure module code to have a dependencies upon dev dependencies, ie test frameworks. Also, now any module in dcpy can freely import dcpy.test.* which, while I don't think anybody would do that, it's conventional to make that a little more difficult. Also, you typically will want to easily distribute your package sans tests, (ie in our case, just copying the dcpy folder)

So to pick an outside example, boto3 has a strict separation between the two.

damonmcc commented 9 months ago

@alexrichey appreciate the module vs tests details! do you like the old name of dcpy_test or shortening it to tests?

TLDR: no compelling reason, will move em back!

I felt inclined to put it in the dcpy directory because this repo isn't only dcpy and the only tests in that directory were for dcpy. in the boto3 example, everything in that repo is related to that single python package, so isn't it kinda of boto3/boto3 and boto3/tests?

from a quick look at a random local virtual environment I have, pandas doesn't have a strict separation. I see the tests in my file system. same for packages like jsonschema, scipy, sklearn, matplotlib, numpy, and others

alexrichey commented 9 months ago

@damonmcc I suppose this is all pretty build-system specific, but I think you're touching on something correct though - it would be nice to have the tests grouped with the package, not in the project root. And maybe the stuff @fvankrieken is doing with setuptools would allow us to exclude the dcpy.test folder in builds.

fvankrieken commented 9 months ago

@damonmcc I suppose this is all pretty build-system specific, but I think you're touching on something correct though - it would be nice to have the tests grouped with the package, not in the project root. And maybe the stuff @fvankrieken is doing with setuptools would allow us to exclude the dcpy.test folder in builds.

We could have src and tests within the dcpy folder

alexrichey commented 9 months ago

There's also this approach, which has a somewhat telescoping folder structure, but is nicely structured to group functionality while also preserving namespaces. It would make me a little bit sad to introduce src into our module names, e.g. dcpy.src.utils.*.

But it's neat that we can do this stuff now, via setuptools

damonmcc commented 8 months ago

@alexrichey @fvankrieken

just a ping. hope the new commits cover everything, doesn't seem worth doing a dcpy.src folder change in this PR

probably best to do a squash & merge with this one?