NYCPlanning / db-data-library

📚 Data Library
https://nycplanning.github.io/db-data-library/library/index.html
MIT License

add hpd historical data #401

Closed damonmcc closed 1 year ago

damonmcc commented 1 year ago

related to https://github.com/NYCPlanning/data-engineering/issues/30

goals

changes

notes

alexrichey commented 1 year ago

@damonmcc (and cc @fvankrieken) This all looks fine, but I don't fully understand the rationale for the inheritance pattern in the ./library/scripts folder. It looks like there's a ton of boilerplate. For example, why not just implement

    def runner(self) -> str:
        df = self.ingest()
        local_path = df_to_tempfile(df)
        return local_path

in a parent class, since those lines are reproduced all over. Seems like you could eliminate quite a few of these files.
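As a rough sketch of that idea (not the repo's actual code): an abstract parent class could own runner, so each script only implements ingest. The names BaseScriptor and ExampleScriptor are hypothetical, the df_to_tempfile below is a stand-in for the library's helper, and a list of dicts stands in for a DataFrame so the example needs only the standard library.

```python
import csv
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, List

# Stand-in for a pandas DataFrame, to keep this sketch stdlib-only.
Rows = List[Dict[str, str]]


def df_to_tempfile(rows: Rows) -> str:
    """Hypothetical stand-in for the library helper: write rows to a temp CSV."""
    fieldnames = list(rows[0].keys()) if rows else []
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, newline=""
    ) as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
        return f.name


class BaseScriptor(ABC):
    """Hypothetical parent class that holds the shared runner logic."""

    @abstractmethod
    def ingest(self) -> Rows:
        """Each dataset script supplies only its own ingestion step."""

    def runner(self) -> str:
        # Same three lines as above, written once instead of per script.
        df = self.ingest()
        local_path = df_to_tempfile(df)
        return local_path


class ExampleScriptor(BaseScriptor):
    """Hypothetical dataset script: no runner override needed."""

    def ingest(self) -> Rows:
        return [{"id": "1", "name": "a"}]
```

With this shape, a script that needs extra steps can still override runner, while the common case inherits it.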

And in the case of the script added here, what about having sheetname defined in the template, and have an ExcelScriptorInterface or something? Seems more traditionally OOP'y, but let me know if there's something I don't understand.
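To make the Excel suggestion concrete, here is a minimal sketch, assuming the template is parsed into a dict. The config keys "path" and "sheet_name", the read_with_pandas helper, and the injectable reader argument are all assumptions for illustration, not the library's actual template schema or API.

```python
from typing import Any, Callable, Dict, Optional


def read_with_pandas(path: str, sheet_name: str) -> Any:
    # Imported lazily so the class can be used or tested without pandas.
    import pandas as pd

    return pd.read_excel(path, sheet_name=sheet_name)


class ExcelScriptorInterface:
    """Hypothetical base class for scripts whose source is an Excel sheet."""

    def __init__(
        self,
        config: Dict[str, Any],
        reader: Optional[Callable[[str, str], Any]] = None,
    ) -> None:
        # Assumed template keys; the real template may name these differently.
        self.path = config["path"]
        self.sheet_name = config["sheet_name"]
        self.reader = reader or read_with_pandas

    def ingest(self) -> Any:
        # The sheet name comes from the template, not from the script itself.
        return self.reader(self.path, self.sheet_name)
```

A new Excel-backed dataset would then need only a template entry with its path and sheet name, rather than a new script file.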

Maybe a bigger discussion. Otherwise LGTM!

alexrichey commented 1 year ago

Also, I wasn't sure where this file comes from: 06_20_2023_Lookback Legislation.xlsx