NYCPlanning / ceqr-app-data-archive

(DEPRECATED) Data pipelines for the CEQR app, managed by Data Engineering
https://github.com/NYCPlanning/ceqr-app-data

only generate current school year school_buildings #10

Closed SPTKL closed 5 years ago

SPTKL commented 5 years ago
import datetime
today = datetime.date.today()
SCHOOL_YEAR = f'{today.year - 1}-{today.year}'
print(f'current school year is {SCHOOL_YEAR}')
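The snippet above derives the label from the calendar year alone, so its value changes on January 1. If the label should instead change when a new school year begins, a helper along these lines is one option. This is only a sketch: the function name `school_year_for` and the September start month are assumptions, not something stated in this thread.

```python
import datetime

def school_year_for(today, start_month=9):
    """Label the school year containing `today`.

    Assumes (hypothetically) that a school year starts in `start_month`
    and runs into the following calendar year.
    """
    start = today.year if today.month >= start_month else today.year - 1
    return f"{start}-{start + 1}"

# Dates on either side of the assumed September rollover
print(school_year_for(datetime.date(2019, 8, 31)))
print(school_year_for(datetime.date(2019, 9, 1)))
```

Whether the build should use this label or the previous year's label depends on when the source data for a school year is published, which the thread does not specify.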
baolingz commented 5 years ago

@SPTKL could you explain this issue in more detail?

SPTKL commented 5 years ago

In the CEQR app, we only need to deliver the school buildings for a given school year, but currently we are computing all school years. Note that in the CEQR app database there are only 9 records for lcgms.2018, because we only need the records for that school year.

When you are loading data, you should load it from the source, https://www.nycenet.edu/PublicApps/LCGMS.aspx, instead of the CEQR db.

Once you load the complete dataset into recipes, every time we run a build, the Python script should look like:

query = f"""SELECT * FROM doe_lcgms."2019/09/27" WHERE school_year = '{school_year}'"""
pd.read_sql(query, con=recipe_engine)
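
One way to filter by school year without formatting the value into the SQL text is to bind it as a query parameter. The sketch below illustrates the idea with the standard-library `sqlite3` module so that it runs on its own. The in-memory database, the `bldg_id` column, the sample rows, and the `load_school_year` helper are all hypothetical stand-ins for the recipes database, not its real schema.

```python
import sqlite3

# Hypothetical stand-in for the recipes database: an in-memory SQLite
# database attached under the schema name used in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS doe_lcgms")
conn.execute(
    'CREATE TABLE doe_lcgms."2019/09/27" (bldg_id TEXT, school_year TEXT)'
)
conn.executemany(
    'INSERT INTO doe_lcgms."2019/09/27" VALUES (?, ?)',
    [
        ("K001", "2017-2018"),  # made-up sample rows
        ("K002", "2018-2019"),
        ("M003", "2018-2019"),
    ],
)

def load_school_year(conn, school_year):
    """Return only the rows for one school year, binding the value as a parameter."""
    query = (
        'SELECT * FROM doe_lcgms."2019/09/27" '
        "WHERE school_year = :school_year"
    )
    return conn.execute(query, {"school_year": school_year}).fetchall()

rows = load_school_year(conn, "2018-2019")
print(len(rows), "rows for 2018-2019")
```

`pd.read_sql` also accepts a `params` argument, so the same pattern should carry over to `recipe_engine`, although the placeholder syntax (`:name`, `?`, or `%(name)s`) depends on the database driver in use.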