NYCPlanning / db-data-library

📚 Data Library
https://nycplanning.github.io/db-data-library/library/index.html
MIT License

Use script to add encoding for `doe_lcgms` #352

Closed td928 closed 1 year ago

td928 commented 1 year ago

Third time's the charm, hopefully, for #340.

@mbh329's facdb ingestion threw an encoding error on `doe_lcgms`, which sent me back to the drawing board and led me to reactivate the script process for ingesting the dataset.

The new workflow is a two-step process, documented in `library/templates/doe_lcgms.yml`. First, manually download the .xls file and convert it to a UTF-8 encoded csv. Then, once the csv has been moved to the tmp folder, run the `library/script/doe_lcgms.py` script to finish the upload.
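For anyone repeating the manual step, here is a minimal sketch of how the encoding could be checked and fixed before running the script. It is not part of this PR: the helper names `reencode_csv_to_utf8` and `is_valid_utf8` are hypothetical, and the `cp1252` source encoding is only an assumption about what a spreadsheet export might produce.

```python
import csv
from pathlib import Path


def is_valid_utf8(path: Path) -> bool:
    """Return True if the whole file decodes cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def reencode_csv_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> int:
    """Rewrite the csv at src as a UTF-8 csv at dst; return the number of rows written.

    source_encoding is an assumption; adjust it to match the actual export.
    """
    rows_written = 0
    with open(src, newline="", encoding=source_encoding) as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin):
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

Running `is_valid_utf8` on the file in the tmp folder before calling the upload script would catch the encoding error early instead of during the facdb ingestion.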

ingest_csv()

I opted to write a separate function for the new csv ingestion process because I am not yet comfortable deleting all the earlier work done against the aspx endpoint. I added documentation for the new function, and its functionality is fairly simple.

mbh329 commented 1 year ago

I am thinking that maybe we can test this after we merge in PR #351

td928 commented 1 year ago

> I am thinking that maybe we can test this after we merge in PR #351

I tested on this branch and it is working fine for now. Not sure whether it is worth also testing it with the new container?