hackforla / lucky-parking

Visualization of parking data to assist in understanding of the effects of parking policies on a neighborhood by neighborhood basis in the City of Los Angeles
https://www.hackforla.org/projects/lucky-parking.html

Create database updating function #655

Open gregpawin opened 2 months ago

gregpawin commented 2 months ago

Create function that uses the city data API to update a local database of parking citations. This will be used to update the citation database on a daily basis instead of downloading the whole database every time.

parcheesime commented 1 month ago

Using a small sample ordered by ticket_number (via $order), I have implemented a function that integrates new ticket data with existing records. It first identifies the most recent ticket number in the current sample, then fetches only the tickets issued after that one, so no records overlap.
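A minimal sketch of the incremental-fetch step described above. It assumes the city data API accepts SoQL-style `$where`, `$order`, and `$limit` query parameters and that ticket numbers are purely numeric. The endpoint URL and function names are hypothetical placeholders, not the project's actual code.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; the real endpoint is not given in this thread.
BASE_URL = "https://example.org/resource/citations.json"


def latest_ticket_number(records):
    """Return the highest ticket_number in the existing sample.

    Assumes every ticket_number is a numeric string.
    """
    return max(int(r["ticket_number"]) for r in records)


def build_update_url(records, limit=5):
    """Build a query URL for tickets issued after the most recent one we hold."""
    last = latest_ticket_number(records)
    params = {
        # Assumption: the API compares this column the way we expect. If the
        # column is stored as text, the comparison is lexicographic and only
        # behaves numerically when all ticket numbers have the same length.
        "$where": f"ticket_number > '{last}'",
        "$order": "ticket_number ASC",
        "$limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


existing = [
    {"ticket_number": "1001"},
    {"ticket_number": "1003"},
    {"ticket_number": "1002"},
]
url = build_update_url(existing)
```

Filtering on the last known ticket number keeps each request small and is what prevents overlap with records already stored. If ticket numbers turn out to contain letters, the `int()` conversion would need to be replaced with a different ordering key.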

The integration process uses a predefined schema to standardize the incoming data so that new entries match the structure of the existing dataset. The schema specifies the expected fields and sets default values for any missing data. Once the new data is fetched and normalized against the schema, it is merged into the existing dataset, adding a specified number of new records (for example, the next five new tickets).
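The normalize-then-merge step could look roughly like the sketch below. The field names and defaults in `SCHEMA` are hypothetical stand-ins for illustration, and `normalize` and `merge_new` are illustrative names rather than the functions actually used here.

```python
# Hypothetical schema: field name -> default used when the field is missing.
SCHEMA = {
    "ticket_number": None,
    "issue_date": None,
    "violation_code": "UNKNOWN",
    "fine_amount": 0,
}


def normalize(record):
    """Keep only schema fields, filling in defaults for any that are missing."""
    return {field: record.get(field, default) for field, default in SCHEMA.items()}


def merge_new(existing, incoming, max_new=5):
    """Append up to max_new normalized records not already in existing."""
    seen = {r["ticket_number"] for r in existing}
    merged = list(existing)
    added = 0
    for record in incoming:
        if added >= max_new:
            break
        row = normalize(record)
        # Skip rows without an ID and rows we already hold.
        if row["ticket_number"] is None or row["ticket_number"] in seen:
            continue
        seen.add(row["ticket_number"])
        merged.append(row)
        added += 1
    return merged


existing = [normalize({"ticket_number": "1001", "fine_amount": 73})]
incoming = [
    {"ticket_number": "1001"},                  # duplicate, skipped
    {"ticket_number": "1002", "extra": "x"},    # extra field dropped
]
merged = merge_new(existing, incoming)
```

Tracking seen ticket numbers in a set makes the duplicate check cheap, and capping additions with `max_new` mirrors the "next five new tickets" behavior.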

I'll try this function with an existing database this coming week.

parcheesime commented 3 weeks ago

Using the function, I increased the sample size. No duplicates so far. I began transferring the initial tickets into DuckDB and testing incremental loading of data into an in-memory DuckDB database. I will do EDA on the data, check for duplicates, and give an update next week.