To minimize the amount of data sent to our database (and thus avoid hitting Vercel limits), this PR adds a new table that stores the checksums of the XLSX and geojson.gz files. On ingestion, we check whether a file's checksum is already stored in the DB and, if so, skip ingesting that file. Checksums are computed over a data stream to avoid holding the entire file in memory.
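As a rough illustration of the approach, here is a minimal sketch in Python. It is not the code in this PR: the hash algorithm (SHA-256), the chunk size, and the names `stream_checksum` and `ingest_if_new` are assumptions made for the example, and an in-memory set stands in for the new checksum table.

```python
import hashlib
import io
from typing import BinaryIO, Callable, Set

# Hypothetical chunk size; any reasonably small value keeps memory use bounded.
CHUNK_SIZE = 64 * 1024


def stream_checksum(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a binary stream chunk by chunk, never loading the whole file."""
    digest = hashlib.sha256()  # assumed algorithm; the PR does not name one
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def ingest_if_new(
    stream: BinaryIO,
    known_checksums: Set[str],
    ingest: Callable[[BinaryIO], None],
) -> bool:
    """Run `ingest` only if this content has not been seen before.

    `known_checksums` is a stand-in for the checksum table. Returns True
    when ingestion ran and False when it was skipped.
    """
    checksum = stream_checksum(stream)
    if checksum in known_checksums:
        return False
    stream.seek(0)  # rewind; assumes a seekable stream such as an open file
    ingest(stream)
    # Record the checksum only after ingestion succeeds, so a failed run
    # is retried next time instead of being skipped.
    known_checksums.add(checksum)
    return True
```

With a real file, this would be called as `ingest_if_new(open(path, "rb"), known, ingest_fn)`; a second call with identical content returns `False` without invoking `ingest_fn`.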