Hitting memory limit of PythonAnywhere for DB update script

Have tried:

Defining dtypes for pandas read_csv
Making sure that no duplicate dataframes are being created during processing
Converting from tsv to parquet before processing

Yet to try:

[ ] Decompress the .gz file and use polars scan_csv to take advantage of LazyFrame (lazy evaluation)

Easy way out: take the last tconst entry in the SQL DB and only process the IMDb dataset from that tconst to the end of the file. This significantly lowers the amount of data that has to be processed, but it relies on the IMDb dataset staying in the same order.
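
A minimal sketch of the incremental idea using only the standard library. `sqlite3` stands in for the real database, and the `titles` table, its columns, and the function names are assumptions for illustration. It is a slight variation on the idea: it compares the numeric part of each tconst instead of the file position, so it does not depend on row order, but it only picks up new titles with higher IDs, not changes to existing rows.

```python
import csv
import gzip
import sqlite3


def tconst_number(tconst):
    """Turn 'tt0000123' into 123 so IDs compare numerically, not as text."""
    return int(tconst[2:])


def last_stored_number(conn):
    """Highest tconst number already in the (hypothetical) titles table, or 0."""
    row = conn.execute(
        "SELECT MAX(CAST(SUBSTR(tconst, 3) AS INTEGER)) FROM titles"
    ).fetchone()
    return row[0] if row[0] is not None else 0


def iter_new_rows(gz_path, after_number):
    """Stream the gzipped TSV row by row, yielding only rows newer than the DB."""
    with gzip.open(gz_path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if tconst_number(row["tconst"]) > after_number:
                yield row


def append_new_titles(conn, gz_path, batch_size=10_000):
    """Insert new rows in batches so memory use stays bounded by batch_size."""
    insert_sql = "INSERT INTO titles (tconst, primaryTitle) VALUES (?, ?)"
    start = last_stored_number(conn)
    batch = []
    inserted = 0
    for row in iter_new_rows(gz_path, start):
        batch.append((row["tconst"], row["primaryTitle"]))
        if len(batch) >= batch_size:
            conn.executemany(insert_sql, batch)
            inserted += len(batch)
            batch.clear()
    if batch:
        conn.executemany(insert_sql, batch)
        inserted += len(batch)
    conn.commit()
    return inserted
```

Because the file is read as a stream, peak memory is roughly one batch of rows no matter how large the dataset is, though the whole file still has to be scanned once.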
