moustakas closed this issue 1 year ago
Significant progress was made on this ticket in the webapp-fuji-v2 branch, which migrates the database to postgres. However, loading the 1.4M Fuji targets barfs with an out-of-memory error:
```
% time python load.py
Read 1397479 rows from /global/cfs/cdirs/desi/spectro/fastspecfit/fuji/catalogs/fastspec-fuji.fits
Row 1024
Row 2048
Row 4096
Row 8192
Row 16384
Row 32768
Row 65536
Row 131072
Row 262144
Row 524288
Row 1048576
Bulk creating the database.
Traceback (most recent call last):
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
psycopg2.OperationalError: cannot allocate memory for output buffer
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/global/cfs/cdirs/desi/spectro/fastspecfit/webapp/py/fastspecfit/webapp/load.py", line 1201, in <module>
main()
File "/global/cfs/cdirs/desi/spectro/fastspecfit/webapp/py/fastspecfit/webapp/load.py", line 1198, in main
Sample.objects.bulk_create(objs)
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/models/query.py", line 815, in bulk_create
returned_columns = self._batched_insert(
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/models/query.py", line 1816, in _batched_insert
self._insert(
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/models/query.py", line 1790, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/models/sql/compiler.py", line 1657, in execute_sql
cursor.execute(sql, params)
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/backends/utils.py", line 103, in execute
return super().execute(sql, params)
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/backends/utils.py", line 84, in _execute
with self.db.wrap_database_errors:
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/global/homes/i/ioannis/conda-envs/fastspecfit-webapp/lib/python3.10/site-packages/django/db/backends/utils.py", line 89, in _execute
return self.cursor.execute(sql, params)
django.db.utils.OperationalError: cannot allocate memory for output buffer
```
The load also takes a significant amount of time (>2 hours on an interactive Perlmutter node). The problem appears to be in Django rather than in postgres, so I'm not sure where the memory barf is happening.
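One possible workaround, sketched below and not tested here, is to avoid handing `bulk_create` the full 1.4M-object list at once and instead insert in chunks, passing Django's `batch_size` argument so each INSERT stays small. The helper name `chunked_bulk_create`, the `CHUNK_SIZE` value, and the `Sample` import path are all hypothetical placeholders for whatever load.py actually defines:

```python
from itertools import islice

# Assumed import path for the Sample model used by load.py; adjust to
# wherever the model actually lives in the webapp.
from fastspecfit.webapp.sample.models import Sample

CHUNK_SIZE = 50_000  # hypothetical chunk size; tune to available node memory


def chunked_bulk_create(objs_iter, chunk_size=CHUNK_SIZE):
    """Insert Sample rows in fixed-size chunks so that neither Django nor
    psycopg2 has to assemble a single INSERT covering all ~1.4M rows."""
    total = 0
    while True:
        # Materialize at most chunk_size unsaved model instances at a time.
        chunk = list(islice(objs_iter, chunk_size))
        if not chunk:
            break
        # batch_size further splits the SQL statements Django generates.
        Sample.objects.bulk_create(chunk, batch_size=10_000)
        total += len(chunk)
        print(f'Inserted {total} rows')
    return total
```

Here `objs_iter` would be a generator that yields unsaved `Sample` instances built row-by-row from the FITS catalog, so only one chunk of objects is held in memory at a time. Passing `batch_size` to the existing `bulk_create(objs)` call alone might already avoid the psycopg2 allocation failure, but it would not reduce the Python-side memory needed to build the full object list.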
Done in #107.
@dstndstn argues that the `sqlite3` database currently used by the web-app is not optimal for a few reasons. Let's use this ticket to track the requisite updates as we prepare https://fastspecfit.desi.lbl.gov for public release.