mathesar-foundation / mathesar

Web application providing an intuitive user experience to databases.
https://mathesar.org/
GNU General Public License v3.0
2.33k stars 321 forks source link

[META] Pre-existing PostgreSQL compatibility #3199

Closed mathemancer closed 9 months ago

mathemancer commented 1 year ago

We want to be able to connect Mathesar to an arbitrary pre-existing PostgreSQL database. However, a number of problems may arise when doing so. This meta-issue is to track work associated with these problems.

Versions

Here we track info about which versions of PostgreSQL work with Mathesar.

Version Install Tests Notes Easy Fix
9 fail N/A ERROR: cannot cast type jsonb to integer at character 9193.

Relevant portion of 0_msar.sql: line 247
WHEN jsonb_typeof(col)='number' THEN msar.get_column_name(rel_id, col::integer)
maybe
10 fail N/A Same error as version 9 maybe
11 fail N/A The pg_attribute table doesn't include attgenerated maybe
12 fail N/A It seems the rules for type coercion are more strict than in 13. In particular, regtype isn't automatically coerced to text. yes
13 pass pass This is the version we develop on N/A
14 pass fail There seems to be some difference in default timestamp precision invalidating a test yes
15 pass fail It seems like there's some difference in the error returned by unsupported type options yes

Primary key problems

Here we list some info about issues that arise with different primary key setups.

No Primary key

Multicol Primary key

Why can one see the data in the record selector and data explorer, but not the main table page?

Non-generated primary key

Most issues stem from our treatment of table records as resources in the RESTful sense

Non-sequential primary keys

Foreign key problems

Multicolumn foreign key

Open Source Shakespeare set

This section lists issues found while tinkering with the Open Source Shakespeare data set.

Composite type support

Scale

Many tables

10 tables

No issues found

100 tables

500 tables

1000 tables

Many columns

Below are tables with some test results. The method is to create X total columns, split over m tables of n columns each. Then load the tables main page for the schema, and look at the repsonse time of /api/db/v0/tables/. Choose a random table, click it, and load its table page. Observe the response time of /api/db/v0/tables/<id>/columns/. Create a new column for the table, observe the time it takes for the relevant POST request to return. Otherwise, click around and look for strangeness or breakages.

Single table

# columns .../tables/ .../columns/ add column Notes
30 0.2s 2.5s 0.5s Works fine. Reasonable performance
50 0.4s 4.5s 2.8s Works okay, slower performance
100 1.2s 10.6s 0.5s single table page performance becoming a problem. No breakages
250 10s 55s 0.6s slow table page, slow schema page, no breakages
500 75s 363s 1.5s Slow schema page, unusably slow table page, no visible breakages (but it's hard to check)
750 315s 870s
1000

Total columns: 1000

# tables # columns per table .../tables/ .../columns/ add column Notes
20 50 6.5s 4s 1s Reasonable, but not great performance. No breakages
10 100 13s 10.5s 1.5s Same, but slower
5 200 27s 36s 1.8s Sloooower. No breakages though
2 500 116s 370s 2.1s Unusably slow. No breakages

Total columns: 1500

# tables # columns per table .../tables/ .../columns/ add column Notes
30 50 9s 4s 2.5s Filter/sort/group work, albeit slowly. No breakages found
15 100 16s 9s 1.1s Filter/sort/group work, slowly. No breakages
10 150 30s 22s 2.8s Same as above, but slower
5 300 85s 88s 1.7s Far slower; curiously, columns endpoint takes longer in single record view
3 500 180s 365s 2.8s Unusably slow, but no apparent breakages
2 750 360s 900s 8.5s Extemely slow. Not all columns showed on the single table or any dropdown involving columns, due to limiting to 500

Effect of DB Privileges on Mathesar

pavish commented 11 months ago

Error when installing, if mathesar user does not have ownership/privileges to a schema.

mathesar_db                     | 2023-09-14 14:35:30.721 UTC [792] ERROR:  permission denied for schema bulbasaurschema
mathesar_db                     | 2023-09-14 14:35:30.721 UTC [792] STATEMENT:  
mathesar_db                     |               SELECT a.attname,
mathesar_db                     |                 pg_catalog.format_type(a.atttypid, a.atttypmod),
mathesar_db                     |                 (
mathesar_db                     |                   SELECT pg_catalog.pg_get_expr(d.adbin, d.adrelid)
mathesar_db                     |                   FROM pg_catalog.pg_attrdef d
mathesar_db                     |                   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum
mathesar_db                     |                   AND a.atthasdef
mathesar_db                     |                 ) AS DEFAULT,
mathesar_db                     |                 a.attnotnull,
mathesar_db                     |                 a.attrelid as table_oid,
mathesar_db                     |                 pgd.description as comment,
mathesar_db                     |                 a.attgenerated as generated,
mathesar_db                     |                                 (SELECT json_build_object(
mathesar_db                     |                       'always', a.attidentity = 'a',
mathesar_db                     |                       'start', s.seqstart,
mathesar_db                     |                       'increment', s.seqincrement,
mathesar_db                     |                       'minvalue', s.seqmin,
mathesar_db                     |                       'maxvalue', s.seqmax,
mathesar_db                     |                       'cache', s.seqcache,
mathesar_db                     |                       'cycle', s.seqcycle)
mathesar_db                     |                   FROM pg_catalog.pg_sequence s
mathesar_db                     |                   JOIN pg_catalog.pg_class c on s.seqrelid = c."oid"
mathesar_db                     |                   WHERE c.relkind = 'S'
mathesar_db                     |                   AND a.attidentity != ''
mathesar_db                     |                   AND s.seqrelid = pg_catalog.pg_get_serial_sequence(
mathesar_db                     |                       a.attrelid::regclass::text, a.attname
mathesar_db                     |                   )::regclass::oid
mathesar_db                     |                   ) as identity_options                
mathesar_db                     |               FROM pg_catalog.pg_attribute a
mathesar_db                     |               LEFT JOIN pg_catalog.pg_description pgd ON (
mathesar_db                     |                   pgd.objoid = a.attrelid AND pgd.objsubid = a.attnum)
mathesar_db                     |               WHERE a.attrelid = 19317
mathesar_db                     |               AND a.attnum > 0 AND NOT a.attisdropped
mathesar_db                     |               ORDER BY a.attnum
mathesar_db                     |           
mathesar_service                | Traceback (most recent call last):
mathesar_service                |   File "/code/mathesar/models/base.py", line 711, in __getattribute__
mathesar_service                |     return super().__getattribute__(name)
mathesar_service                | AttributeError: 'Column' object has no attribute 'primary_key'
mathesar_service                | 
mathesar_service                | During handling of the above exception, another exception occurred:
mathesar_service                | 
mathesar_service                | Traceback (most recent call last):
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1799, in _execute_context
mathesar_service                |     self.dialect.do_execute(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 717, in do_execute
mathesar_service                |     cursor.execute(statement, parameters)
mathesar_service                | psycopg2.errors.InsufficientPrivilege: permission denied for schema bulbasaurschema
mathesar_service                | 
mathesar_service                | 
mathesar_service                | The above exception was the direct cause of the following exception:
mathesar_service                | 
mathesar_service                | Traceback (most recent call last):
mathesar_service                |   File "/code/install.py", line 49, in <module>
mathesar_service                |     main()
mathesar_service                |   File "/code/install.py", line 25, in main
mathesar_service                |     management.call_command('migrate')
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
mathesar_service                |     return command.execute(*args, **defaults)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
mathesar_service                |     output = self.handle(*args, **options)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
mathesar_service                |     res = handle_func(*args, **kwargs)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 267, in handle
mathesar_service                |     emit_post_migrate_signal(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/core/management/sql.py", line 48, in emit_post_migrate_signal
mathesar_service                |     models.signals.post_migrate.send(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/dispatch/dispatcher.py", line 177, in send
mathesar_service                |     return [
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/django/dispatch/dispatcher.py", line 178, in <listcomp>
mathesar_service                |     (receiver, receiver(signal=self, sender=sender, **named))
mathesar_service                |   File "/code/mathesar/apps.py", line 16, in _prepare_database_model
mathesar_service                |     make_sure_initial_reflection_happened()
mathesar_service                |   File "/code/mathesar/state/base.py", line 8, in make_sure_initial_reflection_happened
mathesar_service                |     reset_reflection()
mathesar_service                |   File "/code/mathesar/state/base.py", line 27, in reset_reflection
mathesar_service                |     _trigger_django_model_reflection(db_name)
mathesar_service                |   File "/code/mathesar/state/base.py", line 31, in _trigger_django_model_reflection
mathesar_service                |     reflect_db_objects(metadata=get_cached_metadata(), db_name=db_name)
mathesar_service                |   File "/code/mathesar/state/django.py", line 49, in reflect_db_objects
mathesar_service                |     reflect_columns_from_tables(tables, metadata=metadata)
mathesar_service                |   File "/code/mathesar/state/django.py", line 133, in reflect_columns_from_tables
mathesar_service                |     models._set_default_preview_template(table)
mathesar_service                |   File "/code/mathesar/models/base.py", line 899, in _set_default_preview_template
mathesar_service                |     preview_template = compute_default_preview_template(table)
mathesar_service                |   File "/code/mathesar/models/base.py", line 914, in compute_default_preview_template
mathesar_service                |     if column.primary_key:
mathesar_service                |   File "/code/mathesar/models/base.py", line 717, in __getattribute__
mathesar_service                |     return getattr(self._sa_column, name)
mathesar_service                |   File "/code/mathesar/models/base.py", line 711, in __getattribute__
mathesar_service                |     return super().__getattribute__(name)
mathesar_service                |   File "/code/mathesar/models/base.py", line 733, in _sa_column
mathesar_service                |     return self.table.sa_columns[self.name]
mathesar_service                |   File "/code/mathesar/models/base.py", line 367, in sa_columns
mathesar_service                |     return self._enriched_column_sa_table.columns
mathesar_service                |   File "/code/mathesar/models/base.py", line 355, in _enriched_column_sa_table
mathesar_service                |     table=self._sa_table,
mathesar_service                |   File "/code/mathesar/state/cached_property.py", line 62, in __get__
mathesar_service                |     new_value = self.original_get_fn(instance)
mathesar_service                |   File "/code/mathesar/models/base.py", line 336, in _sa_table
mathesar_service                |     sa_table = reflect_table_from_oid(
mathesar_service                |   File "/code/db/tables/operations/select.py", line 31, in reflect_table_from_oid
mathesar_service                |     tables = reflect_tables_from_oids(
mathesar_service                |   File "/code/db/tables/operations/select.py", line 52, in reflect_tables_from_oids
mathesar_service                |     table_oids_to_sa_tables[table_oid] = reflect_table(
mathesar_service                |   File "/code/db/tables/operations/select.py", line 20, in reflect_table
mathesar_service                |     return Table(
mathesar_service                |   File "<string>", line 2, in __new__
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/deprecations.py", line 298, in warned
mathesar_service                |     return fn(*args, **kwargs)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 607, in __new__
mathesar_service                |     metadata._remove_table(name, schema)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
mathesar_service                |     compat.raise_(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
mathesar_service                |     raise exception
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 602, in __new__
mathesar_service                |     table._init(name, metadata, *args, **kw)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 677, in _init
mathesar_service                |     self._autoload(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 712, in _autoload
mathesar_service                |     conn_insp.reflect_table(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 774, in reflect_table
mathesar_service                |     for col_d in self.get_columns(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 497, in get_columns
mathesar_service                |     col_defs = self.dialect.get_columns(
mathesar_service                |   File "<string>", line 2, in get_columns
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 55, in cache
mathesar_service                |     ret = fn(self, con, *args, **kw)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3701, in get_columns
mathesar_service                |     c = connection.execute(s, dict(table_oid=table_oid))
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/future/engine.py", line 280, in execute
mathesar_service                |     return self._execute_20(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1611, in _execute_20
mathesar_service                |     return meth(self, args_10style, kwargs_10style, execution_options)
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
mathesar_service                |     return connection._execute_clauseelement(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1478, in _execute_clauseelement
mathesar_service                |     ret = self._execute_context(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1842, in _execute_context
mathesar_service                |     self._handle_dbapi_exception(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2023, in _handle_dbapi_exception
mathesar_service                |     util.raise_(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
mathesar_service                |     raise exception
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1799, in _execute_context
mathesar_service                |     self.dialect.do_execute(
mathesar_service                |   File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 717, in do_execute
mathesar_service                |     cursor.execute(statement, parameters)
mathesar_service                | sqlalchemy.exc.ProgrammingError: (psycopg2.errors.InsufficientPrivilege) permission denied for schema bulbasaurschema
mathesar_service                | 
mathesar_service                | [SQL: 
mathesar_service                |             SELECT a.attname,
mathesar_service                |               pg_catalog.format_type(a.atttypid, a.atttypmod),
mathesar_service                |               (
mathesar_service                |                 SELECT pg_catalog.pg_get_expr(d.adbin, d.adrelid)
mathesar_service                |                 FROM pg_catalog.pg_attrdef d
mathesar_service                |                 WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum
mathesar_service                |                 AND a.atthasdef
mathesar_service                |               ) AS DEFAULT,
mathesar_service                |               a.attnotnull,
mathesar_service                |               a.attrelid as table_oid,
mathesar_service                |               pgd.description as comment,
mathesar_service                |               a.attgenerated as generated,
mathesar_service                |                               (SELECT json_build_object(
mathesar_service                |                     'always', a.attidentity = 'a',
mathesar_service                |                     'start', s.seqstart,
mathesar_service                |                     'increment', s.seqincrement,
mathesar_service                |                     'minvalue', s.seqmin,
mathesar_service                |                     'maxvalue', s.seqmax,
mathesar_service                |                     'cache', s.seqcache,
mathesar_service                |                     'cycle', s.seqcycle)
mathesar_service                |                 FROM pg_catalog.pg_sequence s
mathesar_service                |                 JOIN pg_catalog.pg_class c on s.seqrelid = c."oid"
mathesar_service                |                 WHERE c.relkind = 'S'
mathesar_service                |                 AND a.attidentity != ''
mathesar_service                |                 AND s.seqrelid = pg_catalog.pg_get_serial_sequence(
mathesar_service                |                     a.attrelid::regclass::text, a.attname
mathesar_service                |                 )::regclass::oid
mathesar_service                |                 ) as identity_options                
mathesar_service                |             FROM pg_catalog.pg_attribute a
mathesar_service                |             LEFT JOIN pg_catalog.pg_description pgd ON (
mathesar_service                |                 pgd.objoid = a.attrelid AND pgd.objsubid = a.attnum)
mathesar_service                |             WHERE a.attrelid = %(table_oid)s
mathesar_service                |             AND a.attnum > 0 AND NOT a.attisdropped
mathesar_service                |             ORDER BY a.attnum
mathesar_service                |         ]
mathesar_service                | [parameters: {'table_oid': 19317}]
mathesar_service                | (Background on this error at: https://sqlalche.me/e/14/f405)
mathesar_service exited with code 1
mathemancer commented 11 months ago

More performance testing. Similar to previous testing with many columns, but now using a GCP-managed database for both the user tables and the Django DB, with Mathesar running on my local laptop. This causes our app to grind to a halt with surprisingly small scale data. For consistency, I emulated the the same methodology as above, but all parts of the app are quite slow with this setup; even parts that weren't slow for the local case.

It should be noted that this was intentionally set up to be a bit difficult for Mathesar, but psql works perfectly acceptably with the same database from the same laptop (with the same internet connection).

Single table

# columns .../tables/ .../columns/ add column Notes
15 36s 393s 69.5s Slow
30 61s 773.2s 69.5s Oh nooooo

Total columns: 24

# tables # columns .../tables/ .../columns/ add column Notes
4 6 71s 160.3s 91.5s Slow
3 8 66s 212s 84.8s Slow
2 12 56.4s 308s 75.3s Slow

Total columns: 60

# tables # columns .../tables/ .../columns/ add column Notes
6 10 139.8 256.5 105.1s Slow slow slow
5 12 134.3 348.1 159.4s Argh
seancolsen commented 11 months ago

Moved out of v0.1.4 since this doesn't block release.

seancolsen commented 10 months ago

@mathemancer and @pavish What else do we need to do before we can close this issue? If nothing, then please close.

seancolsen commented 9 months ago

I'm closing this since there do not appear to be any actionable next steps.