MarquezProject / marquez

Collect, aggregate, and visualize a data ecosystem's metadata
https://marquezproject.ai
Apache License 2.0

Support a read-only mode #2982

Open davidjgoss opened 2 days ago

davidjgoss commented 2 days ago

It can sometimes be desirable to run Marquez in a read-only mode. For example, if you have a read replica of your database and want to scale querying separately from ingestion, you might deploy one or more instances of Marquez pointed at your reader. Currently, if you do this, startup fails with an error beginning:

org.jdbi.v3.core.statement.UnableToExecuteStatementException: org.postgresql.util.PSQLException: ERROR: cannot execute INSERT in a read-only transaction

This is because Marquez upserts the default namespace at startup, and a read replica rejects that write. So for a true reader to work, we need a way to disable this upsert. It would also make sense to disable non-GET endpoints, since they write to the database too.

One idea discussed was to move the namespace creation to a Flyway script. However, we decided instead to add a new readOnly flag to the Marquez config file, which will drive both of the above-mentioned changes in behaviour: skipping the default namespace upsert at startup and disabling non-GET endpoints.
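
As a rough illustration of what the flag might drive, here is a minimal sketch in plain Java. It is not Marquez's actual implementation: the class `ReadOnlyGuard` and its methods are hypothetical names invented for this example, and how the real readOnly value is read from the config file is not shown. The sketch only captures the two decisions described above: whether to upsert the default namespace at startup, and whether a given HTTP method should be served.

```java
import java.util.Locale;

/**
 * Hypothetical sketch of the two decisions a readOnly flag would drive.
 * Names are illustrative only and are not taken from the Marquez codebase.
 */
public class ReadOnlyGuard {
    private final boolean readOnly;

    /** @param readOnly assumed to come from the readOnly flag in the config file */
    public ReadOnlyGuard(boolean readOnly) {
        this.readOnly = readOnly;
    }

    /**
     * The default namespace upsert is a write, so it should be skipped
     * when the instance points at a read replica.
     */
    public boolean shouldCreateDefaultNamespace() {
        return !readOnly;
    }

    /**
     * In read-only mode only GET requests are served; every other method
     * is treated as a potential write and refused.
     */
    public boolean isAllowed(String httpMethod) {
        if (!readOnly) {
            return true;
        }
        return "GET".equals(httpMethod.toUpperCase(Locale.ROOT));
    }
}
```

In a real deployment this check would most likely live in a request filter that rejects disallowed methods before they reach a resource, for instance with 405 Method Not Allowed; that choice of status code is an assumption here, not something settled in this issue.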