Closed andrewjbtw closed 1 year ago
This will also require some coordination with stage users, especially Amy and Peter Chan because they actively use or create test objects there.
Potentially could be done in conjunction with https://github.com/sul-dlss/preservation_catalog/issues/1986
argo (MySQL) on argo-db-{qa,stage}
db/seeds.rb can clear Solr and create a single APO and agreement, but these may not be the ones that we need
/workflow/bulk/tmp/, /workflow/bulk/
redis://argo-redis-{qa,stage}-01.stanford.edu:6379/
argo_{qa,stage}
/dor/assembly/, /data/hydrus-files/, /dor/workspace
redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/
dor_services (Postgres) on dor-services-app-db-{qa,stage}-a
/sdrpurl_transfer/, /dor/
rake rabbitmq:setup when bringing DSA back up
redis://dor-services-app-redis-{qa,stage}-a.stanford.edu:6379/
rake rabbitmq:setup when bringing DIA back up
argo_{qa,stage}
/var/geoserver/local/raster/geotiff/, /var/geomdtk/current/stage/, /var/geomdtk/current/tmp/, /dor/workspace/
redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/
gb (Postgres) on localhost
localhost
h2 (Postgres) on sul-h2-db-{qa,stage}-a
db/seeds.rb creates a system user account that must be present
/data/h2-files/
rake rabbitmq:setup when bringing H2 back up
localhost
etd (Postgres) on localhost
/opt/app/etd/workspace/, /opt/app/etd/etd/shared/tmp/preview/, /opt/app/etd/etd/shared/tmp/etdSubmitWF/
localhost
preassembly (Postgres) on localhost
/dor/assembly/, /dor/preassembly/
localhost
pres (Postgres) on preservation-catalog-db-{qa,stage}-a.stanford.edu
db/seeds.rb seeds ZipEndpoint and MoabStorageRoot instances from configuration
/services-disk-stage/store-100/, /services-disk-stage/store1/, /services-disk-stage/store2/, /sdr-transfers/, /services-disk-stage/sdr-transfers/ (QA-only), /services-disk-qa/ (QA-only)
preservation-catalog-{qa,stage}-03.stanford.edu
/dor/export/, /services-disk-stage/store-100/, /services-disk-stage/store1/, /services-disk-stage/store2/
redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/
redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/
sdr (Postgres) on localhost
/globus/sdr_ingest/{qa,stage}/uploads/
localhost
suri (Postgres) on localhost
techmd (Postgres) on dor-techmd-{qa,stage}-a.stanford.edu
redis://dor-techmd-{qa,stage}-a.stanford.edu:6379/
/web-archiving-stacks/ (note: contains /web-archiving-stacks/data/indexes/cdxj, which is produced by the CDXJ indexer, and which we want to recreate)
was (Postgres) on localhost
/was_unaccessioned_data/, /web-archiving-stacks/ (these mounts are non-prod-only)
/was_unaccessioned_data/, /web-archiving-stacks/ (these mounts are non-prod-only)
redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/
workflow (Postgres) on workflow-service-db-{qa,stage}-01.stanford.edu
redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/
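The inventory above writes hosts, Solr collections, and paths with shell-style {qa,stage} groups. As a minimal sketch (the helper name expand_braces is hypothetical and not part of any SDR tooling), this is one way to turn such a pattern into the concrete per-environment values so each one can be checked off individually:

```python
import re
from itertools import product

# Matches one non-nested brace group such as {qa,stage}.
BRACE_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> list[str]:
    """Expand shell-style {a,b} groups into every concrete string.

    Hypothetical helper for illustration; nested groups are not handled.
    """
    # With one capture group, re.split alternates literal text (even
    # indexes) with the contents of each brace group (odd indexes).
    parts = BRACE_GROUP.split(pattern)
    choices = [
        part.split(",") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in product(*choices)]


# Pattern copied from the inventory above.
redis_urls = expand_braces("redis://sul-robots-redis-{qa,stage}.stanford.edu:6379/")
for url in redis_urls:
    print(url)
```

Entries without a brace group, such as /dor/workspace, come back unchanged as a one-item list, so the same helper can be applied to every line of the inventory.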
/sdrpurl_transfer/ on DSA or not? (Andrew or Arcadia)
version)? (Andrew)
Report instances in ETD QA or stage envs? (Cathy)
sul_pub use the async adapter for background jobs in QA and stage? (Peter M.)
sul_pub database should be retained in QA and stage? (Peter M.)
/web-archiving-stacks and /was_unaccessioned_data filesystems should be retained in QA and stage? (Laura, Ed, Peter C.)
/web-archiving-stacks/data/indexes/cdxj: we want to remove this and reindex stuff to recreate the index.
TBD
NOTE: This is based on the restart ordering, and has some flaws. This is a draft only.
Much of the above has been superseded by https://docs.google.com/document/d/1u1D3cFKfvhzaI6a1t7-VQTtbS27fmTXTqtfj7Esr7Rk/edit?pli=1
We did this in the workcycle that ended on October 13, 2023.
Why reset the testing environments?
The stage environment has been running continuously since 2019. The QA environment has also been running for over a year. Both have accumulated a lot of data. Both have accumulated a lot of bad data. Bad data in the testing environments is a tax on all current and future development and the tax grows every week.
We need the stage and QA environments to reasonably approximate production so that:
Some problem data is inevitable in testing environments because that's a part of testing. Either we create problem data to test the system or the function we are testing creates problem data. When the volume of problem data is small, it can either be remediated or ignored and it's not a drag on the rest of the system.
The problem with our current environments is that the volume of bad data is not small, and the cost of remediating it is also not small. Problem data
The development team, product owners, the repository manager, and anyone else who uses stage for testing see the cost of bad data on a regular basis. For the development team and the repository manager, it has become a constant weekly cost.
We are also running out of space in the stage "preservation" storage and should not keep expanding it. SDR has no simple way to delete data on a case-by-case basis without generating more errors.
Since stage is not actually a preservation system, we should be able to clear everything out and start over.
Benefits of starting over
What do we need to start over?
The SDR environment has to be seeded with certain objects to get started. (See https://github.com/sul-dlss/argo/issues/1782 for earlier discussion, though that dates from when SDR was based on Fedora.) We would need to make a full list, but in terms of data we need at least:
It would also be nice to have the Canonical objects, because they took considerable effort to create and they represent different types of objects for use in testing Purl and sul-embed behavior.
With the foundational objects in place, we can create other test data as needed. There will be a judgement call on how far to go beyond the absolute minimum necessary to have a functional testing environment.
Nice to have
It would be nice to make this a repeatable process, but that is not strictly necessary.
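One way to work toward repeatability would be to keep the per-application inventory as data and generate a checklist from it. The sketch below is only an illustration: the AppResources class and checklist function are hypothetical, the H2 values are copied from the inventory earlier in this issue, and the wording of each step ("reset", "empty") is an assumption about what the reset involves. It prints steps and executes nothing.

```python
from dataclasses import dataclass, field


@dataclass
class AppResources:
    """Resources one application owns in a single environment (hypothetical)."""

    name: str
    database: str = ""  # may contain an {env} placeholder
    filesystems: list[str] = field(default_factory=list)
    after_restart: list[str] = field(default_factory=list)


def checklist(app: AppResources, env: str) -> list[str]:
    """Return human-readable reset steps for one app; nothing is executed."""
    steps = []
    if app.database:
        steps.append(f"[{env}] {app.name}: reset database {app.database.format(env=env)}")
    for path in app.filesystems:
        steps.append(f"[{env}] {app.name}: empty {path}")
    for command in app.after_restart:
        steps.append(f"[{env}] {app.name}: run `{command}`")
    return steps


# Values taken from the H2 entries in the inventory; step wording is assumed.
H2 = AppResources(
    name="H2",
    database="h2 (Postgres) on sul-h2-db-{env}-a",
    filesystems=["/data/h2-files/"],
    after_restart=["rake rabbitmq:setup"],
)

for env in ("qa", "stage"):
    for step in checklist(H2, env):
        print(step)
```

Keeping the inventory as data means a later reset could start from the same list and only needs review for resources added since.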