tripal / tripal_doc

Official Documentation for the Tripal Platform
https://tripaldoc.readthedocs.io/en/latest/
GNU General Public License v3.0

Notes on how to update an existing Tripal site to Tripal4 using the move chado approach. #82

Closed: laceysanderson closed this issue 2 months ago

laceysanderson commented 5 months ago

Pull Request #88

Currently, my approach to upgrading KnowPulse to Tripal 4 involves the following steps:

  1. Create a fresh Tripal 4 site, but give its chado schema a name that differs from the one used by your existing site, such as tempchado or teacup.

  2. Use pg_dump to export the chado schema from your existing Tripal 3 site.

    pg_dump CONNECTION_INFORMATION --schema="chado" \
    --format=plain --no-owner --no-privileges --compress=9 \
    DATABASENAME > chado.sql.gz

  3. Apply the chado database dump to your new Tripal 4 site.

    gunzip -c chado.sql.gz | psql CONNECTION_INFORMATION TRIPAL4_DATABASE

  4. Check that the cvterms in your imported chado match what Tripal 4 expects. You can do this with the drush trp-check-terms --chado_schema=chado command, which is currently in PR https://github.com/tripal/tripal/pull/1895.

  5. Once that command reports no errors with your cvterm setup, prepare your chado instance by going to TRIPAL4-SITE/admin/tripal/storage/chado/prepare.

  6. Now go into your Tripal 4 site and set the newly imported and prepared chado to be your default chado. Go to http://TRIPAL4-WEBSITE/admin/tripal/storage/chado/manager and click the "Add to Tripal" button and then the "Set Default" button. Optionally you can drop the temporary Chado schema at this point.

  7. If this works, import content types and find fields so that you can start configuring your content types :-)

  8. See this comment https://github.com/tripal/tripal_doc/issues/82#issuecomment-2178994078 if you want to reserve your existing entity id range.

  9. Publish all of your content.

The plan is to add a command in the future that helps pull over URL aliases from your Drupal 7 site for existing pages.
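
Steps 2 and 3 above could be scripted roughly as follows. This is only a sketch under some assumptions: the database names tripal3_db and tripal4_db are hypothetical placeholders, connection details are expected to come from the standard libpq environment variables (PGHOST, PGPORT, PGUSER), and the run helper with its DRY_RUN switch is invented here for illustration. Two small additions to the commands above are gzip -t, which checks the archive before loading, and psql's ON_ERROR_STOP variable, which makes the load stop at the first error instead of continuing.

```shell
set -eu

# Hypothetical placeholder names; replace with your own values.
T3_DB="${T3_DB:-tripal3_db}"            # existing Tripal 3 database
T4_DB="${T4_DB:-tripal4_db}"            # new Tripal 4 database
DUMP_FILE="${DUMP_FILE:-chado.sql.gz}"  # compressed plain-text dump
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands

# Print a command line, and execute it only when DRY_RUN is not 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    sh -c "$*"
  fi
}

# Step 2: export only the chado schema from the Tripal 3 database.
dump_chado() {
  run "pg_dump --schema=chado --format=plain --no-owner --no-privileges --compress=9 $T3_DB > $DUMP_FILE"
}

# Step 3: verify the archive, then load it and stop on the first SQL error.
load_chado() {
  run "gzip -t $DUMP_FILE"
  run "gunzip -c $DUMP_FILE | psql -v ON_ERROR_STOP=1 $T4_DB"
}

# Usage: review the printed commands first, then rerun with DRY_RUN=0.
#   dump_chado
#   load_chado
```

By default nothing is executed; the functions only print what they would run, so the commands can be reviewed before setting DRY_RUN=0.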

dsenalik commented 5 months ago

Regarding step 3, if you encounter this error:

...
CREATE INDEX
CREATE INDEX
CREATE INDEX
ERROR: data type bigint has no default operator class for access method "gist"
HINT: You must specify an operator class for the index or define a default operator class for the data type.

You can fix it by running this command before loading:

sitedb=> CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE EXTENSION

No actual harm is done if the extension already exists:

sitedb=> CREATE EXTENSION IF NOT EXISTS btree_gist;
NOTICE:  extension "btree_gist" already exists, skipping
CREATE EXTENSION
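
The same fix can be applied non-interactively before the load in step 3. The following is a minimal sketch, assuming the database is called sitedb as in the prompt above; the ensure_btree_gist function and the DRY_RUN switch are hypothetical names used only for illustration. Depending on your PostgreSQL version and setup, creating the extension may require elevated privileges.

```shell
set -eu

T4_DB="${T4_DB:-sitedb}"   # database that will receive the chado dump
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the command

# Create the btree_gist extension if it is missing (a no-op otherwise).
ensure_btree_gist() {
  cmd="psql -v ON_ERROR_STOP=1 -c 'CREATE EXTENSION IF NOT EXISTS btree_gist;' $T4_DB"
  echo "+ $cmd"
  if [ "$DRY_RUN" != "1" ]; then
    sh -c "$cmd"
  fi
}

# Usage: call this before piping the dump into psql.
#   ensure_btree_gist
```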

dsenalik commented 5 months ago

I have a few existing entities that are linked to by external sources. I will probably want to keep the block of existing entity_id values unused, so that those references can be restored later without the ids having been taken by something else. To change the first available entity_id, I am testing this:

  1. Get the next unused entity id on your Tripal 3 site with SELECT NEXTVAL('tripal_entity_id_seq'); (note that calling NEXTVAL also advances the sequence by one).
    nextval 
    ---------
    123456
    (1 row)
  2. On your Tripal 4 site, set it with ALTER SEQUENCE tripal_entity_id_seq RESTART 123456;
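
The two steps above could be chained in a small script. This is a sketch, not a tested procedure: restart_sequence_sql is a hypothetical helper, and TRIPAL3_DB and TRIPAL4_DB in the usage comment are placeholders. The helper only builds the ALTER SEQUENCE statement and refuses anything that is not a plain integer, so that a stray value cannot end up in the SQL.

```shell
set -eu

SEQ="tripal_entity_id_seq"

# Print the statement that makes the Tripal 4 sequence start at id $1.
# Rejects input that contains anything other than digits.
restart_sequence_sql() {
  case "$1" in
    ''|*[!0-9]*)
      echo "error: expected digits only, got '$1'" >&2
      return 1
      ;;
  esac
  echo "ALTER SEQUENCE $SEQ RESTART WITH $1;"
}

# Usage (placeholders TRIPAL3_DB / TRIPAL4_DB):
#   next_id="$(psql -At -c "SELECT nextval('tripal_entity_id_seq');" TRIPAL3_DB)"
#   restart_sequence_sql "$next_id" | psql -v ON_ERROR_STOP=1 TRIPAL4_DB
```

The psql flags -A and -t print the bare value without column headers or row counts, which makes the result easy to capture in a shell variable.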

laceysanderson commented 4 months ago

@pdtouch, this is the issue I was referring to. You can use the same approach described here to move your chado instance from Tripal 2 to Tripal 4. Just make sure your chado is version 1.3. It can have extra tables and columns, but the tables and columns that are part of chado 1.3 must match the chado spec exactly.

Ferrisx4 commented 4 months ago

After running the trp-check-terms drush command and fixing the various issues that it suggested (either automatically through its functionality or manually), I tried the prepare step and got the following errors (sequentially):

SQLSTATE[23505]: Unique violation: 7 ERROR:  duplicate key value violates 
unique constraint "cvterm_c1"
DETAIL:  Key (name, cv_id, is_obsolete)=(is a, 24, 0) already exists.

This error repeated itself every time I tried the prepare step. Each time, I'd manually go in, set the offending row to is_obsolete=1, and try again. After a few rounds I noticed it only happened with the synonym_type and tripal_contact CVs.

I don't know if this is a case that should be caught by the trp-check-terms command or if the Chado prepare step needs to be modified so it skips existing entries.
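
To see in advance which rows the prepare step might collide with, one option is to list the existing terms in the two affected CVs before running it. The sketch below only generates a read-only query; cvterm_listing_sql is a hypothetical helper name, the schema is assumed to be called chado, and TRIPAL4_DATABASE is a placeholder. It does not decide which rows should be marked obsolete.

```shell
set -eu

CHADO_SCHEMA="${CHADO_SCHEMA:-chado}"   # assumed schema name

# Print a query listing every cvterm in the two CVs named in the report,
# including the columns that make up the cvterm_c1 unique constraint.
cvterm_listing_sql() {
  cat <<SQL
SELECT cvt.cvterm_id, cv.name AS cv_name, cvt.name, cvt.cv_id, cvt.is_obsolete
FROM ${CHADO_SCHEMA}.cvterm cvt
JOIN ${CHADO_SCHEMA}.cv cv ON cv.cv_id = cvt.cv_id
WHERE cv.name IN ('synonym_type', 'tripal_contact')
ORDER BY cv.name, cvt.name, cvt.is_obsolete;
SQL
}

# Usage (placeholder TRIPAL4_DATABASE):
#   cvterm_listing_sql | psql -v ON_ERROR_STOP=1 TRIPAL4_DATABASE
```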