georchestra / geonetwork

The official GeoNetwork fork for geOrchestra
GNU General Public License v2.0

improve upgrade path from 3.8 ? #205

landryb closed 9 months ago

landryb commented 2 years ago

It seems there are many unfixed issues wrt upgrade paths to 4.0, e.g. geonetwork/core-geonetwork#6054, geonetwork/core-geonetwork#6055 and geonetwork/core-geonetwork#6058. Maybe those should be investigated upstream/merged/etc. before the 4.0.7 mentioned in https://github.com/georchestra/georchestra/issues/3713#issuecomment-1092613877

fvanderbiest commented 2 years ago

Yep, the community would welcome funding/contributions for/to these issues with great pleasure :-)

landryb commented 2 years ago

Just for reference: trying to migrate a 3.8.2 dev db (with lots of records/users) to 4.0.6 blows up badly on the 3.11 step:

2022-07-16 06:27:42,803 INFO  [geonetwork.databasemigration] -        - running tasks for 3.11.0...
2022-07-16 06:27:42,804 INFO  [geonetwork.databasemigration] -          - Java migration class:v3110.UpdateMetadataStatus
2022-07-16 06:27:46,937 ERROR [geonetwork.databasemigration] -           Errors occurs during Java migration file: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue; nested exception is java.lang.IllegalStateException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue
...
        at org.springframework.orm.jpa.EntityManagerFactoryUtils.convertJpaAccessExceptionIfPossible(EntityManagerFactoryUtils.java:371)
        at org.springframework.orm.jpa.vendor.HibernateJpaDialect.translateExceptionIfPossible(HibernateJpaDialect.java:257)
        at org.springframework.orm.jpa.JpaTransactionManager.doCommit(JpaTransactionManager.java:538)
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:743)
        at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:711)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:631)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:385)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.data.jpa.repository.support.CrudMethodMetadataPostProcessor$CrudMethodMetadataPopulatingMethodInterceptor.invoke(CrudMethodMetadataPostProcessor.java:178)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
        at com.sun.proxy.$Proxy225.save(Unknown Source)
        at v3110.UpdateMetadataStatus.updateOtherNewFields(UpdateMetadataStatus.java:251)
        at v3110.UpdateMetadataStatus.update(UpdateMetadataStatus.java:96)

then the process tries to run the sql scripts, which also fail, but on a different thing:

2022-07-16 06:27:46,950 INFO  [geonetwork.databasemigration] -          - SQL migration file:/data/tomcat/geonetwork/webapps/geocat/WEB-INF/classes/setup/sql/migrate/v3110 prefix:migrate- ...
2022-07-16 06:27:46,954 INFO  [geonetwork.databasemigration] -           Errors occurs during SQL migration file: ERROR: column "editable" of relation "settings" does not exist
  Position : 22

the second error definitely points at geonetwork/core-geonetwork#6054, as for the first one ... shrug.
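For anyone reading through a long migration log, a rough way to see which version step each failure belongs to is sketched below. It is only an illustration built on the log lines quoted in this comment: it assumes each step is introduced by a "running tasks for X.Y.Z..." line and that failures are reported with the "Errors occurs during ... migration file:" wording seen above. The function name `failures_by_step` is made up for this sketch and is not part of GeoNetwork.

```python
import re
from collections import defaultdict

# Marker that opens each migration step, e.g. "running tasks for 3.11.0...".
STEP_RE = re.compile(r"running tasks for (\d+(?:\.\d+)*)\.\.\.")
# Failure lines as worded in the logs above (the "Errors occurs" typo is in the log).
ERROR_RE = re.compile(r"Errors occurs during (Java|SQL) migration file: (.*)")


def failures_by_step(lines):
    """Group reported migration failures by the version step they follow.

    Failures logged before any step marker are stored under the key None.
    """
    failures = defaultdict(list)
    current_step = None
    for line in lines:
        step = STEP_RE.search(line)
        if step:
            current_step = step.group(1)
            continue
        error = ERROR_RE.search(line)
        if error:
            kind, message = error.groups()
            failures[current_step].append((kind, message.strip()))
    return dict(failures)


# Sample input: shortened copies of the log lines quoted in this comment.
sample = [
    "2022-07-16 06:27:42,803 INFO  [geonetwork.databasemigration] -        - running tasks for 3.11.0...",
    "2022-07-16 06:27:46,937 ERROR [geonetwork.databasemigration] -           Errors occurs during Java migration file: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance",
    '2022-07-16 06:27:46,954 INFO  [geonetwork.databasemigration] -           Errors occurs during SQL migration file: ERROR: column "editable" of relation "settings" does not exist',
]

for step, errors in failures_by_step(sample).items():
    for kind, message in errors:
        print(f"{step} [{kind}] {message}")
```

To run it on a real log, pass an open file object instead of `sample`; the log path depends on the deployment.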

landryb commented 2 years ago

fixing/working around geonetwork/core-geonetwork#6054 this way:

[db3:5432] geonetwork@geonetwork=> ALTER TABLE Settings ADD COLUMN editable VARCHAR(1) DEFAULT 'y';

a second run of the migration steps still fails on UpdateMetadataStatus:

2022-07-29 13:30:16,170 INFO  [geonetwork.databasemigration] -        - running tasks for 3.11.0...
2022-07-29 13:30:16,171 INFO  [geonetwork.databasemigration] -          - Java migration class:v3110.UpdateMetadataStatus
2022-07-29 13:30:16,245 ERROR [geonetwork.database] -   Exception while adding new id column to MetadataStatus. Error is: ERROR: column "id" of relation "metadatastatus" already exists
2022-07-29 13:30:16,247 ERROR [geonetwork.database] -   Exception while adding new uuid column to MetadataStatus. Error is: ERROR: current transaction is aborted, commands ignored until end of transaction block
2022-07-29 13:30:18,014 ERROR [geonetwork.databasemigration] -           Errors occurs during Java migration file: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue; nested exception is java.lang.IllegalStateException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue
2022-07-29 13:30:18,015 ERROR [geonetwork.databasemigration] - org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue; nested exception is java.lang.IllegalStateException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue
org.springframework.dao.InvalidDataAccessApiUsageException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue; nested exception is java.lang.IllegalStateException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing : org.fao.geonet.domain.MetadataStatus.statusValue -> org.fao.geonet.domain.StatusValue

but it goes further and is now apparently stuck (i.e. no progress, the java process has been at a constant 25% cpu for 10 minutes) on:

2022-07-29 13:30:18,220 INFO  [geonetwork.databasemigration] -        - running tasks for 4.0.0...
2022-07-29 13:30:18,220 INFO  [geonetwork.databasemigration] -          - Java migration class:v400.UpdateAllSequenceValueToMax

guess i'll turn Geonet.DB logging to debug level and retry....

landryb commented 2 years ago

oh in fact it's not "stuck", it's just iterating super slowly on a sequence, from my understanding of @fxprunayre's comment in https://github.com/geonetwork/core-geonetwork/pull/4781#issuecomment-715031505

luckily there's an alternative for psql in https://github.com/geonetwork/core-geonetwork/pull/5003#issuecomment-690188130 because with a hibernate_sequence at 185370214 I'm not going to let this iteration run for days... so i guess i'll manually run the sql to update the sequences, then drop WEB-INF/classes/v400/UpdateAllSequenceValueToMax.class
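As an illustration of what "manually run the sql to update sequences" could look like on PostgreSQL, here is a small sketch that generates one `setval` statement per table/sequence pair, so a sequence jumps to the table's max id in one call instead of being iterated value by value. This is an assumption about the general approach, not a copy of the SQL in the linked comment. The helper names `setval_statement` and `build_script` are made up, the `schematron` / `schematron_id_seq` pair is taken from the debug log further down, and any other pair has to be checked against the actual database. A shared sequence such as hibernate_sequence feeds several tables and is not handled here.

```python
def setval_statement(table, sequence, id_column="id"):
    """Build a PostgreSQL statement that moves `sequence` to the max id of `table`.

    Names are interpolated as-is, so only pass identifiers you trust.
    """
    # COALESCE keeps the statement valid for empty tables, where MAX() is NULL.
    return (
        f"SELECT setval('{sequence}', "
        f"(SELECT COALESCE(MAX({id_column}), 1) FROM {table}));"
    )


def build_script(pairs):
    """Join the statements for several (table, sequence) pairs into one script."""
    return "\n".join(setval_statement(table, seq) for table, seq in pairs)


# Pair taken from the debug log in this thread; extend after checking your schema.
pairs = [("schematron", "schematron_id_seq")]
print(build_script(pairs))
```

The generated script can then be reviewed and pasted into a psql session, ideally on a copy of the database first.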

as for the 'transient instance' on v3110/UpdateMetadataStatus i'm puzzled. Haven't found anything related in geonetwork issues.

for the other insane ones who would want to debug this, protip: since the migrations happen before the logging config is applied, no point in changing debug levels in /etc/georchestra/geonetwork/log4j/log4j.xml, rather modify WEB-INF/classes/log4j.xml in the webapp. And then finally you can get debug logs from the java migration bits:

2022-07-29 16:15:30,031 DEBUG [geonetwork.database] - UpdateAllSequenceValueToMax
2022-07-29 16:15:30,074 DEBUG [geonetwork.database] - Scanning bean org.fao.geonet.domain.MessageProducerEntity
2022-07-29 16:15:30,077 DEBUG [geonetwork.database] -   Table MessageProducerEntity does not have any data. Skipping sequence message_producer_entity_id_seq update
2022-07-29 16:15:30,077 DEBUG [geonetwork.database] - Scanning bean org.fao.geonet.domain.Schematron
2022-07-29 16:15:30,100 DEBUG [geonetwork.database] -   Max id for table schematron : 185363908. Related sequence: schematron_id_seq

fvanderbiest commented 2 years ago

for the other insane ones who would want to debug this, protip: since the migrations happen before the logging config is applied, no point in changing debug levels in /etc/georchestra/geonetwork/log4j/log4j.xml, rather modify WEB-INF/classes/log4j.xml in the webapp

Awesome tip ! Thx Landry.

landryb commented 2 years ago

another protip: beware, the java migration classes are present in multiple places, source and compiled:

note that apparently they're also referenced/called from the list in WEB-INF/config-db/database_migration.xml

landryb commented 2 years ago

another lol on the upgrade path - don't forget about https://github.com/georchestra/georchestra/issues/3545#issuecomment-983773980 - apparently this was forgotten in the upgrade/migration notes.

fvanderbiest commented 2 years ago

apparently this was forgotten in the upgrade/migration notes

https://github.com/georchestra/georchestra/tree/master/migrations/22.0#authtype ?

landryb commented 2 years ago

apparently this was forgotten in the upgrade/migration notes

https://github.com/georchestra/georchestra/tree/master/migrations/22.0#authtype ?

too many places to look for, i missed that one :)

pmauduit commented 2 years ago

another protip: beware, the java migration classes are present in multiple places, source and compiled

This should not be the case. I checked into the docker image & the generic geonetwork.war provided by packages.georchestra.org, and:

pmauduit commented 2 years ago

Stumbled upon an issue with the v3110/UpdateMetadataStatus java class today while attempting a migration. It turns out that the class grabs a DraftMetadataUtils from the application context, but this one is not fully instantiated yet and has all its members set to null. One has to force autowiring in order to get a functional object:

https://github.com/georchestra/geonetwork/blob/georchestra-gn4-4.0.6/web/src/main/webapp/WEB-INF/classes/setup/sql/migrate/v3110/UpdateMetadataStatus.java#L70 Adding the following lines:

import org.springframework.beans.factory.config.AutowireCapableBeanFactory;

[...]
    public void setContext(ApplicationContext applicationContext) {
        super.setContext(applicationContext);
        metadataUtils = applicationContext.getBean(IMetadataUtils.class);
        // The bean's members are still null at this point, so ask Spring
        // to autowire its dependencies before the migration uses it.
        AutowireCapableBeanFactory acbf = applicationContext.getAutowireCapableBeanFactory();
        acbf.autowireBean(metadataUtils);
    }

not tested at runtime yet.

pmauduit commented 2 years ago

Next errors in my migration process are the following:

2022-10-12 12:28:25,577 ERROR [geonetwork.database] -   Exception while adding new id column to MetadataStatus. Error is: ERROR: column "id" of relation "metadatastatus" already exists
2022-10-12 12:28:25,579 ERROR [geonetwork.database] -   Exception while adding new uuid column to MetadataStatus. Error is: ERROR: current transaction is aborted, commands ignored until end of transaction block
2022-10-12 12:28:33,781 ERROR [geonetwork.database] -   Exception while dropping old primary key on table MetadataStatus. Restart application and check logs for database errors.  If errors exists then may need to manually drop the primary key for this table. Error is: ERROR: syntax error at or near "primary"
  Position: 33
2022-10-12 12:28:33,789 INFO  [geonetwork.databasemigration] -          - SQL migration file:/var/lib/jetty/webapps/geonetwork/WEB-INF/classes/setup/sql/migrate/v3110 prefix:migrate- ...
2022-10-12 12:28:33,820 INFO  [geonetwork.databasemigration] -           Errors occurs during SQL migration file: ERROR: column "editable" of relation "settings" does not exist
  Position: 22
2022-10-12 12:28:33,821 ERROR [geonetwork.databasemigration] - ERROR: column "editable" of relation "settings" does not exist
  Position: 22
org.postgresql.util.PSQLException: ERROR: column "editable" of relation "settings" does not exist
  Position: 22

landryb commented 2 years ago

the editable thing is filed upstream as geonetwork/core-geonetwork#6054

pmauduit commented 2 years ago

When I load the GN UI, I can see that it did not start for the following reason, which is unrelated to the previous migration errors:

OperationAbortedEx : Failed whilst adding the schema information. Exception message if any is could not execute batch; SQL [insert into schematron (displayPriority, filename, schemaName, id) values (?, ?, ?, ?)]; constraint [schematron_pkey]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute batch
    at org.fao.geonet.kernel.SchemaManager.processSchema(SchemaManager.java:1312)
[...]
Caused by: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "schematron_pkey"
  Detail: Key (id)=(100) already exists.

Weirdly, when debugging, I had to relaunch the service several times, and it turned out that one of the relaunches actually succeeded, even though I hadn't changed anything.

pmauduit commented 2 years ago

I had to try a migration from 3.8 to 4.0.6, and stumbled upon the same issues as the ones described above, plus one that I had already hit but not documented here, with passwords encrypted using jasypt. One workaround is the following:

update settings set encrypted='n' where encrypted='y';

The related upstream issue is https://github.com/geonetwork/core-geonetwork/issues/5863.

juanluisrp commented 1 year ago

We have added this for dealing with settings already present in GN 3: https://github.com/geonetwork/core-geonetwork/pull/6614. However, it is not in the 4.0.x branch, only in 4.2.x.

And this one improves the speed of the sequences migration, at least for postgres: https://github.com/geonetwork/core-geonetwork/pull/6678

landryb commented 9 months ago

was somehow fixed/improved in #211 and the upgrade path documented a bit in georchestra/georchestra#3978