localgovdrupal / localgov_geo

Entity for storing geographic information, addresses, and areas; extensible types and pluggable backends with defaults to get started.

Data loss and config can't be deployed after running update #91

Closed: andybroomfield closed this issue 1 year ago

andybroomfield commented 1 year ago

This was brought up at the technical drop-in by @Adnan-cds.

On running the update for localgov_geo_update to create the geo_entities, the resulting config is exported to deploy to a staging site. On the staging site the database update also runs and creates the geo_entities, but these and their fields will have different UUIDs. After the database update, drush cim imports the configuration, but this now fails: because of the different UUIDs, the config import tries to delete the existing geo_entities, and that fails as there are already migrated geo entities from the database update ("Existing content cannot be deleted").

I have now managed to replicate the issue and confirm it.

We can get around this in one of two ways.

I was able to work around the issue by adding:

      // Useful where config file for this new field storage has been already
      // exported after database update in a *development* site.
      if ($existing_geo_entity_storage_config = \Drupal::service('config.storage.sync')->read("geo_entity.geo_entity_type.{$geo_type_values['id']}")) {
        $geo_type_values['uuid'] = $existing_geo_entity_storage_config['uuid'] ?? '';
      }

This covers custom geo entity types, but won't cover address or area, as those are created by installing geo_entity_address / geo_entity_area. We could, though, delete those entities before migrating, on the assumption that a geo address will then be recreated.
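Something like this rough, untested sketch is one reading of the delete-before-migrating part (assuming the content entity type ID is geo_entity and its bundle key is bundle; both would need checking against the actual entity definitions):

    // Hypothetical pre-migration cleanup: remove the address geo entities
    // created by installing geo_entity_address, so the later config import
    // does not complain about existing content. The entity type ID and
    // bundle key here are assumptions.
    $storage = \Drupal::entityTypeManager()->getStorage('geo_entity');
    $ids = $storage->getQuery()
      ->condition('bundle', 'address')
      ->accessCheck(FALSE)
      ->execute();
    if ($ids) {
      $storage->delete($storage->loadMultiple($ids));
    }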

Edited for clarity on which step the import is failing.

Adnan-cds commented 1 year ago

Thanks a lot Andy. I couldn't have written the problem description better myself.

finnlewis commented 1 year ago

Thanks @andybroomfield and @Adnan-cds. Copying @ekes who wasn't at the Tech Drop-in when Adnan raised this.

ekes commented 1 year ago

Just to be clear in my head.

We are running updb on dev, then cex and committing to git. Then checking out on stage and running updb first, then cim. But updb on the second run is creating config that already exists from the cex on dev (with a different UUID).

And Andy's fix will check if there is config waiting to be imported by cim and, if so, apply its UUID to the generated config?

ekes commented 1 year ago

8002 and 8003, and I think 8005 and 8807, will run fine if a cim has been done before the updb (not 8806, but that wants fixing anyway just in case someone, however unlikely, has already deleted the library). Don't know how to make that order fit other things, but could that also be a solution?

ekes commented 1 year ago

> Don't know how to make that order fit other things, but could that also be a solution?

Sorry, this is a stream of consciousness. Taking Andy's inspiration, we could import the config we're bothered about from disk, if it's there, when starting 8001?

It's a lot of magic, but the intention of having this all in update hooks in a module was pretty much to make it magic, and to take steps of work away from documentation and automate them instead.

andybroomfield commented 1 year ago

Thanks for looking @ekes. This occurs when using drush deploy or similar, which runs drush cr; drush updb -y; drush cim -y. Acquia BLT deploy hooks do it in a similar way, updating the database and then importing config, and I'd imagine most deployment scripts are set up the same way.

ekes commented 1 year ago

updb then cim is the normal order; I just can't think of a good way round it here where we make the config, though your approach of copying the UUID if it's on disk is an option.

How would your deploy processes cope if, the first time you did a deploy, the updates failed, and continued to do so until a drush cim had been run? If I recall, Drupal will continue with other modules' updates and just leave the failing module's updates pending? I think? Then the normal order of things would be supported (as you normally do updb then cim). It wouldn't be magic, but it would use the configuration previously created (by running the update on another environment) rather than making new config. This would also cover the configuration made by installing the geo entity module, because enabling the module and importing its config would happen in the drush cim before 8001 runs successfully.
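To make that concrete, a guard like this at the top of the 8001 update hook could be what does the deliberate failing (an untested sketch; the config name and message are illustrative):

    // Hypothetical guard for the start of the 8001 update hook: if the geo
    // entity config is already sitting in the sync directory but has not
    // been imported yet, fail the update so the deploy re-runs it after
    // drush cim has been run.
    $sync = \Drupal::service('config.storage.sync');
    $active = \Drupal::service('config.storage');
    if ($sync->exists('geo_entity.geo_entity_type.address')
      && !$active->exists('geo_entity.geo_entity_type.address')) {
      throw new \Drupal\Core\Utility\UpdateException('Geo entity config found in the sync directory but not yet imported. Run drush cim first, then re-run drush updb.');
    }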

andybroomfield commented 1 year ago

I think drush updb will continue running updates, allowing us to run them again. Though I'd have to check; I think Acquia BLT is a bit harsher, stopping the database updates once it reaches an error. Not sure how on stage / prod we would make the update deliberately fail. I like the solution that if the config exists on disk, import that, either by assuming it's running on a stage site or by placing it behind a setting / environment variable.

ekes commented 1 year ago

> I like the solution that if the config exists on disk, import that, either by assuming it's running on a stage site or by placing it behind a setting / environment variable.

I think I could import the config; it would just also require enabling the modules without importing their default config. Then I think it could run: for each step, check the sync config and import it if it's there, rather than generating new config.
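Roughly, I mean something along these lines for each piece of config, reusing the pattern from Andy's snippet above (an untested sketch; $geo_type_values is the array of generated values as in that snippet, and the rest is assumption):

    // Prefer the definition already exported to the sync directory, if any,
    // so the UUID and all other keys match what drush cim will later import.
    $config_name = "geo_entity.geo_entity_type.{$geo_type_values['id']}";
    $sync_values = \Drupal::service('config.storage.sync')->read($config_name);
    if ($sync_values) {
      unset($sync_values['_core']);
      $geo_type_values = $sync_values;
    }
    \Drupal::entityTypeManager()
      ->getStorage('geo_entity_type')
      ->create($geo_type_values)
      ->save();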

ekes commented 1 year ago

I realize the other answer to this question, which is what I now remember from other modules, including the upgrade I just did from Group 1.x to 2.x: you have to run the upgrade once for the site, not multiple times, which means you have to run it on (a copy of) prod.

But let's see if we can get the import sorted. Just got a crappy cold, but will hopefully feel better and look tomorrow.

finnlewis commented 1 year ago

Just to confirm the behaviour and help test solutions, I'm testing locally with the Greenwich GCD Directory, which is currently D9 and needs to be D10, but is not yet live.

I prepared for the D10 update, then

  1. update to Drupal 10
  2. run drush updb
  3. export the config
  4. re-import the original db (to mimic deploying to staging)
  5. run drush updb

I get the following error:

 [error]  Drupal\Core\Config\ConfigImporterException: There were errors validating the config synchronization.
Entities exist of type "Geo" and "Geo type" "Address". These entities need to be deleted before importing. in Drupal\Core\Config\ConfigImporter->validate() (line 814 of /var/www/html/web/core/lib/Drupal/Core/Config/ConfigImporter.php).

In ConfigImportCommands.php line 290:

  The import failed due to the following reasons:
  Entities exist of type "Geo" and "Geo type" "Address". These entities need to be deleted before importing.

I'll test the merge request next https://github.com/localgovdrupal/localgov_geo/pull/90

ekes commented 1 year ago

So the simplest thing here (balancing both writing the code and the various different ways people are going to deploy this) feels like: detect in 8001 whether the modules are to be enabled from the sync directory, assume that this is a dev-to-staging situation, set a (state) flag so all the other updates are just skipped, send a message each time saying that you should run drush cim, and then include the deploy hook (and optionally a drush command) to migrate the content over?
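As a sketch of that state-flag version (names, especially the state key, are just illustrative):

    // In 8001: if the exported geo entity config is waiting in the sync
    // directory, assume this is the staging case, set a flag and stop.
    if (\Drupal::service('config.storage.sync')->exists('geo_entity.geo_entity_type.address')) {
      \Drupal::state()->set('localgov_geo_update.defer_to_config_import', TRUE);
      return t('Geo entity config found in the sync directory; skipping updates. Run drush cim, then run the content migration deploy hook.');
    }

    // ...and at the top of each later update, skip if the flag is set.
    if (\Drupal::state()->get('localgov_geo_update.defer_to_config_import')) {
      return t('Skipped: waiting for drush cim to import the geo entity config.');
    }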

More interesting, and more config magic but also more work (and I guess then liable to have more things that need to be tested), would be to import each of the bits of config (including those created by installing the modules).

andybroomfield commented 1 year ago

@finnlewis That might be a different issue, as when you import your DB you don't erase the geo entities tables. Instead, truncate all tables and import your old DB. The failure will happen when you do drush cim.

finnlewis commented 1 year ago

Discussing this in Merge Monday:

@ekes is suggesting adding a flag to either run the update hooks or not.

For example:

@andybroomfield mentioned that, for the content migrations in 8004, we add these as drush deploy hooks.
@Adnan-cds: sounds fine @andybroomfield, sounds like the simplest way forward.
@millnut: seems to make sense... at some stages, rather than using state, could we just check the config files again?
@alexburrows: sounds good.

@ekes is going to take this forward and hoping for a pull request later today! Thanks @ekes !
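For reference, the deploy-hook shape being discussed would look roughly like this (file and function names are hypothetical, not the eventual implementation; drush deploy runs these hooks after config import):

    // In a hypothetical localgov_geo.deploy.php.
    function localgov_geo_deploy_migrate_geo_entities(&$sandbox) {
      // Runs after drush cim in drush deploy, so the geo entity types and
      // fields imported from sync already exist; only content needs
      // migrating here. The actual migration logic would replace this
      // placeholder message.
      return t('Migrated localgov_geo content to geo_entity.');
    }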

Adnan-cds commented 1 year ago

Figured out that the address data corruption during localgov_geo to geo_entity migration is caused by the geocoder_field submodule. I remember we had to ask it not to geocode during entity creation when we migrated from our previous Drupal 8 site to Localgov Drupal 8. Steve at the time kindly created a wiki page (see the last line) with the right instructions. Sample code for the deploy hook.
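(Not the wiki instructions or the linked sample, but the rough shape, as I understand it, is to switch off geocoding on the field before migrating. In this sketch the field name, bundle and the geocoder_field setting key/value are all assumptions, so check an exported field config for the exact keys.)

    // Hypothetical: turn off geocoding on the address geo type's location
    // field before the migration runs, so geocoder_field does not overwrite
    // the migrated points. Field/bundle names and the 'method' => 'none'
    // third-party setting are assumptions.
    $field = \Drupal\field\Entity\FieldConfig::loadByName('geo_entity', 'address', 'location');
    if ($field) {
      $field->setThirdPartySetting('geocoder_field', 'method', 'none');
      $field->save();
    }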

andybroomfield commented 1 year ago

I believe this issue is now resolved and can be closed?

andybroomfield commented 1 year ago

@Adnan-cds I assume it's the geocoder geofield. Strangely, we have never enabled it at BHCC, so I was wondering what the issue was.

[Screenshot: 2023-10-23 at 12:20:15 pm]

I saw that it's required by geo_entity_area, but in a default address install, it's set to not geocode.

[Screenshot: 2023-10-23 at 12:20:24 pm]

Maybe we should raise this as a Geo entity issue, but I don't think we should be geocoding by default unless a user wishes to; the address form already provides geocoding to the location and can reverse it too.

Adnan-cds commented 1 year ago

> but in a default address install, it's set to not geocode.

In that case this is a non-issue. We have it enabled in our site for some reason. Perhaps I should disable it.

finnlewis commented 1 year ago

Discussed in Merge Monday; closing this.

Note:

The "Geocode method" setting should be set to "No geocoding".

If you have it set to "Geocode from an existing field", it overrides your point on save, which is not usually what you want.