mattyschell / geodatabase-replication-toiler

Creative Commons Zero v1.0 Universal
1 stars 0 forks source link

Resolve bad addresspoint feature class replica synchronization #10

Closed mattyschell closed 1 year ago

mattyschell commented 1 year ago

When replicating a full Enterprise Geodabase using the best practices described here we see (at least) one feature class that is not synchronizing correctly.

The consistent pattern is:

  1. OK: Add several addresspoints and synchronize parent to child
  2. OK: Delete several addresspoints and synchronize parent to child
  3. BAD: Add and address point and delete several addresspoints and synchronize parent to child

In case 3 the added point is synchronized in the child replica but the deletes are not.

mattyschell commented 1 year ago

Green is parent, brown is child. Before: image

Duplicate westernmost addresspoint 3237588 (they will be stacked on top of each other) Delete the two to the east, 3237585 and 3237587

image

After synchronization image

mattyschell commented 1 year ago

Here are the lines from the replica log for a delete that is not replicating to the child. The ESRI replica log suggests that it is deleting successfully.

(todo: compare these 5 lines for a bad delete to the lines for a good delete)

ReplicaLog-snip.txt

mattyschell commented 1 year ago

Here is a snip from the PostgreSQL log with postgresql.conf

log_statement = 'all' postgresql-2023-01-26_000000-snip.log

This snip is from the first to last mention of {98265EE6-53B9-4B47-A1BC-1243B7C65091} the global id being deleted

mattyschell commented 1 year ago

Here is the SQL that ESRI is sending to ID the row to be deleted with bind variables replaced. It looks good

select V1595.objectid, V1595.segmentid, V1595.addresspointid, V1595.complexid, V1595.house_number_suffix, V1595.hyphen_type, V1595.sosindicator, V1595.special_condition, V1595.address_source, V__1595.address_status, V1595.validation, V1595.boroughcode, V1595.zipcode, V1595.collectionmethod, V1595.created_by, V1595.created_date, V__1595.modified_by, V1595.modified_date, V1595.b7sc_actual, V__1595.b7sc_vanity, V1595.a4id, V1595.house_number, V1595.house_number_range, V1595.house_number_range_suffix, V1595.bin, V1595.globalid, V1595.doitt_exception_flag, V1595.remotebldg, V1595.subaddress_flag, st_asewkb(ST_setSRID(V1595.shape, -1)) as shape, V1595.GDB_GEOMATTR_DATA from ( select / ArcSDE NORMAL_FILTER / b.objectid, b.segmentid, b.addresspointid, b.complexid, b.house_number_suffix, b.hyphen_type, b.sosindicator, b.special_condition, b.address_source, b.address_status, b.validation, b.boroughcode, b.zipcode, b.collectionmethod, b.created_by, b.created_date, b.modified_by, b.modified_date, b.b7sc_actual, b.b7sc_vanity, b.a4id, b.house_number, b.house_number_range, b.house_number_range_suffix, b.bin, b.globalid, b.doitt_exception_flag, b.remotebldg, b.subaddress_flag, b.shape , b.GDB_GEOMATTR_DATA from pscscl.cscl.addresspoint b where not exists ( select from ( select SDE_DELETES_ROW_ID, SDE_STATE_ID from cscl.d1595, sde.sde_state_lineages l where SDE_STATE_ID = 0 and DELETED_AT = l.lineage_id and l.lineage_name = '9' and l.lineage_id <= '23' ) d where b.objectid = d.SDE_DELETES_ROW_ID ) union all select a.objectid, a.segmentid, a.addresspointid, a.complexid, a.house_number_suffix, a.hyphen_type, a.sosindicator, a.special_condition, a.address_source, a.address_status, a.validation, a.boroughcode, a.zipcode, a.collectionmethod, a.created_by, a.created_date, a.modified_by, a.modified_date, a.b7sc_actual, a.b7sc_vanity, a.a4id, a.house_number, a.house_number_range, a.house_number_range_suffix, a.bin, a.globalid, a.doitt_exception_flag, a.remotebldg, a.subaddress_flag, a.shape , a.GDB_GEOMATTR_DATA from cscl.a1595 a inner join sde.sde_state_lineages l on a.SDE_STATE_ID = l.lineage_id where not exists ( select from ( select SDE_DELETES_ROW_ID, SDE_STATE_ID from cscl.d1595, sde.sde_state_lineages l where DELETED_AT = l.lineage_id and l.lineage_name = '9' and l.lineage_id <= '23' ) d where (a.objectid = d.SDE_DELETES_ROW_ID) and (a.SDE_STATE_ID = d.SDE_STATE_ID) ) and (l.lineage_name ='9' and l.lineage_id <= '23') ) V__1595 where (globalid in('{98265EE6-53B9-4B47-A1BC-1243B7C65091}'))

Returns objectid|segmentid|addresspointid|complexid|house_number_suffix|hyphen_type|sosindicator|special_condition|address_source|address_status|validation|boroughcode|zipcode|collectionmethod|created_by|created_date |modified_by|modified_date |b7sc_actual|b7sc_vanity|a4id|house_number|house_number_range|house_number_range_suffix|bin |globalid |doitt_exception_flag|remotebldg|subaddress_flag|shape |gdb_geomattr_data --------+---------+--------------+---------+-------------------+-----------+------------+-----------------+--------------+--------------+----------+-----------+-------+----------------+----------+-----------------------+-----------+-----------------------+-----------+-----------+----+------------+------------------+-------------------------+-------+--------------------------------------+--------------------+----------+---------------+---------------------+----------------- 1009731| 19283| 3247691| | |N |1 | |1 | |2 |3 |11223 |F |CSCL |2009-02-13 00:00:00.000|csarkissian|2015-06-02 10:22:58.000|39113001 | | |2565 | | |3195638|{98265EE6-53B9-4B47-A1BC-1243B7C65091}| | |N | 0.A ø? A|

mattyschell commented 1 year ago

Here's the line from the PosgreSQL log where the delete should be inserted into the deletes table

2023-01-26 10:27:52.639 EST [3016] LOG: execute sde_1674746872_14235_162: INSERT INTO cscl.d1595 (SDE_DELETES_ROW_ID,SDE_STATE_ID,DELETED_AT) VALUES ($1,$2,$3) 2023-01-26 10:27:52.639 EST [3016] DETAIL: parameters: $1 = '1009731', $2 = '0', $3 = '23'

That record does not hit the deletes table

image

mattyschell commented 1 year ago

From ESRI support:

I do have one concern noted below regarding version compatibility.

Esri Comments: • It looks like our versions should all be compatible expect for the PostGIS version. It looks like we are on PostGIS 2.5. • Since our Postgres EGDB is 10.8.0.2.5, this would require us to use PostGIS 3.0.1. I am providing our documentation below for our reference. o I am reviewing the subsection "ArcGIS 10.8 and Pro 2.5." System Requirements/Postgres and ArcGIS: • https://desktop.arcgis.com/en/system-requirements/latest/database-requirements-postgresql.htm Questions: • Is there any reason we are on PostGIS 2.5.1? • Do we have an instance where PostGIS 3.0.1 is installed? If so, are we able to replicate and synchronize the data within this environment successfully? I think getting us at a supported version of PostGIS may need to be our first step moving forward. Please let me know if we still want to hop on a call today as I provided my zoom details below.

mattyschell commented 1 year ago

• Is there any reason we are on PostGIS 2.5.1? We are on PostGIS 2.5.1 because in our production environment the child geodatabase is on Azure Database for PostgreSQL – Single Server, PostgreSQL version 11. The postgis extension available on that platform is 2.5.1
https://learn.microsoft.com/en-us/azure/postgresql/single-server/concepts-extensions#postgres-11-extensions • Do we have an instance where PostGIS 3.0.1 is installed? If so, are we able to replicate and synchronize the data within this environment successfully? Not currently. If setting up another sandbox environment is the only way to move forward I think we would need to install a different Enterprise Geodatabase version instead of a different PostGIS version.

We set up an identical environment to the one above at the top of this issue. The only difference is enterprise geodatabase 10.7.1.

mattyschell commented 1 year ago

Same results with an Enterprise Geodatabase at 10.7.1

  1. We will add 2 addresspoints and synchronize parent to child
  2. We will delete 2 addresspoints and synchronize parent to child
  3. We will and 1 address point and delete 2 addresspoints and synchronize parent to child

Before:

image

After:

image

mattyschell commented 1 year ago

Attribute synchronizations seem to be OK.

Question from ESRI Support:

I do not see this behavior with simple attribute changes.

mattyschell commented 1 year ago

Suggestion from ESRI support

I have thought about us attempting to rebuild the spatial indexes of the data if we have not done so already.

https://desktop.arcgis.com/en/arcmap/latest/manage-data/geodatabases/modifying-a-spatial-index.htm

mattyschell commented 1 year ago

From the AddressPoint feature class properties on the child PostgreSQL database in ArcMap 10.7.1's catalog pane I selected layer properties and the index tab. Then "Delete" apply and "Create" apply.

This did not change replication to the child. Adding a point before deleting still results in the deletes not being replicated.

image

mattyschell commented 1 year ago

ESRI support has requested a full database export of the parent Oracle database and child postgres database. The Oracle DBAs in our shop do not allow me to use datapump so this step requires a ticket and, probably, waiting a while.

mattyschell commented 1 year ago

We submitted full database exports to ESRI support on Tuesday Feb 21st.

mattyschell commented 1 year ago

ESRI support reported on Feb 27 that they successfully imported both databases and after stripping the "ArcObject components from both geodatabases" the behavior "is not reproduced."

mattyschell commented 1 year ago

A colleague answered some questions about the cscl class extensions arcobjects code in the source parent geodatabase.

mattyschell commented 1 year ago

ESRI support reports that with the custom arcobjects class extensions the bad behavior reproduces. Without the arcobjects class extensions all ESRI geodatabase and ESRI replication software works as expected. So there is not much more that ESRI support can do.

When considering next steps

mattyschell commented 1 year ago

Let's take stock of the three most obvious workarounds.

  1. Abandon PostgreSQL in Azure

The downstream user of the child replica in this case can also connect to the parent on-premise geodatabase.

Pros:

Cons:

  1. Abandon ESRI replication

Develop a procedure to daisy chain the parent geodatabase to the target PostgreSQL database without ESRI replication. I think the easiest path would be to load a CSCL file geodatabase to a local PostgreSQL and pg_dump the entire thing to Azure.

Pros

Cons

  1. Use FME change detection instead of ESRI replication

I do not know enough about the peculiarities of reading the source to go into details on this. But my understanding is that FME must use some sort of 32-bit desktop reader in order to access the source where the class extensions are installed and geometries are stored as SDELOB.

mrahman-doitt commented 1 year ago
  1. Option to "connect to the parent on-premise geodatabase" may not be idea/work as this DB has 32-bit Class Extensions and copying data to a FGDB also copy the Class Extensions.
  2. May work but in addition to what you mentioned as Cons, only likes of Matt Schell has skills to manage this.
  3. no one in OTI team is competence enough to pull this off now.
  4. May I offer option 4? Switch to SQL Server!
mattyschell commented 1 year ago

Revised options, at a glance in slightly different order:

  1. Consumer connects directly to the on-premise live enterprise database.

This option is almost impossible to implement with the compatible technology available to us.

  1. FME change detection replacing ESRI replication.

Similar to 1 this option is almost impossible to implement with the available technology and skillsets.

  1. Daisy chain with minimal ESRI: from live database to scratch postgis database to dump file loaded to Azure

This option is possible but fairly risky in terms of both implementation time and maintenance.

  1. Daisy chain with maximum ESRI: Replicate to a file geodatabase without class extensions and replicate from the file geodatabase to PostgreSQL in Azure.

This one looks to be in the sweet spot with fairly low effort ultimately getting us to the same result, though we should note that we have not tested it.

  1. Plunk a stupid simple file geodatabase somewhere in Azure

This one is stupid simple but not very enterprisey.

  1. Switch to SQLServer in Azure

This one is technically possible but requires coordination with several teams and is therefore risky from an implementation perspective.

mattyschell commented 1 year ago

replication-replacement-pics.pdf

mattyschell commented 1 year ago

We agreed with ESRI to close the case on March 7th 2023.