Azure / spark-cdm-connector

MIT License

IMPORTANT NOTICE: Upgrade your CDM Connector version #162

Closed by kecheung 7 months ago

kecheung commented 7 months ago

The CDM library, on which this connector depends, is deprecating the CDM Schema Store. Please upgrade to the latest connector version, spark3.3-1.19.7, to avoid disruption to your workflows.

Reference: https://github.com/microsoft/CDM?tab=readme-ov-file#important-notice

Spark 3.3 Update CDM library to 1.7.3 (#159)

Changes:

  1. Update the CDM objectmodel library to version 1.7.3. Higher versions are built with Java 9 and don't work because Spark 3.3 uses Java 8.
  2. The CDM content delivery network (CDN) is being shut down. Do not rely on the entity definitions that the CDN previously provided. Create your own entity definition/schema instead; that is, don't use entityDefinitionPath or useCdmStandardModelRoot.
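
To make point 2 concrete, here is a minimal sketch of a pre-flight check that rejects write options depending on CDN-hosted entity definitions. The two flagged option names come from the notice above. The `CdmWriteOptions` object and its `validate` helper are hypothetical and not part of the connector. The `storage`, `manifestPath`, and `entity` keys and their values are placeholders assumed for illustration.

```scala
// Hypothetical helper, not part of spark-cdm-connector.
object CdmWriteOptions {
  // Option names called out in the notice as relying on CDN-provided definitions.
  val cdnDependent: Set[String] = Set("entityDefinitionPath", "useCdmStandardModelRoot")

  /** Returns the options unchanged, or an error naming the options to remove. */
  def validate(options: Map[String, String]): Either[String, Map[String, String]] = {
    val offending = options.keySet.intersect(cdnDependent).toSeq.sorted
    if (offending.isEmpty) Right(options)
    else Left("Remove CDN-dependent options: " + offending.mkString(", "))
  }

  def main(args: Array[String]): Unit = {
    // Placeholder values (assumed): account, manifest path, and entity name.
    val writeOptions = Map(
      "storage" -> "myaccount.dfs.core.windows.net",
      "manifestPath" -> "container/root/default.manifest.cdm.json",
      "entity" -> "MyEntity"
    )
    // Passes: no CDN-dependent option is present.
    println(validate(writeOptions))
    // Fails: useCdmStandardModelRoot relies on the CDN.
    println(validate(writeOptions + ("useCdmStandardModelRoot" -> "true")))
  }
}
```

If the check passes, the resulting map could be handed to the Spark writer with something like `df.write.format("com.microsoft.cdm").options(validated).save()`. The format name is assumed here, so confirm it against the connector documentation for your version.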

Bugs and Workarounds:

  1. OverridenCdmStandardsAdapter extends CdmStandardsAdapter because the StorageAdapterBase base class has a bug that breaks backwards compatibility when mounting from a config. After mounting from the config, you have to mount the new class. Future versions of the CDM library will fix this bug, and the workaround will no longer be needed.
    // Mount the adapters described in the config first.
    cdmCorpus.getStorage.mountFromConfig(config)
    // Then re-mount the "cdm" namespace with the overriding adapter.
    cdmCorpus.getStorage.mount("cdm", new OverridenCdmStandardsAdapter)
  2. Predefined entity write doesn't work with the useSubManifest option. Workaround: as mentioned in point 2 of the changes, don't use predefined entity definitions.