az-digital / az_quickstart

UArizona's web content management system built with Drupal 10.
https://quickstart.arizona.edu
GNU General Public License v2.0

Text with certain special characters is improperly migrated #1534

Closed joshuasosa closed 2 years ago

joshuasosa commented 2 years ago

Problem/Motivation

UAQS pages may have text that contains special characters, some of which are imported improperly.

Describe the bug

The particular case I found is a UAQS Flexible Page with a Text area paragraph that contains a registered mark ® (e.g., Spacewatch®). It gets migrated to an AZQS Page with a Text paragraph that replaces that registered mark with ® (e.g., Spacewatch®).

In comparison, basic pages or other custom content types that were created in Drupal 7 and migrated using Drupal 9's built-in migration tools get imported with correct text.

joshuasosa commented 2 years ago

More examples:

Long example:

Voigt JRC†, CW Hamilton, G Steinbrügge, Á Höskuldsson, I Jónsdottir, and T Thordarson (accepted) Linking lava morphologies to effusion rates for the 2014–2015 Holuhraun lava flow-field, Iceland, Geology

became:

Voigt JRC†, CW Hamilton, G Steinbrügge, Á Höskuldsson, I Jónsdottir, and T Thordarson (accepted) Linking lava morphologies to effusion rates for the 2014–2015 Holuhraun lava flow-field, Iceland, Geology

Seems like the migration script is currently not multibyte safe.
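For illustration only, here is a minimal Python sketch of one common way a multibyte-unsafe step corrupts text: UTF-8 bytes get read back as a single-byte encoding (Latin-1 here). This is an assumption about the general class of bug, not a diagnosis of the AZQS migration code, and the function names `decode_wrong` and `repair` are hypothetical.

```python
def decode_wrong(text: str) -> str:
    """Simulate a multibyte-unsafe step: UTF-8 bytes misread as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo that specific misreading (only valid for this exact failure mode)."""
    return text.encode("latin-1").decode("utf-8")


original = "Spacewatch®"
garbled = decode_wrong(original)

# The registered mark is one character but two bytes in UTF-8 (0xC2 0xAE),
# so byte-oriented handling sees one more unit than character-oriented handling.
char_count = len(original)                  # 11
byte_count = len(original.encode("utf-8"))  # 12

print(garbled)          # SpacewatchÂ®
print(repair(garbled))  # Spacewatch®
```

If the migrated text shows an extra character in front of each special character, as in the printed output above, that pattern would point to this kind of encoding mismatch somewhere between reading the source field and writing the destination field.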

joshuasosa commented 2 years ago

This issue still occurs for me on current migrations.

joshuasosa commented 2 years ago

Scratch my last comment; my last major migration was done just before the last AZQS update. I re-tested with the same data and can confirm the issues are resolved. Thanks!