pressbooks / pressbooks

Open publishing. Open web. Open source.
https://pressbooks.org/
GNU General Public License v3.0
415 stars 132 forks source link

Improve processing of base64 strings upon import #2129

Closed SteelWagstaff closed 3 years ago

SteelWagstaff commented 3 years ago

An open source user has opened a simple PR in Pressbooks: https://github.com/pressbooks/pressbooks/pull/2107

This PR attempts to bypass code that (unsuccessfully) tries to download an image if the src attribute is a base64-encoded string. Current behaviour:

  1. The importer tries to download the image using fetchAndSaveUniqueImage.
  2. This fails because $src_old is not a URL.
  3. fetchAndSaveUniqueImage adds #fixme to the end of the image src attribute, indicating that it couldn't download the image.
  4. The base64 value no longer works as an image src.

Proposed behaviour:

  1. If $src_old is a base64 string, nothing is downloaded and the image src is not modified.

In both cases, the entirety of the base64 string is stored in the database after the import— the only difference in the new behaviour is that #fixme is not appended to it.

This PR lacks a unit test. Ned suggests that a check be added to ensure that this process specifically looks for base64 images, e.g. "data:image/png;base64" or "data:image/jpeg;base64". As the check is written right now it isn't checking specifically for images. Furthermore, the current PR uses str_contains() a function that's only available in PHP8. Since Pressbooks requires PHP7.3, we should prefer a compatible method, like https://www.php.net/manual/en/function.str-contains.php#125977

SteelWagstaff commented 3 years ago

Fixed by https://github.com/pressbooks/pressbooks/pull/2132