harvard-lil / h2o

H2O is a web app for creating and reading open educational resources, primarily in the legal field
https://opencasebook.org
GNU Affero General Public License v3.0
35 stars 30 forks source link

Fix export of newly-uploaded images #2055

Closed rebeccacremona closed 1 month ago

rebeccacremona commented 1 month ago

I think this PR arranges so that images newly uploaded by verified professors to casebooks will appear in DOCX exports of those casebooks.

It turns off all the URL munging that TinyMCE, the rich text editor, does on save. It is unclear to me why it munges URLs by default. The docs say:

This option enables you to control whether TinyMCE is to be smart and restore URLs to their original values. URLs are automatically converted (messed up) by default because the browser’s built-in logic works this way. There is no way to get the real URL unless you store it away.

But I do not know what that means.

The route that handles uploads returns the absolute URL of the upload: so, without any munging, we should get what we want.

Testing locally manually, it works. Adding a functional test to actually try it out as part of the test suite is way hard, so I'm not going to, at this point.

If for some reason this does not work in prod the same way it does locally, we can try again, this time configuring TinyMCE to use custom logic to force all image URLs to be absolute links to H2O, while leaving non-image links untouched. Fingers crossed it doesn't come to that 🤞.

I added a basic test so that I could feel less worried that we are exposed to exfiltration, or similar, if relative paths are manually added to the TextBlock source by accident or by a mischievous user.

As a followup, I plan to spend a little time finding all the image tags in H2O casebooks text, and a) clean up any that I can, so that their export works retroactively, and b) possibly, deal with any other bad ones that turn up.

I am resisting the urge to do any further refactoring... which is extremely tempting.

rebeccacremona commented 1 month ago

Note that images in headnotes will still not export: they are (evidently) intentionally stripped: https://github.com/harvard-lil/h2o/blob/develop/web/main/templates/export/casebook.html#L10

teovin commented 1 month ago

Looks good to me, and works as expected in my local.