Closed Nigel2392 closed 6 months ago
We have a custom model inheriting from AbstractImage:
class Asset(AbstractImage):
    file = models.ImageField(
        verbose_name=_('file'),
        upload_to=settings.ASSET_UPLOAD_PREFIX,
        width_field='width',
        height_field='height',
        storage=MediaStorage()
    )
MediaStorage here is using django-gcloud-storage.
When I upload an image I get the following error:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
Which is weird, as I am uploading a PNG or JPEG image.
After further investigation and help from the Wagtail Slack channel, I think the problem arises when Willow guesses the wrong file type/extension in https://github.com/wagtail/Willow/blob/830aa3d386fd2ef2aa48c8032ba7d62bf2e04fc1/willow/image.py#L86-L87
The code blame shows that this part of the code was added last year. I am upgrading from Wagtail 4.2 to Wagtail 5, which would explain why it never occurred before.
When I run the following:
import filetype
from wagtail.images import get_image_model

Image = get_image_model()
image = Image.objects.get(file="dev/chair-unsplash.jpg")
with image.open_file() as image_file:
    ext = filetype.guess_extension(image_file)
    print(f"Image {image.pk} {image.file} has extension {ext}")
I get:
Image 149277 dev/chair-unsplash.jpg has extension None
My guess is that it is unable to guess the correct file type for the BLOB when opening a file from gcloud storage.
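One way to check this guess is to look at the first bytes the storage backend actually hands back and compare them with well-known image signatures. The sketch below is only a diagnostic aid: `describe_header` and the `_SIGNATURES` table are hypothetical helpers, not part of Willow, filetype, or django-gcloud-storage, and it assumes the file object returned by the storage is seekable.

```python
import io
from typing import BinaryIO, Optional

# Well-known leading "magic" bytes for a few formats.
# The gzip entry is included on the assumption that a compressed
# blob is one possible reason a content-based guess could fail.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"\x1f\x8b": "gzip",
}


def describe_header(fileobj: BinaryIO, size: int = 16) -> Optional[str]:
    """Return a label for the stream's first bytes, or None if unrecognised.

    The stream position is restored afterwards, so the caller can keep
    reading from wherever it was.
    """
    position = fileobj.tell()
    try:
        fileobj.seek(0)
        header = fileobj.read(size)
    finally:
        fileobj.seek(position)

    for signature, label in _SIGNATURES.items():
        if header.startswith(signature):
            return label
    return None


# Demo with an in-memory stream that starts with the PNG signature.
stream = io.BytesIO(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
print(describe_header(stream))
```

Against the real model this could be called as `describe_header(image_file)` inside the `with image.open_file() as image_file:` block. If it returns None or "gzip" for a file that is a valid JPEG locally, the bytes coming out of the storage backend are probably not what the content-based guess expects.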
Issue Summary
User is getting the following error:
This error likely originates from here:
https://github.com/wagtail/Willow/blob/830aa3d386fd2ef2aa48c8032ba7d62bf2e04fc1/willow/image.py#L86-L87
(mimetypes did guess the right type, hurrah)
As of right now, I cannot reproduce it myself. I'm only here to suggest an improvement to keep this from happening in the future.
You can see that we try to infer the file type from the contents first, and if that returns None we only check whether the file might be XML. We should probably fall back on the file extension first. If someone tampers with the extension, errors are to be expected.
https://github.com/wagtail/Willow/blob/830aa3d386fd2ef2aa48c8032ba7d62bf2e04fc1/willow/image.py#L82-L99
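A minimal sketch of the suggested fallback order, using only the standard-library `mimetypes` module. This is not Willow's actual code: `resolve_extension` and the `_MIME_TO_EXT` mapping are hypothetical names used for illustration, and `sniffed_ext` stands in for whatever the content-based guess returned.

```python
import mimetypes
from typing import Optional

# Hypothetical mapping from MIME type to the short extension name
# used when picking an image format.
_MIME_TO_EXT = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/gif": "gif",
    "image/svg+xml": "svg",
}


def resolve_extension(
    sniffed_ext: Optional[str], filename: Optional[str]
) -> Optional[str]:
    """Prefer the content-based guess, then fall back on the file name.

    Returns None when neither source gives an answer, leaving the caller
    free to try a last-resort check (such as an XML/SVG probe).
    """
    if sniffed_ext is not None:
        return sniffed_ext
    if not filename:
        return None
    mime_type, _encoding = mimetypes.guess_type(filename)
    return _MIME_TO_EXT.get(mime_type)


print(resolve_extension(None, "dev/chair-unsplash.jpg"))
```

With this ordering, a file named `chair-unsplash.jpg` whose contents could not be identified would still be treated as a JPEG instead of being handed to the XML parser. The trade-off is the one noted above: a wrong or tampered extension will still produce an error further down the line.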
Technical details
Mentioned in the quote. I don't know how long this will be available, but 90 days should be enough to resolve this issue. Though it might be the reason for this issue report, we should generally have provided a better fallback, IMO.
Slack Thread