symptom
When I send a personalised mail to many users with the same attachment, the attachment file is duplicated for each mail, which uses a large amount of disk space when sending to thousands of recipients.
expected solution
have only one Attachment object per unique file, linked to many Emails.
workaround as of now
run a script frequently to detect and consolidate duplicates:
import hashlib
import os

for a in Attachment.objects.all():
    # other attachments sharing this one's name are duplicate candidates
    duplicates = Attachment.objects.filter(name=a.name).exclude(pk=a.pk)
    if not duplicates.exists():
        continue
    if not os.path.exists(a.file.path):
        continue
    md5 = hashlib.md5()
    md5.update(a.file.file.read())
    hash0 = md5.hexdigest()
    for attachment in duplicates:
        md5a = hashlib.md5()
        md5a.update(attachment.file.file.read())
        if md5a.hexdigest() != hash0:
            continue
        print(f"{attachment} ({attachment.pk}) is duplicate of {a} ({a.pk})")
        # relink every email to the surviving attachment first,
        # then remove the duplicate file and its database row once
        for email in attachment.emails.all():
            print(f"for {email.pk} add {a} ({a.pk}) and delete {attachment} ({attachment.pk})")
            email.attachments.add(a)
        if os.path.exists(attachment.file.path):
            os.remove(attachment.file.path)
        attachment.delete()