ietf-tools / datatracker

The day-to-day front-end to the IETF database for people who work on IETF standards.
https://datatracker.ietf.org
BSD 3-Clause "New" or "Revised" License

Duplicate email addresses in new version notification email. #8028

Open kesara opened 1 month ago

kesara commented 1 month ago

Describe the issue

When the author has a non-Latin name (see #8027), DT's new version notification email lists the same email address twice, with the name in two different scripts: the ASCII name and the Unicode (non-Latin) name.

Example:

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: internet-drafts@ietf.org
To: 
 "=?utf-8?b?4Laa4LeZ4LeD4La7IOC2seC3j+C2seC3j+C2uuC2muC3iuC2muC3j+C2uyDgtrvgtq3gt4rgtrHgt4/gtrrgtpo=?="
 <redacted@example.org>, "Kesara Rathnayake" <redacted@example.org>
Subject: New Version Notification for draft-rathnayake-xml2rfc-unicode-01.txt
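
To make the duplication in the To: header easier to see, the RFC 2047 encoded-word can be decoded with Python's standard `email` package. This is only an inspection sketch: the header is unfolded onto one logical line, and `TO_HEADER` and `decoded_recipients` are illustrative names, not datatracker code.

```python
from email.header import decode_header, make_header
from email.utils import getaddresses

# The To: header from the example above, unfolded into a single value.
TO_HEADER = (
    '"=?utf-8?b?4Laa4LeZ4LeD4La7IOC2seC3j+C2seC3j+C2uuC2muC3iuC2muC3j+C2uyDgtrvgtq3gt4rgtrHgt4/gtrrgtpo=?=" '
    '<redacted@example.org>, "Kesara Rathnayake" <redacted@example.org>'
)


def decoded_recipients(header_value):
    """Return (display name, address) pairs with encoded-words decoded."""
    pairs = []
    for name, addr in getaddresses([header_value]):
        # decode_header() splits the name into (bytes-or-str, charset) parts;
        # make_header() joins them back into readable text.
        pairs.append((str(make_header(decode_header(name))), addr))
    return pairs


recipients = decoded_recipients(TO_HEADER)
for name, addr in recipients:
    print(f"{name!r} -> {addr}")
```

Both entries resolve to the same address; only the display name differs.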

jennifer-richards commented 1 month ago

The issue is in mailtrigger:

>>> from ietf.submit.models import Submission
>>> from ietf.mailtrigger.utils import gather_address_lists
>>> subm = Submission.objects.filter(name="draft-rathnayake-xml2rfc-unicode").last()
>>> gather_address_lists("sub_announced_to_authors", submission=subm)
AddrLists(to=['=?utf-8?b?4Laa4LeZ4LeD4La7IOC2seC3j+C2seC3j+C2uuC2muC3iuC2muC3j+C2uyDgtrvgtq3gt4rgtrHgt4/gtrrgtpo=?= <redacted@example.org>', 'Kesara Rathnayake <redacted@example.org>'], cc=[])

jennifer-richards commented 1 month ago

The non-Latin script is, I believe, a red herring 🎣

The issue is that the sub_announced_to_authors mailtrigger includes both the submission_authors and submission_confirmers recipients. These are defined by the corresponding Recipient.gather_... methods. gather_submission_authors() returns the address with the non-Latin name, and gather_submission_confirmers() returns the address with the Latin name.
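
The effect described above can be reproduced outside the datatracker with plain lists. The sketch below is an assumption-laden illustration, not the mailtrigger implementation: `authors` and `confirmers` stand in for what the two gather methods returned in the shell session above, and `group_by_mailbox` is a hypothetical helper. It shows that comparing whole strings keeps both entries, while grouping by the parsed address reveals that they name one mailbox.

```python
from collections import defaultdict
from email.utils import parseaddr

# Stand-ins for the two recipient lists (values copied from the output above).
authors = [
    "=?utf-8?b?4Laa4LeZ4LeD4La7IOC2seC3j+C2seC3j+C2uuC2muC3iuC2muC3j+C2uyDgtrvgtq3gt4rgtrHgt4/gtrrgtpo=?= <redacted@example.org>"
]
confirmers = ["Kesara Rathnayake <redacted@example.org>"]


def group_by_mailbox(*address_lists):
    """Group formatted addresses by their bare, lower-cased email address."""
    groups = defaultdict(list)
    for addresses in address_lists:
        for entry in addresses:
            _name, addr = parseaddr(entry)
            key = addr.lower()
            if entry not in groups[key]:
                groups[key].append(entry)
    return dict(groups)


# Whole-string comparison cannot see that these are the same mailbox.
merged = sorted(set(authors) | set(confirmers))
print(len(merged), "entries after exact-string de-duplication")

# Grouping by address shows one mailbox with two display names.
groups = group_by_mailbox(authors, confirmers)
for addr, entries in groups.items():
    print(addr, "has", len(entries), "display-name variants")
```

Whether collapsing such a group is desirable is a separate question; the sketch only demonstrates where the two entries come from.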

rjsparks commented 1 week ago

gather_submission_authors takes its information from the Submission.authors (JSON) field; gather_submission_confirmers takes its information from Email.formatted_email.

rjsparks commented 1 week ago

Closing this as wontfix: the number of ways someone can hand us a different name at different places in the process is large, and the current behavior makes sure that all of those names are made available to all recipients of the triggered email.

rjsparks commented 1 week ago

After discussing #7167, I'm reopening this as a bug in how we capture author names from XML in the Submission object.

We've discussed possibly changing ietf.utils.xmldraft.XMLDraft.render_author_name to return a constructed "X (Y)" string if fullname and asciiFullname in the XML don't match. Or we may choose to capture them as separate strings and combine them later. Either approach is going to require more analysis.
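
For illustration, the "X (Y)" idea might look roughly like the following. This is a standalone sketch under stated assumptions: the function takes the two attribute values as plain strings, which is not necessarily how XMLDraft.render_author_name receives its input, and the handling of missing or whitespace-only values is a guess rather than agreed behavior. The sample names are made up.

```python
def render_author_name(fullname, ascii_fullname):
    """Combine fullname and asciiFullname into a single display string.

    Hypothetical stand-in for the proposed behavior: return "X (Y)" when
    both values are present and differ, otherwise whichever one is set.
    """
    full = (fullname or "").strip()
    ascii_name = (ascii_fullname or "").strip()
    if full and ascii_name and full != ascii_name:
        return f"{full} ({ascii_name})"
    return full or ascii_name


# Differing names are combined; matching or missing names are left alone.
combined = render_author_name("田中太郎", "Taro Tanaka")
unchanged = render_author_name("Taro Tanaka", "Taro Tanaka")
only_full = render_author_name("Taro Tanaka", None)
print(combined)
print(unchanged)
print(only_full)
```

A single combined string would give both gather paths one name per author to compare, at the cost of storing a derived value; capturing the two strings separately keeps the raw data but moves the combining step later.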

Spectre17 commented 6 days ago

I've finished the code for the "X (Y)" construct when fullname and asciiFullname do not match, but it triggers a test case. Sent email to Robert with details. I can submit the pull request for that, but the analysis will be needed to determine whether to accept the change I made (and likely the removal of the test case it now triggers).