NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

Email validation does not prevent dotless domains #2554

Open FiannaOBrien opened 3 days ago

FiannaOBrien commented 3 days ago

Describe the bug
The metadata input form accepts email addresses with dotless domains (e.g., "user@com") as valid. Although dotless domains are syntactically possible in email addresses, they are prohibited by ICANN and can cause delivery problems over SMTP.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Metadata Landing page Editor
  2. Enter a contact email with a dotless domain (e.g., "fakeuser@test")
  3. Fill in other required fields
  4. Submit form

Expected behavior
The system should reject an email address with a dotless domain and display an error message indicating that the domain is not valid.

Additional context
Email addresses with dotless domains are not accepted by OSTI ELINK2.0 and will cause problems when reserving and publishing DOIs.

mbjones commented 3 days ago

Thanks, @FiannaOBrien. I think we use the browser's default email validation on the client side (@robyngit, can you confirm?), which has the limitations that you mentioned. This looks like a great thing to fix and is probably fairly easy in the scheme of things. Is there a developer at ESS-DIVE who would like to provide a pull request?

On the client side, this might be a very simple fix: provide an additional regular expression pattern in the view definition for the email input field. For example, add a regex pattern like pattern=".+@example\.com" to the input element in the template EMLParty.html#L23. Of course, you'll need a regex pattern that supports the validation rule you want to use; this one just ensures the address ends in example.com. A more appropriate pattern might be something like /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/ (from https://www.geeksforgeeks.org/javascript-program-to-validate-an-email-address/), but I haven't really tested that. More details on this approach are at https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/email#pattern_validation.
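
A rough way to try the suggested pattern outside the browser is sketched below. It is only an illustration: the helper name `hasDottedDomain` and the sample addresses are hypothetical and not part of MetacatUI, and the regex is the untested one quoted above. One thing to watch if the same expression goes into a `pattern` attribute: current browsers compile that attribute with the `v` (previously `u`) flag, where an unescaped `-` inside a character class may be rejected, so the sketch escapes the hyphens.

```javascript
// Sketch only: the pattern quoted in this thread, with the hyphens escaped
// so the same source should also be usable in an HTML pattern attribute.
// hasDottedDomain is a hypothetical helper name, not a MetacatUI function.
const EMAIL_PATTERN = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

function hasDottedDomain(address) {
  // Requires at least one dot in the domain, followed by two or more letters.
  return typeof address === "string" && EMAIL_PATTERN.test(address);
}

console.log(hasDottedDomain("user@com"));         // false (dotless domain)
console.log(hasDottedDomain("fakeuser@test"));    // false (dotless domain)
console.log(hasDottedDomain("user@example.com")); // true
```

The pattern only checks the shape of the address; it does not confirm that the domain exists or that mail can be delivered to it.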

Also note that, because this is client-side validation, nothing prevents an adversarial client from modifying that code and submitting something else. Guarding against that would require server-side validation, which is more complicated.

FiannaOBrien commented 2 days ago

@mbjones I tested out the regex you suggested on the emails we have in our OSTI record. The emails that failed to match that regex also failed OSTI's check, so I think this regex would work well for the purposes of submitting to OSTI!

This wouldn't catch mistakes in the top-level domain (".giv" or ".goc" rather than ".gov"), but I don't think we want to be in the business of keeping track of the ever-expanding list of TLDs and country codes.
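
For anyone who wants to repeat this kind of check on their own records, a minimal sketch follows. The address list is made up for illustration (it is not the OSTI data), and `findRejectedEmails` is a hypothetical helper name; the regex is the one suggested earlier in this thread, with hyphens escaped.

```javascript
// Sketch only: audit a list of addresses against the regex from this thread.
// The sample addresses below are invented for illustration.
const DOTTED_DOMAIN_PATTERN = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

function findRejectedEmails(addresses) {
  // Keep only the addresses that the pattern does not match.
  return addresses.filter((address) => !DOTTED_DOMAIN_PATTERN.test(address));
}

const sampleAddresses = [
  "fakeuser@test",   // dotless domain
  "jane@lbl.gov",    // well-formed
  "sam@agency.giv",  // mistyped TLD, but still well-formed
  "user@com",        // dotless domain
];

console.log(findRejectedEmails(sampleAddresses).join(", "));
// prints "fakeuser@test, user@com"
```

The mistyped ".giv" address is not reported, which matches the limitation described above: the pattern checks only that the domain contains a dot followed by letters, not that the TLD is real.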