backdrop / backdrop-issues

Issue tracker for Backdrop core.
144 stars 40 forks source link

CKEditor Link modal incorrectly limits the length of URLs #6740

Open jenlampton opened 2 weeks ago

jenlampton commented 2 weeks ago

Description of the bug

I am trying to link text in the rich-text editor to a link a customer has given me and it comes back as invalid for some reason, I get the errorThe URL [blah] is not valid. in the modal.

First of all, the link is valid. But I can't tell why Backdrop think's its not, and changing it in inconsequential ways seems to make it valid?!?

Not Valid: https://asuonline.asu.edu/online-degree-programs/undergraduate/user-experience-ux/?utm_source=xyzmedia&utm_medium=ppl&utm_content=Conversion_Pagevisitors-Premium_userexperience-BS&utm_campaign=22-Nat_Acq_Hi&utm_ecd22=22&utm_term=AMAnimationCR46306fsbuux_%7Bs2sId%7D

If I change the domain name from asuonline.asu.edu to somewebsite.asu.edu, that makes it valid:

Valid: https://somewebsite.asu.edu/online-degree-programs/undergraduate/user-experience-ux/?utm_source=xyzmedia&utm_medium=ppl&utm_content=Conversion_Pagevisitors-Premium_userexperience-BS&utm_campaign=22-Nat_Acq_Hi&utm_ecd22=22&utm_term=AMAnimationCR46306fsbuux_%7Bs2sId%7D

If I change the last query parameter from AMAnimationCR46306fsbuux_%7Bs2sId%7D toAmericanManufacturiongAssnCR46306fsbuux_%7Bs2sId%7D , that also makes it valid:

Valid: https://asuonline.asu.edu/online-degree-programs/undergraduate/user-experience-ux/?utm_source=xyzmedia&utm_medium=ppl&utm_content=Conversion_Pagevisitors-Premium_userexperience-BS&utm_campaign=22-Nat_Acq_Hi&utm_ecd22=22&utm_term=AmericanManufacturiongAssnCR46306fsbuux_%7Bs2sId%7D

What the heck?

The valid_url() function is basically a giant regex:

function [valid_url](https://docs.backdropcms.org/api/backdrop/core%21includes%21common.inc/function/valid_url/1)($url, $absolute = FALSE) {
  if ($absolute) {
    return (bool) [preg_match](http://php.net/preg_match)("
      /^                                                      # Start at the beginning of the text
      (?:ftp|https?|feed):\/\/                                # Look for ftp, http, https or feed schemes
      (?:                                                     # Userinfo (optional) which is typically
        (?:(?:[\w\.\-\+!$&'\(\)*\+,;=]|%[0-9a-f]{2})+:)*      # a username or a username and password
        (?:[\w\.\-\+%!$&'\(\)*\+,;=]|%[0-9a-f]{2})+@          # combination
      )?
      (?:
        (?:[a-z0-9\-\.]|%[0-9a-f]{2})+                        # A domain name or a IPv4 address
        |(?:\[(?:[0-9a-f]{0,4}:)*(?:[0-9a-f]{0,4})\])         # or a well formed IPv6 address
      )
      (?::[0-9]+)?                                            # Server port number (optional)
      (?:[\/|\?]
        (?:[\w#!:\.\?\+=&@$'~*,;\/\(\)\[\]\-]|%[0-9a-f]{2})   # The path and query (optional)
      *)?
    $/xi", $url);
  }
  else {
    return (bool) [preg_match](http://php.net/preg_match)("/^(?:[\w#!:\.\?\+=&@$'~*,;\/\(\)\[\]\-]|%[0-9a-f]{2})+$/i", $url);
  }
}

Steps To Reproduce

To reproduce the behavior:

  1. Create a node
  2. Enter text into the body field
  3. Link some of the text using the URL above
  4. See the error that the link is invalid.

Actual behavior

Valid links are reported as invalid.

Expected behavior

Valid links should be allowed.

Additional information

Add any other information that could help, such as:

quicksketch commented 6 days ago

Not Valid: https://asuonline.asu.edu/online-degree-programs/undergraduate/user-experience-ux/?utm_source=xyzmedia&utm_medium=ppl&utm_content=Conversion_Pagevisitors-Premium_userexperience-BS&utm_campaign=22-Nat_Acq_Hi&utm_ecd22=22&utm_term=AMAnimationCR46306fsbuux_%7Bs2sId%7D

It's coming back as not valid because this URL is 266 characters, and the max length of the URL field for the WYSIWYG Link modal is 256 characters. So when you paste in the value, the URL gets truncated. This changes %7B (valid) to %7 (invalid), because URL encoding standard is a % symbol followed by a two-character code.

If I change the last query parameter from AMAnimationCR46306fsbuux_%7Bs2sId%7D to AmericanManufacturiongAssnCR46306fsbuux_%7Bs2sId%7D, that also makes it valid

I bet the URL was just getting truncated to 255 characters at a non-URL-encoded location.

I think the solution here is we just need to increase the #maxlength on that field. The #maxlength on Link module fields if you add a dedicated Link field is 2048 characters, I think we should increase this to match.

The good news is valid_url() appears to be correct and we don't need to mess with that ugly regex; it's just the value it was passed was truncated unexpectedly.

quicksketch commented 6 days ago

@jenlampton You're going to love this. You're the one that set the limit to 256 characters in issue https://github.com/backdrop/backdrop-issues/issues/2843 :rofl: Though it was increasing from 128 characters which was plainly too short. I still think increasing to 2048 is appropriate.

quicksketch commented 6 days ago

Super simple PR: https://github.com/backdrop/backdrop/pull/4900