ruby: "no surrounding delimiters found" for simple case

Silex commented 5 years ago

Hello,

SocialShareButton.configure do |config|
  config.allow_sites = %w{twitter facebook pinterest linkedin}
end

When I try to cs{( when the cursor is on "facebook" it fails.

ninrod commented 5 years ago

Thanks for the report. It is interesting. It seems to be an issue related to ruby-mode because I cannot reproduce it in text-mode.

ruby-mode must be defining %w{ ... } as a custom delimiter

I think the problem lies here. It is doing something with the definition of the single quote region. try to issue a vi' and you will see.

I cannot wrap my head around ruby-mode because it is extremelly complex with an excess of 1.5k LOC of elisp code. This function I linked alone has 220 LOC and it is recursive.

And me being a complete elisp beginner also doesn't help.

Silex commented 5 years ago

@ninrod thanks for the investigation.

Hum, does evil-surround actually defer some logic to the major modes? In my head, I pictured cs{( as some code inside evil-surround that found the previous { and next } and replaced that...

ninrod commented 5 years ago

Yes it does. You can see that it works in text-mode or other modes but not in ruby-mode.

To understand the specific mechanism that dictates this behaviour, you have to begin here

(defun evil-surround-outer-overlay (char)
  "Return outer overlay for the delimited range represented by CHAR.
This overlay includes the delimiters.
See also `evil-surround-inner-overlay'."
  (let ((outer (lookup-key evil-outer-text-objects-map (string char))))
    (when (functionp outer)
      (setq outer (funcall outer))
      (when (evil-range-p outer)
        (evil-surround-trim-whitespace-from-range outer "[[:space:]]")
        (setq outer (make-overlay (evil-range-beginning outer)
                                  (evil-range-end outer)
                                  nil nil t))))))

I understand that Evil surround depends on core evil functions to find the regions determined by the text object you are using. In turn, evil functions rely on information which can be overwritten by modes to determine the precise range of the text objects.

For instance, this is the implementation of the 'evil-inner-curly text-object:

(evil-define-text-object evil-inner-curly (count &optional beg end type)
  "Select inner curly bracket (\"brace\")."
  :extend-selection nil
  (evil-select-paren ?{ ?} beg end type count))

And this is the implementation of evil-select-paren

(defun evil-select-paren (open close beg end type count &optional inclusive)
  "Return a range (BEG END) of COUNT delimited text objects.
OPEN and CLOSE specify the opening and closing delimiter,
respectively. BEG END TYPE are the currently selected (visual)
range.  If INCLUSIVE is non-nil, OPEN and CLOSE are included in
the range; otherwise they are excluded.
The types of OPEN and CLOSE specify which kind of THING is used
for parsing with `evil-select-block'. If OPEN and CLOSE are
characters `evil-up-paren' is used. Otherwise OPEN and CLOSE
must be regular expressions and `evil-up-block' is used.
If the selection is exclusive, whitespace at the end or at the
beginning of the selection until the end-of-line or beginning-of-line
is ignored."
  ;; we need special linewise exclusive selection
  (unless inclusive (setq inclusive 'exclusive-line))
  (cond
   ((and (characterp open) (characterp close))
    (let ((thing #'(lambda (&optional cnt)
                     (evil-up-paren open close cnt)))
          (bnd (or (bounds-of-thing-at-point 'evil-string)
                   (bounds-of-thing-at-point 'evil-comment)
                   ;; If point is at the opening quote of a string,
                   ;; this must be handled as if point is within the
                   ;; string, i.e. the selection must be extended
                   ;; around the string. Otherwise
                   ;; `evil-select-block' might do the wrong thing
                   ;; because it accidentally moves point inside the
                   ;; string (for inclusive selection) when looking
                   ;; for the current surrounding block. (re #364)
                   (and (= (point) (or beg (point)))
                        (save-excursion
                          (goto-char (1+ (or beg (point))))
                          (or (bounds-of-thing-at-point 'evil-string)
                              (bounds-of-thing-at-point 'evil-comment)))))))
      (if (not bnd)
          (evil-select-block thing beg end type count inclusive)
        (or (evil-with-restriction (car bnd) (cdr bnd)
              (condition-case nil
                  (evil-select-block thing beg end type count inclusive)
                (error nil)))
            (save-excursion
              (setq beg (or beg (point))
                    end (or end (point)))
              (goto-char (car bnd))
              (let ((extbeg (min beg (car bnd)))
                    (extend (max end (cdr bnd))))
                (evil-select-block thing
                                   extbeg extend
                                   type
                                   count
                                   inclusive
                                   (or (< extbeg beg) (> extend end))
                                   t)))))))
   (t
    (evil-select-block #'(lambda (&optional cnt)
                           (evil-up-block open close cnt))
                       beg end type count inclusive))))

You can see that evil uses thing-at-point extensively. So I guess that any mode that interferes with thing-at-point can screw up text-object finding by evil.

But I'm pretty sure that in this specific case, the deference lies in evil-up-paren here:

(defun evil-up-paren (open close &optional count)
  "Move point to the end or beginning of balanced parentheses.
OPEN and CLOSE should be characters identifying the opening and
closing parenthesis, respectively. If COUNT is greater than zero
point is moved forward otherwise it is moved backwards. Whenever
an opening delimiter is found the COUNT is increased by one, if a
closing delimiter is found the COUNT is decreased by one. The
motion stops when COUNT reaches zero. The match-data reflects the
last successful match (that caused COUNT to reach zero)."
  ;; Always use the default `forward-sexp-function'. This is important
  ;; for modes that use a custom one like `python-mode'.
  ;; (addresses #364)
  (let (forward-sexp-function)
    (with-syntax-table (copy-syntax-table (syntax-table))
      (modify-syntax-entry open (format "(%c" close))
      (modify-syntax-entry close (format ")%c" open))
      (let ((rest (evil-motion-loop (dir count)
                    (let ((pnt (point)))
                      (condition-case nil
                          (cond
                           ((> dir 0)
                            (while (progn
                                     (up-list dir)
                                     (/= (char-before) close))))
                           (t
                            (while (progn
                                     (up-list dir)
                                     (/= (char-after) open)))))
                        (error (goto-char pnt)))))))
        (cond
         ((= rest count) (set-match-data nil))
         ((> count 0) (set-match-data (list (1- (point)) (point))))
         (t (set-match-data (list (point) (1+ (point))))))
        rest))))

You can see that it does funny things with the syntax-table, which is always defined in the major-mode, obviously. So if ruby-mode does funny things with it's syntax-table, evil will pick these things up.

Silex commented 5 years ago

Great explanation :+1: :heart_eyes:

This is complex indeed. I found this which seems related:

(defun ruby-syntax-propertize-percent-literal (limit)
  (goto-char (match-beginning 2))
  (let* ((op (char-after))
         (ops (char-to-string op))
         (cl (or (cdr (aref (syntax-table) op))
                 (cdr (assoc op '((?< . ?>))))))
         parse-sexp-lookup-properties)
    (save-excursion
      (condition-case nil
          (progn
            (if cl              ; Paired delimiters.
                ;; Delimiter pairs of the same kind can be nested
                ;; inside the literal, as long as they are balanced.
                ;; Create syntax table that ignores other characters.
                (with-syntax-table (make-char-table 'syntax-table nil)
                  (modify-syntax-entry op (concat "(" (char-to-string cl)))
                  (modify-syntax-entry cl (concat ")" ops))
                  (modify-syntax-entry ?\\ "\\")
                  (save-restriction
                    (narrow-to-region (point) limit)
                    (forward-list))) ; skip to the paired character
              ;; Single character delimiter.
              (re-search-forward (concat "[^\\]\\(?:\\\\\\\\\\)*"
                                         (regexp-quote ops)) limit nil))
            ;; Found the closing delimiter.
            (put-text-property (1- (point)) (point) 'syntax-table
                               (string-to-syntax "|")))
        ;; Unclosed literal, do nothing.
        ((scan-error search-failed))))))

ninrod commented 5 years ago

@Silex this snippet really smells like the culprit.

To test this you would have to toy with this function a bit and see if it stops interfering with the curly braces text object.

With my extreme limitation in elisp I'd begin with removing the modify-syntax-entry lines and see if it works.

Silex commented 5 years ago

Okay, I just tested and if you comment the (ruby-syntax-propertize-percent-literal end) near the end, then everything works as expected.

(defun ruby-syntax-propertize (start end)
  "Syntactic keywords for Ruby mode.  See `syntax-propertize-function'."
  (let (case-fold-search)
    (goto-char start)
    (remove-text-properties start end '(ruby-expansion-match-data))
    (ruby-syntax-propertize-heredoc end)
    (ruby-syntax-enclosing-percent-literal end)
    (funcall
     (syntax-propertize-rules
      ;; $' $" $` .... are variables.
      ;; ?' ?" ?` are character literals (one-char strings in 1.9+).
      ("\\([?$]\\)[#\"'`:?]"
       (1 (if (save-excursion
                (nth 3 (syntax-ppss (match-beginning 0))))
              ;; Within a string, skip.
              (ignore
               (goto-char (match-end 1)))
            (put-text-property (match-end 1) (match-end 0)
                               'syntax-table (string-to-syntax "_"))
            (string-to-syntax "'"))))
      ;; Symbols with special characters.
      ("\\(^\\|[^:]\\)\\(:\\([-+~]@?\\|[/%&|^`]\\|\\*\\*?\\|<\\(<\\|=>?\\)?\\|>[>=]?\\|===?\\|=~\\|![~=]?\\|\\[\\]=?\\)\\)"
       (3 (unless (nth 8 (syntax-ppss (match-beginning 3)))
            (goto-char (match-end 0))
            (string-to-syntax "_"))))
      ;; Part of method name when at the end of it.
      ("[!?]"
       (0 (unless (save-excursion
                    (or (nth 8 (syntax-ppss (match-beginning 0)))
                        (let (parse-sexp-lookup-properties)
                          (zerop (skip-syntax-backward "w_")))
                        (memq (preceding-char) '(?@ ?$))))
            (string-to-syntax "_"))))
      ;; Backtick method redefinition.
      ("^[ \t]*def +\\(`\\)" (1 "_"))
      ;; Ternary operator colon followed by opening paren or bracket
      ;; (semi-important for indentation).
      ("\\(:\\)\\(?:[\({]\\|\\[[^]]\\)"
       (1 (string-to-syntax ".")))
      ;; Regular expressions.  Start with matching unescaped slash.
      ("\\(?:\\=\\|[^\\]\\)\\(?:\\\\\\\\\\)*\\(/\\)"
       (1 (let ((state (save-excursion (syntax-ppss (match-beginning 1)))))
            (when (or
                   ;; Beginning of a regexp.
                   (and (null (nth 8 state))
                        (save-excursion
                          (forward-char -1)
                          (looking-back ruby-syntax-before-regexp-re
                                        (point-at-bol))))
                   ;; End of regexp.  We don't match the whole
                   ;; regexp at once because it can have
                   ;; string interpolation inside, or span
                   ;; several lines.
                   (eq ?/ (nth 3 state)))
              (string-to-syntax "\"/")))))
      ;; Expression expansions in strings.  We're handling them
      ;; here, so that the regexp rule never matches inside them.
      (ruby-expression-expansion-re
       (0 (ignore (ruby-syntax-propertize-expansion))))
      ("^=en\\(d\\)\\_>" (1 "!"))
      ("^\\(=\\)begin\\_>" (1 "!"))
      ;; Handle here documents.
      ((concat ruby-here-doc-beg-re ".*\\(\n\\)")
       (7 (when (and (not (nth 8 (save-excursion
                                   (syntax-ppss (match-beginning 0)))))
                     (ruby-verify-heredoc (match-beginning 0)))
            (put-text-property (match-beginning 7) (match-end 7)
                               'syntax-table (string-to-syntax "\""))
            (ruby-syntax-propertize-heredoc end))))
      ;; Handle percent literals: %w(), %q{}, etc.
      ((concat "\\(?:^\\|[[ \t\n<+(,=*]\\)" ruby-percent-literal-beg-re)
       (1 (unless (nth 8 (save-excursion (syntax-ppss (match-beginning 1))))
            ;; Not inside a string, a comment, or a percent literal.
            (ruby-syntax-propertize-percent-literal end)
            (string-to-syntax "|")))))
     (point) end)))

I have no clue what you break by commenting that part out, as the highlighting looks alright. Probably that you break some ruby motion?

ninrod commented 5 years ago

ok, you are onto something. Parallel to this you could try enhanced ruby-mode. Maybe there you don't have this problem in the first place.

Can you post a link to that snippet? I'm having problems finding it.

Silex commented 5 years ago

The link is https://github.com/emacs-mirror/emacs/blob/master/lisp/progmodes/ruby-mode.el#L1856

It does not happen with enhanced ruby mode, provided that I never activate ruby-mode before.

ninrod commented 5 years ago

@Silex, I believe the problem lies in this snippet, which defines the ruby-syntax-propertize-percent-literal function:

(defun ruby-syntax-propertize-percent-literal (limit)
  (goto-char (match-beginning 2))
  (let* ((op (char-after))
         (ops (char-to-string op))
         (cl (or (cdr (aref (syntax-table) op))
                 (cdr (assoc op '((?< . ?>))))))
         parse-sexp-lookup-properties)
    (save-excursion
      (condition-case nil
          (progn
            (if cl              ; Paired delimiters.
                ;; Delimiter pairs of the same kind can be nested
                ;; inside the literal, as long as they are balanced.
                ;; Create syntax table that ignores other characters.
                (with-syntax-table (make-char-table 'syntax-table nil)
                  (modify-syntax-entry op (concat "(" (char-to-string cl)))
                  (modify-syntax-entry cl (concat ")" ops))
                  (modify-syntax-entry ?\\ "\\")
                  (save-restriction
                    (narrow-to-region (point) limit)
                    (forward-list))) ; skip to the paired character
              ;; Single character delimiter.
              (re-search-forward (concat "[^\\]\\(?:\\\\\\\\\\)*"
                                         (regexp-quote ops)) limit nil))
            ;; Found the closing delimiter.
            (put-text-property (1- (point)) (point) 'syntax-table
                               (string-to-syntax "|")))
        ;; Unclosed literal, do nothing.
        ((scan-error search-failed))))))

there are many, many modifications in the syntax table here.

Silex commented 5 years ago

@ninrod: sure, the thing is we don't even know what the "problem" is or what feature this snippet implement.

So far all I know is that if I don't call ruby-syntax-propertize-percent-literal the bug is gone, and probably that this code is there for a reason but I don't know what. The code just looks the same without the call.

ninrod commented 5 years ago

what if you toy with the function definition itself?

Silex commented 5 years ago

what if you toy with the function definition itself?

Why? What am I trying to do?

I know this function is supposed to propertize the percent literal for ruby's syntax, but I don't know why.

Is it to allow other ruby-foo function to properly behave when it encounters percent-literal? Is it to do special code highlighting? Is it to check for errors?

We first need to understand what this function is supposed to do and why and then we can start looking for bugs.

EDIT: I asked on the mailing list :wink:

ninrod commented 5 years ago

I think that what we are trying to do is to discover the exact line that causes the incompatibility with evil. Then we can figure out why those lines are important and if there is another way to achieve the same effect but in an evil compatible way/

I think that as a last try you could try to comment out the modify-syntax-table commands inside this function. Just a hunch.

Silex commented 5 years ago

Okay, here's Stefan's amazing answer:

We are trying to solve a bug in evil-surround at https://github.com/emacs-evil/evil-surround/issues/154. The bug is that percent literals refuse to be replaced when you try to replace a pair of "{}" with "()" using evil-surround, because ruby.el modifies the percent literal syntax table in a special way (when we use `fundamental-mode' then things works as expected).

ruby-mode uses syntax-propertize to make it so that

config.allow_sites = %w{twitter facebook pinterest linkedin}

is treated somewhat like

config.allow_sites = "w{twitter facebook pinterest linkedin"

So if you go to the end of the line and try to skip back over the previous sexp you'll end up in front of the % rather than in front of the {.

The purpose is probably to handle things like:

config.allow_sites = %w{twitter facebook # pinterest linkedin}

where the # would otherwise be taken as the beginning of a comment.

If I comment out the call to `ruby-syntax-propertize-percent-literal', then the bug in evil-surround is gone, and everything just looks the same visually (the percent literals are correctly highlighted).

Not sure what "comment out the call" means exactly, but from my reading of the code, it would result in mayhem (the %w{ above would probably be highlighted as if he extended to the end of the buffer).

In order to interact better with things like Evil's delimiter-replacement, ruby-mode.el could be changed so that instead of marking the opening % and the closing } with the | syntax, it would mark the % and w part with ' syntax and then arrange to mark the inside of the delimited text with an ad-hoc syntax table that doesn't treat # as a comment started nor " as a string delimiter (i.e., the same syntax-table used by the with-syntax-table in ruby-syntax-propertize-percent-literal). Or alternatively, it could look for things like # and " and give them a . syntax.

ninrod commented 5 years ago

Great work. Well, seems like, a change in ruby-mode is to be employed in order to fix this

Silex commented 5 years ago

Great work. Well, seems like, a change in ruby-mode is to be employed in order to fix this

Yes, that or some trickery in evil-surround to handle ruby-mode specially. Maybe it could use fundamental-mode rules for ruby-mode, but that probably is tricky to achieve.

For example I wonder why evil-up-paren uses the syntax table at all, it could have been implemented "the dumb way" with a simple find next/previous delimiter and some count-logic to handle subgroups. I guess it's much more efficient the way it is done currently because you directly ask the syntax table to jump to the matching delimiter, but still.

ninrod commented 5 years ago

Yes, I think that the main idea behind this implementation is to use the already parsed file with help from the syntax-table defined by the major-mode. When the user opens the file using the major-mode, the file is already parsed so that all the delimiters are already calculated and evil simply reuses that information instead of calculating the delimiters again.

It's a trade off in performance for compatibility because the modes can break the definitions of delimiters, as we can see in this case.

I don't think that treating bugs like this in a "per-mode" basis is the way because it would not be based on a public interface offered by the mode. Instead we'd glue our code on a specific behaviour that could change and break us.

In this case I'd directly open a issue in ruby-mode regarding this compatibility with evil/evil-surround.

Plus you already have a workaround in ehanced-ruby-mode.

I will keep this issue open, but I really think that this issue belongs in core evil because, for instance, ci{ does not work with your provided sample using ruby-mode and just "pure evil".

Silex commented 5 years ago

I will keep this issue open, but I really think that this issue belongs in core evil because, for instance, ci{ does not work with your provided sample using ruby-mode and just "pure evil".

Good catch! I'll likely transfer this issue to evil then.

Silex commented 5 years ago

Continuing in the evil issue.

emacs-evil / evil-surround

ruby: "no surrounding delimiters found" for simple case #154