Junk characters in snippet expansion text

bpj commented 8 years ago

I have a problem with a somewhat evil snippet which looks like this:

snippet "rlink(?:\s+\#([-\w]+)?)?\s+(\S+)(?:\s+(.*))?" "save an external reflink" r
[${1:`!p snip.rv = reg_or(snip.v.text, reg_or(match.group(3), 'LINK_TEXT'))`}][`!p
if not snip.c:
    id = reg_or(match.group(1), reg_or(snip.v.text, match.group(3)), filtr=html_id)
    url = reg_or(match.group(2), '!!!URL-MISSING!!!')
    title = reg_or(match.group(3), "")
    reflinks = vim.bindeval('b:reflink_dict')
    if not reflinks.has_key(id):
        reflinks[id] = '[{0}]: {1} "{2}"'.format(id, url, title)
    snip.rv = id
`]$0
endsnippet

(NOTE: I updated the code above so that people who copy it can avoid another bug, unrelated to the issue this thread is about. Long story short: the dict became full of junk sometimes if the id depended on the link text, so now the id never depends on the link text as inserted in tab $1. You will instead get an error if the visual selection, the explicit #ID and the title text all are missing. /bpj)

The idea is that you type a trigger like rlink #id http://example.com The link title<tab> and get the trigger replaced with a snippet for a Markdown link reference: [{LINK_TEXT}][ID]. At the same time a suitable link definition is saved to a buffer-local dict variable, to be written out later elsewhere in the document with the help of another snippet. The LINK_TEXT defaults to the visually selected text if any, or else to the link title, while the id defaults to the link text reformatted into a valid id. Any of the 'arguments' in the trigger can be the name of a Vim register like @+, in which case it will be expanded to the content of that register (this is very handy when you have copied a URL in the web browser! :-) The html_id() function is just some regex.sub() calls which massage the argument into a valid HTML id string. The reg_or() function lives in a module in ~/.vim/pythonx/ and looks like this:

def reg_or(text,default,filtr=None):
    """
    -   If ``text`` looks like a Vim register name in ``@x`` notation 
            replace it with the content of that register.
    -   If ``text`` or the register is empty return ``default``.
    -   If ``filtr`` is a callable filter ``text`` through it, otherwise return
            ``text`` as is.

    This is useful when ``text`` is the content of a match group, avoiding to
    litter lots of snippets with the checking/replacing/filtering code.
    """
    if text is None:
        text = default
    if reg_name_re.match(text):
        text = vim.eval(text)
    if not len(text):
        text = default
    if callable(filtr):
        return filtr(text)
    else:
        return text
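
The html_id() helper mentioned above is not shown in the thread. A minimal sketch of what such a filter might look like follows, assuming it lowercases the text, turns whitespace into underscores and drops anything that is not a word character or hyphen. The body is an illustration, not the original code; only the name html_id and the sample output the_foo_command come from this thread.

```python
import re


def html_id(text):
    """Hypothetical stand-in for the html_id() filter, not the original code."""
    ident = text.strip().lower()
    ident = re.sub(r'\s+', '_', ident)       # whitespace runs become underscores
    ident = re.sub(r'[^-\w]', '', ident)     # keep only word characters and hyphens
    ident = re.sub(r'^[\W\d_]+', '', ident)  # drop leading non-letter characters
    return ident


print(html_id("the foo command"))
```

Passed as filtr=html_id to reg_or(), a function of this shape would turn a link text into the id used inside the second pair of brackets.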

This works perfectly most of the time, but sometimes (especially, perhaps only, when there was a visual selection) I get junk characters before the expanded text like v3G15|o3G2|o[the foo command][the_foo_command]. On the face of it those characters are entirely unrelated to anything in the input -- come to think of it, they look like normal mode commands. I don't know if this is because of something stupid I'm doing, a bug in UltiSnips or something in between, but I thought I should bring it to your attention at least to hopefully get an explanation of what's going on. And in case you wonder, this madness saves lots of time, and keystrokes, for me when converting vanilla plaintext into Markdown. :-)

seletskiy commented 8 years ago

@bpj: Thanks for the good example of using UltiSnips!

I can't reproduce the error using the following all.snippets file and triggering the snippet in the [No Name] buffer:

global !p
vim.command('let b:reflink_dict = {}')

reg_name_re = re.compile(r'@.')

def html_id(id):
    return id

def reg_or(text,default,filtr=None):
    """
    -   If ``text`` looks like a Vim register name in ``@x`` notation
            replace it with the content of that register.
    -   If ``text`` or the register is empty return ``default``.
    -   If ``filtr`` is a callable filter ``text`` through it, otherwise return
            ``text`` as is.

    This is useful when ``text`` is the content of a match group, avoiding to
    litter lots of snippets with the checking/replacing/filtering code.
    """
    if text is None:
        text = default
    if reg_name_re.match(text):
        text = vim.eval(text)
    if not len(text):
        text = default
    if callable(filtr):
        return filtr(text)
    else:
        return text
endglobal

snippet "rlink(?:\s+\#([-\w]+)?)?\s+(\S+)(?:\s+(.*))?" "save an external reflink" r
[${1:`!p snip.rv = reg_or(snip.v.text, reg_or(match.group(3), 'LINK_TEXT'))`}][`!p
id = reg_or(match.group(1), t[1], filtr=html_id)
url = reg_or(match.group(2), '!!!URL-MISSING!!!')
title = reg_or(match.group(3), "")
reflinks = vim.bindeval('b:reflink_dict')
if not reflinks.has_key(id):
    reflinks[id] = '[{0}]: {1} "{2}"'.format(id, url, title)
snip.rv = id
`]$0
endsnippet

I tried to do some visual selection and so on.

I definitely saw the error you've encountered some time ago, but I can't recall what it was linked to. Can you try to find the exact steps to reproduce it?

BTW, rlink is a bit of extra typing, I guess, because you can modify your regexp so that it accurately matches #id followed by a URL without needing the rlink prefix, which would make the snippet even more effective.

bpj commented 8 years ago

@seletskiy: Thanks for looking into this so quickly!

> @bpj: Thanks for the good example of using UltiSnips!

Thanks. IMO it's at risk of going over the top, but as long as it works...

> I can't reproduce the error using the following all.snippets file and triggering the snippet in the [No Name] buffer:
>
>     global !p
>     vim.command('let b:reflink_dict = {}')

In my experience this doesn't work as it should. You have to do :let b:reflink_dict={} on the Vim command line first. I tried to write some code which first checked if the variable existed and wasn't a dict, but some vim-pyth error handling kicked in before my code got a chance to run.
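
One way the existence check described here might be written is sketched below. It is untested against the setup in this thread and may well hit the same error handling. The helper name ensure_buffer_dict is made up for illustration, and the vim module is passed in as a parameter (an assumption made so the function can also be exercised outside Vim). It relies only on vim.eval, vim.command and vim.bindeval, all of which already appear in the code above.

```python
def ensure_buffer_dict(vim, name):
    """Make sure b:<name> holds a dict and return a live reference to it.

    Hypothetical helper: `vim` is the module available inside `!p` code,
    passed in explicitly so the function can be tried outside Vim too.
    """
    var = 'b:' + name
    # Vim's && short-circuits, so the variable is only read when it exists.
    # vim.eval() returns numbers as strings, hence the comparison with '1'.
    check = "exists('{0}') && type({0}) == type({{}})".format(var)
    if vim.eval(check) != '1':
        vim.command('unlet! {0}'.format(var))      # silent if the variable is missing
        vim.command('let {0} = {{}}'.format(var))  # start with an empty dict
    return vim.bindeval(var)
```

Inside the snippet, reflinks = ensure_buffer_dict(vim, 'reflink_dict') would then replace the bare vim.bindeval('b:reflink_dict') call.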

>     reg_name_re = re.compile(r'@.')

It's r'^@\S$' actually for good measure.
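
The difference between the two patterns can be checked outside Vim. The short comparison below uses the two regexes quoted in this thread; the sample strings are made up for illustration.

```python
import re

loose = re.compile(r'@.')      # pattern from the test file quoted above
strict = re.compile(r'^@\S$')  # pattern reported as the one actually in use

# Sample inputs (illustrative only): two register names, a title that
# merely starts with "@", and a plain URL.
for candidate in ('@+', '@a', '@home page', 'http://example.com'):
    print(repr(candidate), bool(loose.match(candidate)), bool(strict.match(candidate)))
```

With the loose pattern, a title that happens to start with @ would be handed to vim.eval() as if it were a register name. The strict pattern only accepts an @ followed by exactly one non-space character.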

>     snippet "rlink(?:\s+\#([-\w]+)?)?\s+(\S+)(?:\s+(.*))?" "save an external reflink" r
>     [${1:`!p snip.rv = reg_or(snip.v.text, reg_or(match.group(3), 'LINK_TEXT'))`}][`!p
>     id = reg_or(match.group(1), t[1], filtr=html_id)
>     url = reg_or(match.group(2), '!!!URL-MISSING!!!')
>     title = reg_or(match.group(3), "")
>     reflinks = vim.bindeval('b:reflink_dict')
>     if not reflinks.has_key(id):
>         reflinks[id] = '[{0}]: {1} "{2}"'.format(id, url, title)
>     snip.rv = id
>     `]$0
>     endsnippet

> I tried to do some visual selection and so on.
>
> I definitely saw the error you've encountered some time ago, but I can't recall what it was linked to. Can you try to find the exact steps to reproduce it?

You never think of what you are doing before such a thing happens, you know, but I'm on the lookout now! My guess after sleeping on it is that some of the Vim code which UltiSnips executes to select text is leaking out into the expansion text. Thanks for confirming that it's not just me though!

> BTW, rlink is a bit of extra typing, I guess, because you can modify your regexp so that it accurately matches #id followed by a URL without needing the rlink prefix, which would make the snippet even more effective.

I guess I could make it shorter like rlk or something, but I can't remove the prefix entirely. For one thing both the id and the title text are optional[^1], and besides I got several similar snippets, notably irlink which creates an internal reference link, where the URL is #ID, and ianc for creating a custom anchor (all three using the same stash for the link definitions of course!)

[^1]: spaced out the regex is as follows:

    r"""(?x)
        rlink           # prefix
        (?:             # start of id chunk
            \s+         # space before id
            \#          # id marker
            ([-\w]+)?   # optional id text
        )?              # the whole id chunk is optional
        \s+             # space before url
        (\S+)           # url text
        (?:             # start of title chunk
            \s+         # space before title
            (.*)        # optional title text
        )?              # the whole title chunk is optional
    """

The bare # is sometimes necessary in order to make the regex parser understand what's going on when there is no custom id.
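
To see which groups the trigger regex captures, it can be tried outside UltiSnips with Python's re module. This is a sketch using the spaced-out pattern above with re.VERBOSE; the parse_trigger name and the sample triggers are only for illustration.

```python
import re

RLINK_RE = re.compile(r"""
    rlink           # prefix
    (?:             # start of id chunk
        \s+         # space before id
        \#          # id marker
        ([-\w]+)?   # optional id text
    )?              # the whole id chunk is optional
    \s+             # space before url
    (\S+)           # url text
    (?:             # start of title chunk
        \s+         # space before title
        (.*)        # optional title text
    )?              # the whole title chunk is optional
""", re.VERBOSE)


def parse_trigger(text):
    """Return (id, url, title) for a trigger line; missing parts are None."""
    m = RLINK_RE.match(text)
    return m.groups() if m else None


print(parse_trigger("rlink #id http://example.com The link title"))
print(parse_trigger("rlink http://example.com The link title"))
print(parse_trigger("rlink # http://example.com"))
```

The groups that come back as None are the ones reg_or() replaces with its default.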

nihlaeth commented 7 years ago

I seem to have the same problem with junk characters (they even look similar). In my case it happens with an anonymous snippet expanded in a post_jump action. It only happens with the first tab stop in the anonymous snippet, and it hasn't happened yet with a second and third snippet which use anonymous snippets in a similar way. I haven't figured out yet why the junk characters happen sometimes, while the snippet functions perfectly fine other times. I have been testing with an empty file, just expanding that one snippet, and still it's inconsistent.

system info: vim 8.0, linux, used over ssh with putty

Offending snippet: the purpose is to make TODO tokens in the generated docstring into tabstops when defining a function.

post_jump "if snip.tabstop == 0: expand_docstring(snip)"
snippet def "function with docstring" bms
def ${1:function}(`!p
if snip.indent:
    snip.rv = 'self' + (", " if len(t[2]) else "")`${2:arg1}):
    `!p
snip >> 1
write_function_docstring(snip, get_args(t[2])) `
    ${5:${VISUAL:pass}}
endsnippet

Relevant functions in global:

global !p

class Arg(object):
    def __init__(self, arg):
        self.arg = arg
        self.default = None
        self.type_ = None
        if '=' in arg:
            parts = arg.split('=')
            arg = '' if len(parts) < 1 else parts[0].strip()
            self.default = '' if len(parts) < 2 else parts[1].strip()
        if ':' in arg:
            parts = arg.split(':')
            arg = '' if len(parts) < 1 else parts[0].strip()
            self.type_ = '' if len(parts) < 2 else parts[1].strip()
        self.name = arg.strip()

    def __str__(self):
        return self.name

    def __unicode__(self):
        return self.name

    def is_kwarg(self):
        return self.default is not None

    def has_type(self):
        return self.type_ is not None

def get_args(arglist):
    args = [Arg(arg) for arg in arglist.split(',') if arg]
    args = [arg for arg in args if arg.name != 'self']

    return args

def format_arg(arg, snip):
    if arg.has_type():
        snip += arg.name
    else:
        snip += "%s: TODO" % arg.name
    snip >> 1
    if arg.is_kwarg():
        snip.rv += "{} optional".format(
            ',' if not arg.has_type() else ':')
        snip += "TODO, defaults to %s" % arg.default
    else:
        snip += "TODO"
    snip << 1

def write_docstring_args(args, snip):
    kwargs = [arg for arg in args if arg.is_kwarg()]
    args = [arg for arg in args if not arg.is_kwarg()]

    if len(args) > 0:
        snip += "Parameters"
        snip += "----------"
        for arg in args:
            format_arg(arg, snip)
        snip.rv += '\n' + snip.mkline('', indent='')
    if len(kwargs) > 0:
        snip += "Keyword Arguments"
        snip += "-----------------"

        for kwarg in kwargs:
            format_arg(kwarg, snip)
        snip.rv += '\n' + snip.mkline('', indent='')

def write_function_docstring(snip, args):
    """
    Writes a function docstring in the numpy style.
    """
    snip.rv += '"""'
    snip += "TODO"
    snip.rv += '\n' + snip.mkline('', indent='')
    snip += "TODO"
    snip.rv += '\n' + snip.mkline('', indent='')

    if args:
        write_docstring_args(args, snip)

    snip += 'Raises'
    snip += '------'
    snip += 'TODO'
    snip.rv += '\n' + snip.mkline('', indent='')

    snip += 'Returns'
    snip += '-------'
    snip += 'TODO'
    snip.rv += '\n' + snip.mkline('', indent='')

    snip += 'Examples'
    snip += '--------'
    snip += '..doctest::'
    snip.rv += '\n' + snip.mkline('', indent='')
    snip >> 1
    snip += '>>> TODO'
    snip << 1

    snip += '"""'

def generate_anon_docstring(lines):
    edited_lines = []
    tabstop = 1
    for line in lines:
        if "TODO" in line:
            # add tabstop
            edited_lines.append(
                line.replace("TODO", "${%d:TODO}" % tabstop))
            tabstop += 1
            continue
        edited_lines.append(line)
    return '\n'.join(edited_lines)

def expand_docstring(snip):
    docstring = []
    doc_start = None
    doc_end = None
    for line_n in range(snip.snippet_start[0], snip.snippet_end[0] + 1):
        line = snip.buffer[line_n]
        if line.strip() == '"""':
            if doc_start is not None:
                doc_end = line_n
                break
            doc_start = line_n
            continue
        if doc_start is None:
            continue
        docstring.append(line)
    if doc_start is None or doc_end is None:
        # malformed docstring
        return
    snip.buffer[doc_start + 1:doc_end] = [""]
    snip.cursor.set(doc_start + 1, 0)
    snip.expand_anon(generate_anon_docstring(docstring))

endglobal
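
To make the tabstop conversion concrete, here is a standalone run of the generate_anon_docstring() logic from the block above. The function is repeated so the example works outside Vim, and the sample lines are made up.

```python
def generate_anon_docstring(lines):
    # Same logic as the function above, repeated so this runs on its own.
    edited_lines = []
    tabstop = 1
    for line in lines:
        if "TODO" in line:
            # Every TODO on this line becomes a placeholder with the current number.
            edited_lines.append(line.replace("TODO", "${%d:TODO}" % tabstop))
            tabstop += 1
        else:
            edited_lines.append(line)
    return '\n'.join(edited_lines)


# Made-up docstring lines, roughly the shape write_function_docstring() emits.
sample = ["TODO", "", "Returns", "-------", "TODO"]
print(generate_anon_docstring(sample))
```

The resulting string is what gets passed to snip.expand_anon(), so the first placeholder in it is the tabstop where the junk characters show up.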

Result of expanding def, and tabbing to first TODO in docstring:

def function(arg1):
    """
    v3G8|o3G5|oTODO

    TODO
…(rest as expected)

For the entire file, including correctly functioning snippets (triggers class and """), see: https://github.com/nihlaeth/dotfiles/blob/f776212ad6192386706499cdcbb0214fe8cd8cac/vim/snipps/python_common.snippets

bpj commented 7 years ago

@seletskiy I have found a possible condition for this happening: when a tabstop text uses the text of an earlier tabstop and the default text of the earlier tabstop was left in place. Admittedly the later tabstop is complicated as hell -- it lives in an anonymous snippet inserted by a function called by a post-jump action and gets its value from a buffer variable set to the text of the earlier tabstop by that function -- but the junk doesn't appear if I change the text of the earlier tabstop, even if I type the very same text as the default text!

bpj commented 7 years ago

@seletskiy: same scenario as above, not using the earlier tabstop text but the junk still appears if the earlier tabstop is left unmodified. Both tabstops use the same buffer variable as default text through an embedded Python block. Reversing the order of the two tabstops (the one using the post-jump code as $1 and the other one as $2) works around the problem.

SirVer commented 6 years ago

I played around with both examples and was unable to trigger this bug. Can we have a more minimal repro case?

bpj commented 6 years ago

I don't even have a clear idea what triggers it. It seems a bit random.

Maelan commented 6 years ago

So I have a short repro case which inserts the same kind of “garbage” characters. Confirming suspicions people had here, I identified these as normal Vim commands issued by UltiSnips.

Versions:

UltiSnips, git revision 6fdc3647f72e0a1f321ea6bd092ecd01f7c187ba (ie. latest version as of now)
VIM - Vi IMproved 8.1 (2018 May 17)
Python 3.6.5
Archlinux, kernel 4.17.2, x86_64

global !p
def test_snippet():
    vim.command('call input("pause")')
endglobal

pre_expand "test_snippet()"
snippet pre
[${0}]
endsnippet

pre_expand "test_snippet()"
snippet predef
[${0:default}]
endsnippet

post_jump "if snip.tabstop == 0: test_snippet()"
snippet post
[${1}][${0}]
endsnippet

post_jump "if snip.tabstop == 0: test_snippet()"
snippet postdef
[${1}][${0:default}]
endsnippet

Then, in a new file, compare the output of the following key sequences.

Typing:
```
ipre<Tab><Enter>
```
(the Enter key is needed to resume from the Vim “pause” command) gives the correct result:
```
[]
 ↑ INSERT mode
```

Typing:
```
ipredef<Tab><Enter>
```
also gives the correct result:
```
[default]
 ↑↑↑↑↑↑↑ SELECT mode
```

But typing:
```
ipost<Tab><C-j>
```
does not “pause” Vim, and inserts a stray “a” in the output (this “a” is probably the Vim command with which UltiSnips intended to enter INSERT mode):
```
[][a]
    ↑ INSERT mode
```
And typing:
```
ipostdef<Tab><C-j>
```
does not “pause” Vim, inserts as garbage characters the Vim commands that were supposed to enter SELECT mode and select the placeholder (v4G10|o4G4|o — v enters VISUAL mode, 4G goes to line 4, 10| goes to column 10, and so on), and, as a consequence, leaves the user in INSERT mode instead of SELECT mode:
```
[][v4G10|o4G4|odefault]
               ↑ INSERT mode
```
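
The leaked prefix can be split off mechanically, which may help when comparing reports. A small sketch follows, assuming the junk always has the v…G…|o…G…|o shape seen in this thread; the function name and the returned dict layout are invented for illustration.

```python
import re

# Shape of the leaked selection command as it appears in the reports above.
SELECT_CMD_RE = re.compile(r'^v(\d+)G(\d+)\|o(\d+)G(\d+)\|o')


def split_leaked_select(text):
    """Split a line into the leaked selection coordinates and the real text.

    Returns None when the line does not start with the leaked command.
    """
    m = SELECT_CMD_RE.match(text)
    if m is None:
        return None
    # The first line/column pair is the end of the selection, the second its start.
    end_line, end_col, start_line, start_col = map(int, m.groups())
    return {
        'start': (start_line, start_col),
        'end': (end_line, end_col),
        'rest': text[m.end():],
    }


print(split_leaked_select("v4G10|o4G4|odefault"))
```

For the postdef output above, the coordinates cover columns 4 to 10 of line 4, which is exactly where the default text would sit had the junk not been inserted.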

Also, as a separate bug, when jumping to a placeholder with default text, UltiSnips selects the default text using hard‐coded coordinates. As a result, any modification of the cursor position from Python, or even from Vim, is discarded, and the selection is wrong as soon as the current line is modified from Python.

SirVer commented 4 years ago

I could repro this.

bpj commented 4 years ago

Any idea what triggers it specifically?

SirVer commented 4 years ago

No, not at this point in time. My hunch is that the UltiSnips heuristic that tracks buffer changes gets confused by the temporary call out to Vimscript. If that is the case, it will be hard to avoid this bug.

bpj commented 4 years ago

Well it would at least be good to know what to avoid more specifically. I have tried probing it but it's hard to see what exactly is triggering it.

BertrandSim commented 1 year ago

I've tracked the problem of the junk characters down to vim_helper.select(). This function is called by snippet_manager._jump().

In particular,

_jump() is called after a snippet expansion followed by an automatic jump to the first placeholder, or a jump forward/backward.
_jump() calls vim_helper.select(), which creates a move_cmd.
The lines with move_cmd will produce the junk characters mentioned in this issue thread, such as a,v,G,|,o.
If the placeholder text in the tabstop is empty (eg. ${1}), move_cmd is either i or a. This relates to behavior in the ipost snippet mentioned above.
If the placeholder text in the tabstop is nonempty (eg. ${1:default}), move_cmd is typically v{end.line}G{end.col}|o{start.line}G{start.col}|o . This relates to the ipostdef snippet mentioned above.
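
The command format described in the list above can be reproduced in a few lines. This is not the code of vim_helper.select(); it only rebuilds the string from the template quoted here, assuming the line and column values are already in the 1-based form that appears in the junk text. The function name select_cmd is made up.

```python
def select_cmd(start, end):
    """Build the selection command for (line, col) tuples `start` and `end`.

    Illustrative only: follows the template quoted in the list above,
    with the end position first and the start position second.
    """
    return "v{0}G{1}|o{2}G{3}|o".format(end[0], end[1], start[0], start[1])


# Coordinates taken from the junk strings reported earlier in this thread.
print(select_cmd((4, 4), (4, 10)))
print(select_cmd((3, 5), (3, 8)))
```

Fed to Vim as normal-mode keys, such a string selects the placeholder. Inserted as text, it is the junk reported in this issue.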

SirVer commented 1 year ago

@BertrandSim I believe the problem is one of timing: Vim executes this move command at a different time than it does in most cases when using UltiSnips. I was never able to root-cause why, but it means that in certain cases the move command is interpreted as text being typed into the buffer.

SirVer / ultisnips

Junk characters in snippet expansion text #751