Open iago-lito opened 2 years ago
Is the diffing you want to do performed by an external program, or something you would do inside Vim? In the latter case it might be useful to not write the text to files at all, and let the expression read the buffers directly. I am not sure where the output would go; instead of a file it could be a list, perhaps. This would need some option other than 'diffexpr', since Vim needs to know what to do.
slightly related to the second option: https://github.com/vim/vim/issues/4241 and https://github.com/vim/vim/issues/4191#issuecomment-478213781
@brammool Ideally it would be something I would do inside Vim. I have been falling back on external programs so far only because this is what &diffexpr explicitly requires.

As you say, if the given expression is internal to Vim, then it is unclear what to do with the output. This is why I think another option would be beneficial, so that Vim knows what to do with the output string.

In a nutshell, the new option would work exactly like &diffexpr, except that:
- it would not read in and new from disk, but from the buffers instead.
- it would not write out to disk, but return a string instead.
- it would not execute an external command, but vimscript instead.

All in all, the new option would be more powerful, because the current behaviour of &diffexpr could very well be reimplemented in terms of the new option.
We can also avoid generating the diff output and parsing it back. For every diff hunk we need:
Thus it would be a list of lists. When deleting line 4 and inserting a line below 8 you would have: [[4, 1, 4, 0], [8, 0, 7, 1]]
On the Vim side this isn't too difficult. Main work is to check the diff hunks are valid.
-- (letter from Mark to Mike, about the film's probable certificate) I would like to get back to the Censor and agree to lose the shits, take the odd Jesus Christ out and lose Oh fuck off, but to retain 'fart in your general direction', 'castanets of your testicles' and 'oral sex' and ask him for an 'A' rating on that basis. "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD
/// Bram Moolenaar -- @.*** -- http://www.Moolenaar.net \\ /// \\ \\ sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ /// \\ help me help AIDS victims -- http://ICCF-Holland.org ///
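A minimal sketch of what "checking that the diff hunks are valid" could look like, written in Vim script. The hunk layout [old_lnum, old_count, new_lnum, new_count] is an assumption inferred from the examples in this thread, the function name HunksAreValid is hypothetical, and the rules it enforces (four non-negative numbers, at least one non-zero count, hunks ordered and non-overlapping on both sides) are a guess at reasonable checks, not what Vim itself does.

```vim
" Hypothetical helper: returns 1 if {hunks} looks like a well-formed list of
" [old_lnum, old_count, new_lnum, new_count] entries, 0 otherwise.
" The field layout is an assumption based on the examples in this thread.
function! HunksAreValid(hunks) abort
  if type(a:hunks) != v:t_list
    return 0
  endif
  let l:old_end = 0
  let l:new_end = 0
  for l:hunk in a:hunks
    " Each hunk must be a list of exactly four non-negative numbers.
    if type(l:hunk) != v:t_list || len(l:hunk) != 4
      return 0
    endif
    for l:n in l:hunk
      if type(l:n) != v:t_number || l:n < 0
        return 0
      endif
    endfor
    let [l:old_lnum, l:old_count, l:new_lnum, l:new_count] = l:hunk
    " A hunk that changes nothing on either side is meaningless.
    if l:old_count == 0 && l:new_count == 0
      return 0
    endif
    " Hunks must be ordered and must not overlap on either side.
    if l:old_lnum < l:old_end || l:new_lnum < l:new_end
      return 0
    endif
    let l:old_end = l:old_lnum + l:old_count
    let l:new_end = l:new_lnum + l:new_count
  endfor
  return 1
endfunction
```

With these assumed rules, the example list [[4, 1, 4, 0], [8, 0, 7, 1]] passes, while the same two hunks in reverse order are rejected.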
> > When deleting line 4 and inserting a line below 8 you would have: [[4, 1, 4, 0], [8, 0, 7, 1]]
>
> Oh, I understand that this is the internal representation of the diff actually used within vim when displaying the diffed buffers, right? Is this actually produced by internal calls to xdiff functions?
>
> What about "changed" lines? Where can I find documentation about this format?

It's similar to "ed" diffs. If one line is changed both the old and new line count are the same: [4, 1, 4, 1]
> Are you suggesting that this structure be:
> - an input to the user-defined expression? In this case, I would be hooked too far down the process and I could not control how these hunks are calculated.
> - or the expected output of the user-defined expression? In this case, is there a vim API I could use to calculate valid hunks from strings? Something like xdiff(string_a, string_b, &diffopts)?
>
> (As a reminder: my intent is to split the diffed buffers into contiguous pieces and diff them piece by piece instead of the whole buffers at once. So I would eventually need to produce a list of list of chunks and merge it into one consistent list of chunks by myself.)

It's the output. You need to calculate it yourself, on a line-by-line basis. Changes within a line are not located.
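The merging step mentioned in the reminder above could be sketched as follows, in Vim script. This assumes each piece was diffed on its own, so its hunks use line numbers relative to the piece (starting at 1), and that a hunk is laid out as [old_lnum, old_count, new_lnum, new_count]. The function name MergePieces and the dictionary keys old_start, new_start and hunks are hypothetical, chosen only for illustration.

```vim
" Hypothetical helper: {pieces} is a list of dictionaries, one per
" independently diffed piece, in buffer order:
"   old_start  first line of the piece in the old buffer
"   new_start  first line of the piece in the new buffer
"   hunks      hunks relative to the piece, assumed to be
"              [old_lnum, old_count, new_lnum, new_count]
" Returns one list of hunks expressed in whole-buffer line numbers.
function! MergePieces(pieces) abort
  let l:merged = []
  for l:piece in a:pieces
    " Line 1 of the piece is line old_start / new_start of the buffer.
    let l:old_offset = l:piece.old_start - 1
    let l:new_offset = l:piece.new_start - 1
    for l:h in l:piece.hunks
      call add(l:merged,
            \ [l:h[0] + l:old_offset, l:h[1], l:h[2] + l:new_offset, l:h[3]])
    endfor
  endfor
  return l:merged
endfunction
```

For instance, a hunk [2, 1, 2, 0] found in a piece that starts at line 28 of the old buffer and line 45 of the new one would become [29, 1, 46, 0] in the merged list. Because pieces are contiguous and processed in order, the merged hunks stay ordered as long as each piece's own hunks are.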
> In any case, the user would still be missing information about the buffers currently being diffed together (up to 8). So like the v:fname_{in,new,out} variables available to &diffexpr, the following variables would be available to the second option: v:bufnr_original, v:bufnr_new, .. or v:bufnrs containing up to 8 [bufnr_1, bufnr_2, bufnr_3, ..], depending on how vim works with 3+way diffs.
It's probably better to have the user specify a function to call, like we have for other options. This function would be passed the list of buffer numbers, and return the list of lists with changes. We may also need to specify the tabpage, but most likely it's always the current tabpage.
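To illustrate the calling convention described here, the following Vim script sketch shows a deliberately trivial callback: it receives a list of buffer numbers, reads the buffers directly with getbufline(), and returns a list of hunks. The function name WholeBufferDiff is hypothetical, no option exists yet to register it, and the hunk layout [old_lnum, old_count, new_lnum, new_count] is assumed from the examples earlier in the thread. It only handles the first two buffers and reports everything as one changed hunk, so it stands in for a real diff algorithm rather than being one.

```vim
" Hypothetical callback: {bufnrs} is the list of buffer numbers being
" diffed. Only the first two buffers are considered in this sketch.
function! WholeBufferDiff(bufnrs) abort
  " getbufline() reads lines straight from the buffers, without any
  " temporary files. It returns an empty list for an unloaded buffer.
  let l:old = getbufline(a:bufnrs[0], 1, '$')
  let l:new = getbufline(a:bufnrs[1], 1, '$')
  " Identical contents: no hunks at all.
  if l:old ==# l:new
    return []
  endif
  " Otherwise report the whole of both buffers as one changed hunk.
  return [[1, len(l:old), 1, len(l:new)]]
endfunction
```

A real implementation would replace the last step with a line-by-line comparison, but the input (buffer numbers) and output (list of lists) would keep the same shape.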
Okay great, it seems that we are converging towards the following (bikeshed-able) design, aren't we?

    " Custom 'internal diff function'.
    set idifffn=MyDiff
    function MyDiff(bufnrs, tabpage)
      " Read from the given buffers.
      " Calculate diffs the way we like.
      " Calculate and return one list of chunks per buffer:
      " the first for bufnrs[0], the second for bufnrs[1], and so on.
      return [
            \ [chunk, chunk, chunk, ...],
            \ [chunk, chunk, ...],
            \ ...
            \ ]
    endfunction
I suppose that option &idifffn, when set, would take precedence over &diffexpr if also set?

I find that there is still one weakness to the new option that makes it not easy/friendly to use. Consider &diffexpr for example. In :h diff-diffexpr, we can read that the following setting:

    set diffexpr=MyDiff()
    function MyDiff()
       let opt = ""
       if &diffopt =~ "icase"
         let opt = opt . "-i "
       endif
       if &diffopt =~ "iwhite"
         let opt = opt . "-b "
       endif
       silent execute "!diff -a --binary " . opt . v:fname_in . " " . v:fname_new .
            \  " > " . v:fname_out
       redraw!
    endfunction

does "almost the same as 'diffexpr' being empty". It is easy to write because we delegate all the diffing logic to the external diff program.
As far as I am aware, we could not write such a simple example with &idifffn yet, because there is no way to invoke Vim's internal diffing logic (i.e. producing valid chunks from strings) via the vimscript API. I think this would be something worth adding along with the new option. Something like:

    diffchunks(strings [, {opts}])                          *diffchunks()*
            Calculate diff chunks from up to 8 files given as a list of strings.
            The result is ready for use as output of |idifffn| expression.
            The diff options and algorithm used are the ones
            specified in |diffopt|, unless {opts} is given.
Is your feature request about something that is currently impossible or hard to do? Please describe the problem.

According to :h diff-diffexpr, the variables v:fname_{in,new,out} are available within the custom diffing expression to find the files being currently processed on disk. Unfortunately, this information is not sufficient to find out which buffers actually correspond to in or new, especially when diffing more than 2 buffers together. This prevents me from customizing &diffexpr the way I want, with specific markers set into the various buffers to force alignment of specific lines.

Describe the solution you'd like

New variables v:bufnr_in and v:bufnr_new would be available within the expression, pointing to the buffers currently being diffed.

Describe alternatives you've considered

Instead of shelling out and performing I/O accesses, &diffexpr would directly invoke the internal diff engine in a custom way, with the information v:bufnr_in and v:bufnr_new still being available.

Additional context

My ultimate intent is being able to mark specific lines when diffing two misaligned files A and B. E.g. after marking lines A:27 and B:44, the files would be split into three parts independently diffed (i.e. A:1-26 against B:1-43, A:27-27 against B:44-44 and A:28-end against B:45-end) and then the results concatenated into one single effective diff. This way, I would be able to "force" alignment of certain lines the way I want.
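The splitting step described above could be sketched like this in Vim script. The function name SplitAtMarks and its arguments are hypothetical. It assumes the marked pairs are given as [A_line, B_line], sorted and strictly increasing on both sides, and it represents each piece as a pair of inclusive [first, last] ranges, where a range with last < first means that side of the piece is empty.

```vim
" Hypothetical helper: {marks} is a list of [a_lnum, b_lnum] pairs that
" must be aligned, assumed sorted and strictly increasing on both sides.
" {a_last} and {b_last} are the last line numbers of files A and B.
" Returns a list of [[a_first, a_last], [b_first, b_last]] pieces.
function! SplitAtMarks(marks, a_last, b_last) abort
  let l:pieces = []
  let l:a_next = 1
  let l:b_next = 1
  for [l:a_mark, l:b_mark] in a:marks
    " Lines before the marked pair, if there are any on either side.
    if l:a_mark > l:a_next || l:b_mark > l:b_next
      call add(l:pieces,
            \ [[l:a_next, l:a_mark - 1], [l:b_next, l:b_mark - 1]])
    endif
    " The marked pair itself is a piece of its own, forcing alignment.
    call add(l:pieces, [[l:a_mark, l:a_mark], [l:b_mark, l:b_mark]])
    let l:a_next = l:a_mark + 1
    let l:b_next = l:b_mark + 1
  endfor
  " Whatever remains after the last marked pair.
  if l:a_next <= a:a_last || l:b_next <= a:b_last
    call add(l:pieces, [[l:a_next, a:a_last], [l:b_next, a:b_last]])
  endif
  return l:pieces
endfunction
```

For the example above, with a single mark [27, 44] and files of, say, 100 and 120 lines, this yields the three pieces A:1-26 / B:1-43, A:27-27 / B:44-44 and A:28-100 / B:45-120. Each piece would then be diffed independently and the results concatenated.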