Actual behavior
When using ripgrep (rg) or ag on Linux with helm-ag, matching files that contain CR-LF (Windows-style) line endings, and running helm-ag-edit on them, the trailing carriage return (CR) of each matched line is duplicated on commit.
Expected behavior
Carriage return is not duplicated.
Steps to reproduce
Ensure that the variable helm-ag-base-command runs rg or ag on Linux, not Windows
In a new directory, create a file win.txt using an encoding with DOS line endings, e.g. utf-8-dos
Insert some lines containing the test text "test"
Create another file unix.txt using an encoding with Unix line endings, and insert the same text
Run helm-do-ag on that directory
Search for "test"
Search results from win.txt should show a ^M at the end
Run helm-ag-edit
Maybe edit something (not required), and commit
Open win.txt: you should see a duplicated CR at the end of each matched line. It is displayed as a single ^M, because the other CR is hidden by the DOS-style encoding in Emacs. If the file is opened literally (find-file-literally) or the buffer encoding is reverted to a Unix encoding, both CRs are visible
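The two test files from the steps above can also be created from Emacs Lisp. This is only a sketch: the helper name my/helm-ag-crlf-fixture and the sample text are assumptions made for illustration and are not part of helm-ag.

```elisp
;; Sketch only: `my/helm-ag-crlf-fixture' is a hypothetical helper,
;; not part of helm-ag.
(defun my/helm-ag-crlf-fixture ()
  "Create a fresh directory holding win.txt (CR-LF) and unix.txt (LF).
Return the directory name."
  (let ((dir (make-temp-file "helm-ag-crlf-" t))
        (text "test\ntest again\n"))
    ;; `coding-system-for-write' overrides the encoding used by `write-region'.
    ;; With a DOS coding system every LF is written as CR-LF.
    (let ((coding-system-for-write 'utf-8-dos))
      (write-region text nil (expand-file-name "win.txt" dir) nil 'silent))
    ;; With a Unix coding system every LF is written unchanged.
    (let ((coding-system-for-write 'utf-8-unix))
      (write-region text nil (expand-file-name "unix.txt" dir) nil 'silent))
    dir))
```

Evaluating (my/helm-ag-crlf-fixture) returns a directory name that can then be given to helm-do-ag for the remaining steps.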
Analysis
The matches reported by ag and rg on Linux for files with CR-LF line endings contain a trailing CR, so helm-ag and helm-ag-edit treat it as part of the line content. When helm-ag--edit-commit is called, the file's encoding, including its line ending style, is properly recognized by insert-file-contents. For CR-LF files this means that the CR belongs to the line terminator rather than to the line text, so the code that clears the original line content ((delete-region (line-beginning-position) (line-end-position))) leaves the CR of the original line intact. Then the potentially edited line content from the edit buffer, with its trailing CR, is inserted, and we end up with CR-CR-LF at the end of the line.
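The mechanism can be reproduced without helm-ag in a small sketch. It assumes that an explicit utf-8-dos decode and encode is a fair stand-in for the detection done when the file is read and for the write-back on commit; my/crlf-commit-demo is a hypothetical name used only for illustration, not a helm-ag function.

```elisp
;; Sketch only: `my/crlf-commit-demo' is a hypothetical name, not part of helm-ag.
(defun my/crlf-commit-demo (raw-file-bytes reported-line)
  "Return the bytes written back after replacing line 1 with REPORTED-LINE.
RAW-FILE-BYTES is the file content as stored on disk."
  (with-temp-buffer
    ;; Assumed stand-in for reading a file detected as DOS-encoded:
    ;; decoding turns each CR-LF into a single LF in the buffer.
    (insert (decode-coding-string raw-file-bytes 'utf-8-dos))
    (goto-char (point-min))
    ;; The clearing call quoted in the analysis. The original CR is not
    ;; buffer text here, so it is not removed.
    (delete-region (line-beginning-position) (line-end-position))
    ;; The line as reported by the search tool, trailing CR included.
    (insert reported-line)
    ;; Writing back with DOS line endings turns each LF into CR-LF again.
    (encode-coding-string (buffer-string) 'utf-8-dos)))

;; File on disk holds "test" + CR-LF; the search tool reports "test" + CR.
(my/crlf-commit-demo "test\r\n" "test\r")
;; evaluates to "test\r\r\n", i.e. CR-CR-LF
```

Passing "test" without the trailing CR as the second argument gives back "test\r\n", which matches the observation that the problem only appears when the reported line carries the CR.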
The scenario with Emacs and rg/ag both running on Windows works fine. I'm not sure whether that is due to the way shell output is read, or to rg/ag themselves.
I suppose this isn't really a helm-ag bug, but a consequence of ag and rg not trying to be smart about line ending style (they should be fast, after all). I'm not sure what to make of this; I'm mainly posting it here for others to find. Maybe add it to the README or wiki? So far I'm content with the following workaround:
Workaround
This only applies when running rg/ag on Linux; with the Windows versions of rg/ag it would mess up line endings (CR-LF to LF).
helm-ag--last-command
("rg" "--smart-case" "--no-heading" "--color=never" "--line-number" "--max-columns=512" "--ignore=.#*" "--ignore=*.o" "--ignore=*~" "--ignore=*.bin" "--ignore=*.lbin" "--ignore=*.so" "--ignore=*.a" "--ignore=*.ln" "--ignore=*.blg" "--ignore=*.bbl" "--ignore=*.elc" "--ignore=*.lof" "--ignore=*.glo" "--ignore=*.idx" "--ignore=*.lot" "--ignore=*.fmt" "--ignore=*.tfm" "--ignore=*.class" "--ignore=*.fas" "--ignore=*.lib" "--ignore=*.mem" "--ignore=*.x86f" "--ignore=*.sparcf" "--ignore=*.dfsl" "--ignore=*.pfsl" "--ignore=*.d64fsl" "--ignore=*.p64fsl" "--ignore=*.lx64fsl" "--ignore=*.lx32fsl" "--ignore=*.dx64fsl" "--ignore=*.dx32fsl" "--ignore=*.fx64fsl" "--ignore=*.fx32fsl" "--ignore=*.sx64fsl" "--ignore=*.sx32fsl" "--ignore=*.wx64fsl" "--ignore=*.wx32fsl" "--ignore=*.fasl" "--ignore=*.ufsl" "--ignore=*.fsl" "--ignore=*.dxl" "--ignore=*.lo" "--ignore=*.la" "--ignore=*.gmo" "--ignore=*.mo" "--ignore=*.toc" "--ignore=*.aux" "--ignore=*.cp" "--ignore=*.fn" "--ignore=*.ky" "--ignore=*.pg" "--ignore=*.tp" "--ignore=*.vr" "--ignore=*.cps" "--ignore=*.fns" "--ignore=*.kys" "--ignore=*.pgs" "--ignore=*.tps" "--ignore=*.vrs" "--ignore=*.pyc" "--ignore=*.pyo" "--ignore=SCCS" "--ignore=RCS" "--ignore=CVS" "--ignore=MCVS" "--ignore=.src" "--ignore=.svn" "--ignore=.git" "--ignore=.hg" "--ignore=.bzr" "--ignore=_MTN" "--ignore=_darcs" "--ignore={arch}" "installation")