Open nonumeros opened 3 years ago
What is the problem, exactly? Steps, outcome, expected, versions etc.
@lenormf I first installed the release from the official repo (see the image and I also tried with the official pacman), , then I downloaded
https://github.com/mawww/kakoune/releases/tag/v2020.09.01 as I've done in the past. Calling out :spell es
or :spell es_ES
or :spell es-ES
would not highlight anything on the *scratch*
buffer on neither iteration, whether it'd be the compiled version from the tarball release, or from the official repo by @maximbaz
Only yours kakoune-git
seemed to work.
Then as I went back and forth, I realized that since yours had worked with aspell es
, something had broken along the way.. The commit that did it, I don't know. I just downloaded the most recent day-old kakoune and had no problems.
@lenormf what I said above stands. But I noticed the date of your last update with kakoune-git
and it says
Last Updated: | 2020-05-11 07:47
Well, I think the format is May?? mmm… well, the comments threw me off then.
And now I see that by comparing it against the number on kakoune-git
it matches a 60154300 Support piping data to client stdin from the log.
Anyhow. What I said stands. Yours worked. The official repo didn't.
kakoune-git
package builds the latest code from master branch, kakoune
package follows the latest Github release, so the one from September. If it's a small patch to fix aspell support in the latest release and you can identify it, I can add it to the package, otherwise let's just wait for the next release 🙂 Would you like to try to git bisect
to find the first commit where the issue was fixed?
The first one that got it fixed between v2020.01.16 and v2020.08.04 was 67662976 because e83ad2a2 broke it.
The second one that seemed to fix it was dacaad4e because anything prior to that was not working either.
e83ad2a2a4c78e6c1b753664942f2ddf744a85f3 is the first bad commit commit e83ad2a2a4c78e6c1b753664942f2ddf744a85f3 Date: Mon Mar 9 14:22:34 2020 +0300
rc spell: Re-implement message processing in Awk
Plain shell takes too long on large files.
Fixes #3399
rc/tools/spell.kak | 69 +++++++++++++++++++++++++++++++++---------------------
It could be that your implementation of Awk doesn't support C++ style comments, which 94cdd3f9e3d8394a31515ad24e9ad226a225e9ac fixes. Or some undocumented control character produced by Aspell, fixed by ff8d4d6567cc0db94d05bef25b7979ddf34dfb78.
Could you check if the above commits fix the issue for you? If either work, a simple cherry-picking on the side of the package maintainers might be all that's needed. But ideally a new release would be the best way of handling this.
@lenormf sorry give me a second. I think you're right with the ff8d4d6 as it seems to have taken care of it.
I missed this one Bisecting: 0 revisions left to test after this (roughly 0 steps) [a7931fa0f7
which was the one prior to the ff8d4d6 My mistake (I had a few sessions open.)
I'l check again but you're right. It seems to have fix it.
@mawww Could you please close this with a reply stating what should be done here?
Happens for me in v2021.11.07. Not sure if this is the same issue. If you say so, I'll file it separately.
File on which to reproduce (in the Russian language, with aspell
, language ru
):
ПривXт, тXт ошибXа
Contents of spell_regions
:
22 '1.1,1.6|DiagnosticError' '1.14,1.16|DiagnosticError' '1.20,1.25|DiagnosticError' █
Only parts of the file are highlighted, even though all words contain an error in the form of an X mark. Looks like this: "ПривXт, тXт ошибXа" Picture:
I think it's unrelated to this issue.
The coordinates in spell_regions
are supposed to be of the form:
.
,
.
In UTF-8 encoding, ПривXт
is 12 bytes long, so the region 1.1,1.6
should highlight just the first half. It looks like the "ending byte" is actually being reported as "characters", not "bytes".
On the other hand, since ПривXт
is 12 bytes long, тXт
does start at byte 14, so the "starting byte" is correctly being reported in bytes.
Looking at the code, the starting byte comes from aspell's output, but the ending byte is calculated by awk's length()
function:
And it turns out that different awk implementations implement length()
differently:
$ gawk 'BEGIN { print length("ПривXт"); }'
6
$ mawk 'BEGIN { print length("ПривXт"); }'
11
Specifically, gawk
(GNU Awk) is the only implementation that treats strings as a sequence of characters in the current encoding; every other implementation (mawk, nawk, whatever the BSDs use, original Bell Labs awk) treats strings as a sequence of bytes, regardless of encoding.
If you're using a Debian-based system, you can uninstall gawk
and install awk
and things should work properly (and much faster!). Presumably other Linux distros have something similar.
Thanks for the analysis! I'm not on a Debian-based system, but managed to solve the issue by replacing the call to awk
in the script with a call to env LANG=C awk
. It did the trick for gawk, though I'm not sure how other awk implementations would react.
I'm just wondering if there could be a new release that addresses an issue I've been having problems with. Namely, aspell support.
As of this writing and with the current head, this is not an issue. (As far as I can tell so far.)
As it is, the https://github.com/mawww/kakoune/releases/tag/v2020.09.01 release breaks aspell with other languages.
Someone might say that I'd be better off by taking it up with the distribution maintainers in question.
The problem with this is the following. The official repo from the distribution that provides kakoune, is based on every release of the project.
https://github.com/mawww/kakoune/releases/download/v${pkgver}/kakoune-${pkgver
so, I'm not so sure they will change the above line, anytime soon.For an earlier commit available, there's another package provided by the AUR community which in turn is based on an earlier version of kakoune, which I highly recommend. It didn't seem to have any issue with aspell. @lenormf is the maintainer there.
But now, in order to have a functioning kakoune with aspell, I need to have another package manager, rather than going to the official repo of the distribution, if I wanted to. There's an app for that, as the saying goes.
I also tested and retested it as many times as possible, and I can almost assure you, that aspell with other languages, is not working. I also downloaded the release, compiled it, and no matter what I tried, it seemed to fail every time.
On the other hand, I could flag the package directly with the distro as out of date, or whatever the convention for out of date is. Furthermore, the distribution specifies that, quote, If you notice a package is out-of-date (i.e., there is a newer stable release available), then please notify us using the form below. Do not report bugs via this form!, endquote
But this is ambiguous. What I might consider stable, it may not be for the rest. And viceversa.
And if there's a release, whether is unstable or not, is not applicable here either. For there's none.
The only reason I'm asking for this, is for the simple reason that a release — whatever amount of time it takes — might take months, before becoming available. Right? In the meantime, with the latest release, aspell does not work or works partially, since it breaks identifying other languages installed in the system.