Closed melias122 closed 3 years ago
[ You skip some parts of the template that the template asks you not to skip. You are missing a minimal reproduction recipe. I need to be able to reproduce the error. Please, next time, don't skip the parts that say DO NOT SKIP. Alternatively, open a discussion]
Notheless I think this might be related to https://github.com/joaotavora/eglot/issues/645 or https://github.com/joaotavora/eglot/issues/633. Is this project publicly available somewhere?
@melias122 do you have a tsconfig.json/jsconfig.json file? can you post it?
@joaotavora, sorry for that. Repository is private, but I will try to make reproducible repo. I will also try v1.7
if it happens there.
@danielpza
jsonconfig.json
{
"include": [
"./src/**/*"
]
}
@melias122 I wish you could also try the scratch/fix-697-attempt
branch. Maybe try that before working on the reproducible repo.
I'd be very thankful if you can test this, since the issuers of the other issues didn't follow through.
Just tried scratch/fix-697-attempt
and it is working fine with no freezes. I would be nice if there is a way to specify directories to ignore. Or maybe just follow .gitignore
.
Or maybe just follow .gitignore.
That's an idea...
Or maybe just follow .gitignore.
lsp-mode
has a customizable lsp-file-watch-ignored
list variable for this. That works well, imo.
Yes, lsp has knobs for a lot of things, but asking the user to customize yet another knob in what should be an out-of-box experience is sub-optimal. So if following .gitignore works for the majority of cases, I welcome that simplicity.
Simplicity is why I love eglot. As far as logic for specific version control systems, wonder if there is some inspiration (or code) to be found in Projectile VCS support.
Yes, Dmitry Gutov has suggested using project.el for this, so this is probably what I'll try soon.
On Wed, May 26, 2021, 16:12 Jarred Ward @.***> wrote:
Simplicity is why I love eglot. As far as logic for specific version control systems, wonder if there is some inspiration (or code) to be found in Projectile VCS support https://docs.projectile.mx/projectile/2.3/projects.html#version-control-systems .
— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/joaotavora/eglot/issues/697#issuecomment-848853708, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAC6PQYB75YYKOQRFUO5Q6LTPUFVRANCNFSM45SBZGLA .
Pushed another commit making use of project.el for this. It's already a dependency of eglot.el
anyway and is the rightful module to ask about ignored directories.
Tried, now emacs pauses for ~2 seconds (which is not bad, but still there is pause), but on 835aa0e, there was no pause. My root .gitignore has:
**/node_modules
With empty .gitignore it is also ~2 seconds. Any clue why?
Try evaluating these:
(benchmark 1 '(project-files (project-current)))
and
(benchmark 1 '(delete-dups (mapcar #'file-name-directory (project-files (project-current)))))
...and tell us the numbers.
Hmm, I used @dgutov 's suggestion which is faster in some cases but probably slower in others. For example in a full repo of the snowpack
repository, it's faster by about 3x, but I guess in your example it's not worth it.
As I wrote:
The directory-finding logic is probably a bit slower than using
eglot--directories-recursively, but since it honours `.gitignores` and
ignores more directories it's much faster overall. And guaranteed to
create less watchers.
I guess, I'll have to some a bit more work to try and get the best of both worlds.
@dgutov my theory if you have a directory with many many files, calling file-name-directory
for each one is going to allocate a lot of strings. If project-files
were to also return directories in the first place, both that allocation and the NlogN (at best) delete-dups
would be avoided.
It is slower than find
, but I never compared against a Lisp-based directory enumerator. It's possible it is slower, of course, but even in gecko-dev
(a huge project) the latter form prints "1.936423s" on my machine.
OTOH, it's possible that the list of directories returned by the new approach is longer, and you end up spending more time setting up watches.
OTOH, it's possible that the list of directories returned by the new approach is longer, and you end up spending more time setting up watches.
It's likely shorter, since it ignores more stuff. In the snowpack (https://www.snowpack.dev/) repo your approach was faster by about 3x than eglot--directories-recursively
(which I have now deleted, but will bring back soon). That's because eglot--directories-recursively
returns twice as many directories (and the excess directories were quite useless).
calling file-name-directory for each one is going to allocate a lot of strings
It's definitely not the most optimal approach, but it shouldn't be too terrible for a one-time action on most small-to-moderate sized projects.
If project-files were to also return directories in the first place
We can add a new defgeneric, sure. Anyone is welcome to suggest an optimal implementation of it for a Git backend.
Following up to that, if @melias122 's .gitignore
is only **/node_modules
and nothing else, both approaches will return the same dirs, but the "make a big string with full dir name for each file then remove duplicates" approach is likely slower.
We can add a new defgeneric, sure. Anyone is welcome to suggest an optimal implementation of it for a Git backend.
I'm not sure about the need for defgeneric
or what that means.
I just think that if eglot--directories-recursively
could consult project.el
's knowledge of what constitute "ignored" directories we would be golden. Maybe that is what project-ignores
does?
If I manage that, I guess, eglot--directories-recursively
could become project-directories
.
@dgutov
project-ignores is a compiled Lisp function in `project.el'.
(project-ignores PROJECT DIR)
Return the list of glob patterns to ignore inside DIR.
Patterns can match both regular files and directories.
But how to match them? What datatype exactly is "glob" pattern, is it a regexp?
Oh I see they are actual glob strings, presumably of the find
format. And I see project-files
uses the find
util and passes the globs down... Hmmm I wonder if this isn't going to be slow on windows/mingw...
My current thinking is to use eglot--directories-recursively
, and compile the globs using Eglot's glob matcher (which is supposed to be fast) to do the filtering.
In the meantime, I'm reverting the last commit.
but the "make a big string with full dir name for each file then remove duplicates" approach is likely slower
Don't forget about the extra overhead of several Lisp calls for every directory you are traversing. So I wouldn't be sure.
I'm not sure about the need for defgeneric or what that means.
A new generic function acting on project instances.
Oh I see they are actual glob strings, presumably of the find format.
Or of .gitignore
format (with slight adjustment, as documented).
Don't forget about the extra overhead of several Lisp calls for every directory you are traversing. So I wouldn't be sure
I'm reasonably sure that's the case. The number of directories is dramatically smaller than the number of file. So if all other things equal, traversing directories is much faster.
The snowpack repo example was especially benevolent to your approach, likely because of the large .gitignores
file.
However my ~/tmp
directory more closely matches the case being described by @melias122
(defun eglot--directories-recursively (&optional dir)
"Because `directory-files-recursively' isn't complete in 26.3."
(cons (setq dir (expand-file-name (or dir default-directory)))
(cl-loop with default-directory = dir
with completion-regexp-list = '("^[^.]")
for f in (file-name-all-completions "" dir)
if (file-directory-p f)
append (eglot--directories-recursively f))))
(defun eglot--directories-recursively-project (&optional dir)
(delete-dups (mapcar #'file-name-directory
(project-files
(or (project-current nil dir)
(cons 'transient dir))))))
(cl-set-difference
(eglot--directories-recursively-project "~/tmp/")
(eglot--directories-recursively "~/tmp/")
:test 'string=) ; nil
(length (eglot--directories-recursively "~/tmp/")); 678
(length (eglot--directories-recursively-project "~/tmp/")) ; 624 not sure why less...
(benchmark 10 '(eglot--directories-recursively-project "~/tmp/")) ; "Elapsed time: 3.594734s (0.419492s in 10 GCs)"
(benchmark 10 '(eglot--directories-recursively "~/tmp/")); "Elapsed time: 1.836761s (0.382498s in 9 GCs)"
Another way to go about this might be to simply go into project.el
and make a version of project-files
(called maybe project-directories
) that puts type d
instead of type f
in the find
call. If I hack that in, project-files
starts returning the directories I want:
(length (project-files (cons 'transient "~/tmp"))); 678
(benchmark 10 '(project-files (cons 'transient "~/tmp"))); "Elapsed time: 2.580373s"
So even find -type d
seems to be slower, though not as bad as current project-files
. For my heterogenous ~/tmp
dir, of course:
❯ find . -type f | wc -l
9932
❯ find . -type d | wc -l
678
Final thought on this:
find f
. Guess find
's C globs are much faster.snowpack
, seem to favour the find
/project-files
approach slightly.project-directories
addition to project.el
is desired. Should be faster than the current project-files
approach. Should actually be easy to implement.The more ignore entries there are, the slower the find
implementation becomes (Git does some smart file tracking, or pattern precompilation, IDK, that makes this way faster). Likewise, a hand-written tree traversal written in Lisp will become slower with more ignore rules added, because it's fairly difficult to write something other than a loop with comparisons for each directory name against each glob to ignore.
However, since this particular performance problem happens only in a subset of LSP situations, I think I'm going to revert the last commit. No point in having this subpar version for the only person who still is bothered by the performance problem.
No objections from me, but that's why I asked @melias122 to run a couple of basic benchmarks. There might be something else going on.
A project-directories addition to project.el is desired. Should be faster than the current project-files approach. Should actually be easy to implement.
A find
-based one? Like we have just demonstrated, I think, it can be faster in some projects, while slower in others.
Also, when you were testing in ~/tmp
, it wasn't a Git-backed directory, so you weren't taking advantage of git ls-files
's performance boost.
because it's fairly difficult to write something other than a loop with comparisons for each directory name against each glob to ignore.
Not only difficult, but it would seem downright impossible. More ignore rules means less things to traverse. Anyway, find
-based on Elisp-based should suffer proportionally here.
The point I hope to have demonstrated with the ~/tmp
benchmark is that allocating a bunch of useless strings for a 600 dir, 10000 file case can add a lot of time and that it's fairly easy to get that time back.
Also, when you were testing in ~/tmp, it wasn't a Git-backed directory, so you weren't taking advantage of git ls-files's performance boost.
Ah that, may be it indeed.
A find-based one? Like we have just demonstrated, I think, it can be faster in some projects, while slower in others.
Yes, but if you combine it with the git ls-files
thing it should be the fastest. That's why I showed the "hacked project-files
with type d
" case. Let's wait for @melias122 numbers of .
@dgutov "Elapsed time: 1.147619s"
and "Elapsed time: 1.225180s"
.
EDIT: When I run them both multiple times they are ~1.2s
.
@joaotavora attaching project file in project just find .
. Hope it helps. p.zip
Thanks, you have 70k files, which is large but not unheard of.
"Elapsed time: 1.147619s" and "Elapsed time: 1.225180s".
These don't suggest any significant impact of the string allocation (though it seems to be there, and systematically so).
Can you rewind to https://github.com/joaotavora/eglot/commit/835aa0e9b351e371be82f0661e438b13e641a5eb, where you say there was "no pause", and evaluate (benchmark-run '(eglot--directories-recursively "path-to-your-project"))
and tell us the times?
@joaotavora on https://github.com/joaotavora/eglot/commit/835aa0e9b351e371be82f0661e438b13e641a5eb it is 1.8849999999999997e-06
.
Hmm, that seems just a tad too suspiciously fast, can you paste the results of evaluating each of
(length (eglot--directories-recursively "path-to-your-project"))
(length (delete-dups (mapcar #'file-name-directory (project-files (project-current))))
Both returned: 2 (#o2, #x2, ?\C-b)
Hmmm, it seems you're not running this in the correct directory, maybe it's the instructions I gave you, which are confusing. Go to your scratch buffer and do this first:
M-x cd RET /path/to/the/root/directory/of/the/project/thats/giving/you/trouble
Now run again, but without the "path-to-your-project" as earlier:
M-: (length (eglot--directories-recursively)) RET
M-: (length (delete-dups (mapcar #'file-name-directory (project-files (project-current)))) RET
It should give you some number, like 100-300 (around the number of dirs in your project I guess).
When you confirm that it's returning plausible results, please run the two benchmarks again, also with M-:
and in the *scratch*
buffer. Just replace the length
by benchmark 1
and add a '
before the form.
@joaotavora
Yes, but if you combine it with the git ls-files thing it should be the fastest.
Not sure I understand. How to combine them? Either we use git ls-files
(and then postprocess the strings), or find -type d
.
@joaotavora
218 (#o332, #xda)
7663 (#o16757, #x1def)
git ls-files | xargs -n 1 dirname | uniq | wc -l
also gives 218
Not sure I understand. How to combine them? Either we use git ls-files (and then postprocess the strings), or find -type d.
Right, they're not combinable. I was confused. The git ls-files
output seems pre-sorted though, so if the string overhead is the is what I saw earlier, it may make sense to use this to our advantage. But clearly, something else is going on here.
@melias122
Ah, that makes much more sense. And it also means that project-files
is somehow failing at its mission.
The git ls-files output seems pre-sorted though, so if the string overhead is the is what I saw earlier, it may make sense to use this to our advantage.
Yeah, it might be amenable to some clever parsing, I suppose.
And it also means that project-files is somehow failing at its mission.
I suppose the next step is to compare the actual outputs and see what's included that shouldn't be. And why.
Compared to "plain" ls-files
, project--vc-list-files
also includes untracked files, for obvious reasons (ones not mentioned in .gitignore
). And thus, if @melias122 has a lot of such files and directories, that would explain the difference.
But I don't see why eglot--directories-recursively
would ignore them, though.
I suppose the next step is to compare the actual outputs and see what's included that shouldn't be. And why.
@melias122 can you try
(cl-subseq
(cl-set-difference (delete-dups (mapcar #'file-name-directory (project-files (project-current))))
(eglot--directories-recursively)
:test #'string=)
0 30)
And post the output here?
But I don't see why eglot--directories-recursively would ignore them, though.
It ignores directories that start with .
, an heuristic (which appears to work here).
It ignores directories that start with ., an heuristic (which appears to work here).
Inspecting p.zip
, there's a .cache
directory but inside it there aren't 7000+ directories.
Inspecting p.zip, there's a .cache directory but inside it there aren't 7000+ directories.
And it's inside node_modules
anyway, so I'm at a loss here.
@joaotavora That gave me:
("/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/babel-loader/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/eslint-loader/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/terser-webpack-plugin/content-v2/sha512/14/fb/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/terser-webpack-plugin/content-v2/sha512/5b/58/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/terser-webpack-plugin/index-v5/09/fb/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/terser-webpack-plugin/index-v5/f0/c6/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/.cache/vue-loader/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/@babel/code-frame/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/@babel/code-frame/lib/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/@babel/core/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/@babel/core/lib/config/" "/home/melias122/code/axonpro/mydoctor/web/adminapp/node_modules/@babel/core/lib/config/files/" ...)
@melias122 Any chance you don't have the corresponding entries in .gitignore
?
@dgutov I have **/node_modules
in root .gitignore
and also in adminapp/.gitignore
there is node_modules
.
I think the double star notation is not in all versions of Git. Maybe your shell is picking up one version and project.el is using another. Try just "node_modules".
João
On Thu, May 27, 2021, 13:28 Martin Eliáš @.***> wrote:
@dgutov https://github.com/dgutov I have **/node_modules in root .gitignore and also in adminapp/.gitignore there is node_modules.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/joaotavora/eglot/issues/697#issuecomment-849590265, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAC6PQ3N3HMFVDKWWYR5URTTPY3FFANCNFSM45SBZGLA .
I don't know, either seems sufficient to have the directory ignored in my local testing.
Even if just adminapp/.gitignore
contains node_modules
, (project-files (project-current))
does not return any files from within its contents. The only way this wouldn't work is if node_modules
is checked into the repository.
@joaotavora I tried both **/node_modules
in root and node_modules
in dirs where they are generated. None of theese works for me on master.
@dgutov why would we want to have node_modules
in the repository? :-)
Looking in project.el
it looks like only .gitignore
at the root is supported
(cl-defmethod project-ignores ((project (head vc)) dir)
...
;; FIXME: This seems to be Git-specific.
;; And / in the entry (start or even the middle) means
;; the pattern is "rooted". Or actually it is then
;; relative to its respective .gitignore (of which there
;; could be several), but we only support .gitignore at
;; the root.
why would we want to have node_modules in the repository? :-)
That would be a mistake, yes, but it's the only case I know of where Git ignores its own .gitignore
entries.
OTOH, you said that git ls-files
behaves as expected, so this is probably not it.
Looking in project.el it looks like only .gitignore at the root is supported
That only affects what project-ignores
returns, but the internal implementation of project-files
for project-vc doesn't use it.
Could you maybe M-x edebug-defun
on project--vc-list-files
? And see whether it receives some unexpected arguments.
You can also try calling git ls-files -c -o --exclude-standard
in the terminal.
Hmm, I think I found whats wrong. I noticed different results for (length (delete-dups (mapcar #'file-name-directory (project-files (project-current)))))
with and without projectile loaded 192
and 7658
.
I have this in my config (in use-pacage projectile). I am not sure why this is causing troubles:
(defun m/projectile-project-find-function (dir)
(let ((root (projectile-project-root dir)))
(and root (cons 'transient root))))
(with-eval-after-load 'project
(add-to-list 'project-find-functions 'm/projectile-project-find-function))
ah! nice. Kill that and we're good :-). Or maybe put it later in the list (what good is it anyway?).
Hey I am trying to move from lsp-mode to eglot but I have some issues one of the projects I currently work on. As title says when starting eglot in project containing node_modules (~50k files), emacs freezes for some time. When I remove node_modules emacs does not freeze. Just to note, lsp-mode does not have this issue.
LSP transcript - M-x eglot-events-buffer (mandatory unless Emacs inoperable)