oblitum closed this issue 6 years ago.
I gave this a very quick test with my YCM including the changes from #2657. It works much faster than before those changes, so could you give that PR a try? The PR implements async semantic completion.
Hi @bstaletic. Did you test it on Case 1, Case 2, or the output of Program 2?
I tried it on all three of those, but I couldn't get Program 1 to work as I don't have Windows installed at all.
I haven't tried #2657 yet, but I did check it before opening this issue. I'll try it, but given the scope of the pull request, I suspected it wouldn't touch any issues that may exist in the request/display process as a whole.
Notice that Program 1 was compiled and tested on Linux; I didn't use Windows for anything. Also notice that some of its flags may require changes for your machine, like header locations, etc. They're based on my `.ycm_extra_conf.py` flags.
Actually, my bad. I have not used anything that includes `windows.h`, which means I tested Case 1 and Program 2.
As for Program 1, I managed to get it to work. Completions are not instantaneous, but far from slow.
@bstaletic my bad, I've updated the issue. I had posted the wrong Case 1 snippet; it was meant to be based on https://github.com/Valloric/YouCompleteMe/issues/777#issuecomment-236626561 but I ended up removing the `using namespace`s. I'm unsure whether it will make any difference for you; it just gets a bit slower on my side on original YCM. The `windows.h` case is more problematic.
Still, I see almost no slowdown with the `using namespace` case. I think you should try #2657 and report back. The PR won't be merged until the end of the week.
@bstaletic I've tried #2657. Overall completion is much improved, kudos to @micbou! Still, it's behaving as I expected.
What it does:
What it doesn't do:
@oblitum Your observations seem reasonable, but, on my laptop, I just can't agree with "Completion arrives, but late". The initial parse and completions took about two seconds, but to make it that slow I had to use `echo 3 | tee /proc/sys/vm/drop_caches`. In any other case the completions were almost instant.
That said, I do believe that it can still feel slow for you (and others). But frankly, I have no idea where I would start looking to find the source of the slowness.
As for being resource expensive, Program 1 took ~70MiB at most, and I didn't notice any CPU hogging. Those results are right after dropping caches.
@bstaletic Here's a video demonstrating it: https://vimeo.com/219545420. I've updated Program 2 to 60k identifiers (double the original) to make the effect clearly visible. In Case 2, the bare inclusion of `windows.h` generates around 43k identifiers, and the effect can still be felt.
It's also worth noticing that #2657 introduces a new semantic completion behavior: after forced semantic completion, if I backspace to erase some characters of the current incomplete identifier, I'm forced to trigger semantic completion again to get the original semantic identifier ordering. I don't know the full extent of the effects of this, but some people may not like it.
I'm on a 7700K PC.
Sorry about the second part of the video for Case 2: you won't be able to tell when I hit Ctrl-Space, as I have no keyboard screencast software at the moment. I'll think about recording an update later.
I knew there was something wrong when I read about your CPU. I'm on an i7 3610QM, so I definitely have less raw power than you. What was wrong with my results was that I was looking at completions in the code that generates the large number of completions. Unlike you, I'm still hitting a timeout unless I type five characters in the generated file.
As for Case 1, I can't actually say it's too fast, but considering how big Boost's `bimap` is, I'm not too surprised.
Ah OK, you were testing the program, not the output. I hope others don't fall in the same trap :)
I noticed one more thing.
> It's also worth noticing that #2657 introduces a new semantic completion behavior: after forced semantic completion, if I backspace to erase some characters of the current incomplete identifier, I'm forced to trigger semantic completion again to get the original semantic identifier ordering. I don't know the full extent of the effects of this, but some people may not like it.
This doesn't happen with a low number of identifiers. Also, I had to hit `<C-Space>` after reinserting, not after deleting.
PR https://github.com/Valloric/ycmd/pull/774 should improve responsiveness and reduce CPU usage in Case 1 and Case 2.
Confirmed! It's much improved.
@micbou IMO it got so much more usable now, thanks. Your changes improved the semantic results a lot, but I also have other changes applied: #2657 and this local one:
```diff
diff --git a/cpp/ycm/IdentifierDatabase.cpp b/cpp/ycm/IdentifierDatabase.cpp
index 4a44bc7b..5e4e349a 100644
--- a/cpp/ycm/IdentifierDatabase.cpp
+++ b/cpp/ycm/IdentifierDatabase.cpp
@@ -111,7 +111,10 @@ void IdentifierDatabase::ResultsForQueryAndType(
     }
   }
 
-  std::sort( results.begin(), results.end() );
+  if ( results.size() < 50 )
+    std::sort( results.begin(), results.end() );
+  else
+    std::partial_sort( results.begin(), results.begin() + 50, results.end() );
 }
diff --git a/cpp/ycm/PythonSupport.cpp b/cpp/ycm/PythonSupport.cpp
index 3f2d3f1d..0da9334d 100644
--- a/cpp/ycm/PythonSupport.cpp
+++ b/cpp/ycm/PythonSupport.cpp
@@ -101,11 +101,16 @@ boost::python::list FilterAndSortCandidates(
     }
   }
 
-  std::sort( result_and_objects.begin(), result_and_objects.end() );
+  if ( result_and_objects.size() < 50 )
+    std::sort( result_and_objects.begin(), result_and_objects.end() );
+  else
+    std::partial_sort( result_and_objects.begin(),
+                       result_and_objects.begin() + 50,
+                       result_and_objects.end() );
 
-  for ( const ResultAnd< int > &result_and_object : result_and_objects ) {
-    filtered_candidates.append( candidates[ result_and_object.extra_object_ ] );
+  for ( size_t i = 0; i < result_and_objects.size() && i < 50; ++i ) {
+    filtered_candidates.append( candidates[ result_and_objects[ i ].extra_object_ ] );
```
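The idea behind the patch above, sketched in isolation: when only the top N candidates will ever be shown, `std::partial_sort` places just that prefix in sorted order in O(n log N), instead of sorting the whole candidate list in O(n log n). A minimal self-contained illustration (N = 3 here instead of ycmd's 50; `TopCandidates` is an illustrative name, not ycmd code):

```cpp
#include <algorithm>
#include <vector>

// Return the best `top_n` values in ascending order without fully
// sorting the whole candidate list.
std::vector<int> TopCandidates( std::vector<int> scores, size_t top_n ) {
  if ( scores.size() <= top_n ) {
    std::sort( scores.begin(), scores.end() );
    return scores;
  }
  // Only the first `top_n` elements end up sorted; the rest are in
  // unspecified order, so we drop them.
  std::partial_sort( scores.begin(), scores.begin() + top_n, scores.end() );
  scores.resize( top_n );
  return scores;
}
```

With 60k identifiers this avoids ordering tens of thousands of elements whose relative order would never be displayed, which is exactly why the popup feels snappier after the patch.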
@micbou I've noticed a large improvement in the Program 2 output case due to the previous patch, even before applying Valloric/ycmd#774.
Folks, feel free to close this if it reaches a good enough state. I've stuck with the improvements from my previous patch, but couldn't move further on my fork due to changes conflicting with #2657 regarding parameter hints. As far as I've tested, up to my previous patch I was already satisfied enough, despite not leveraging the new async system.
No time for looking into Vim completion at the moment at all.
@oblitum just to let you know we've implemented partial sorting in ycmd ;)
@vheon that's great!
Completing at global scope is now almost instant in cases 1 and 2 so we can definitely close this.
Consider the following trivial C++ program:

Case 1

Automatic semantic completion just after the member access operator will work, but forced semantic completion just before it will not. This happens because YCM is unable to handle large numbers of identifiers.

Forced semantic completion before `.` makes libclang return global-scope completion data. Although libclang is very fast in doing so, YCM chokes. Two alternatives can happen:

Things that always happen:
Completion at global scope is a very important use case, especially in C codebases, where nearly everything lives at the global scope. For example:

Case 2

It's a very common use case to wish for completion of the Windows API, which is all C functions. In this realm the most useful and expected thing YCM could offer is completion of the API provided by the included header, yet it simply can't handle it, despite how ubiquitous `windows.h` is.

Program 1 below was used to verify bare libclang timings for these use cases (with `LIBCLANG_TIMING=1` the timings can be compared with libclang's internal timings).

For Case 1 after `.` (`#define flags linux_flags`):

For Case 1 before `.` (`#define flags linux_flags`):

For Case 2 after `.` (`#define flags windows_flags`):

For Case 2 before `.` (`#define flags windows_flags`):

Notice that reparsing has been added, as I suspect my experience conflicts with this comment:
Despite that, ~100ms is reasonably unnoticeable (though alarming compared to ~3ms); it's not libclang's fault. The `windows.h` use case can be verified from Linux if the tips in this blog post are followed. It's at around ~35k identifiers that YCM starts to choke a lot.
I've experienced YCM slowness before on huge files, and I generally just use `vim -u NONE` to open huge files to avoid it, even files that have no semantic completion at all, just identifier completion, like some `.sql` file.

With this information in mind, I've created Program 2 to generate output with 60k identifiers. Saving this output to `identifiers.sql`, opening it with YCM, and trying to edit at the end of the file, the same problems happen even without any semantic completion: when `aaa` is typed, YCM will simply time out [fixed by #2657]; when `aaaaaa` is typed, YCM may be able to complete, but it'll be slow.

Video demonstration:
System information:
Related issue:
Program 1
Program 2