100% CPU and never terminate for macro with infinite recursion

6cdh commented 1 year ago

When I open a file with the following code in vscode:

#lang racket

(define-syntax (f stx)
  stx)

f

It reachs 100% CPU and doesn't terminate. I've been running it for a few minutes and it's still running.

I had this problem for a while, but just found a minimal sample to reproduce it.

dannypsnl commented 1 year ago

I think this issue should move to DrRacket, macro expander is falling into an infinite loop. Not just language server will get crashed, DrRacket will not work properly in this case either (all arrows will disappear if you put cursor onto identifier). There has no easy workaround for this.

6cdh commented 1 year ago

I didn't expect the macro to run then I realized it's a identifier macro.

I've reported it to drracket https://github.com/racket/drracket/issues/622

6cdh commented 1 year ago

I believe this issue is unsolvable.

But I find drracket can recover from this issue if I fix the macro to make it terminate while racket-langserver can't.

Maybe we can terminate the procedure when a new request that change document content reachs, or set a timeout for a response.

The problem is once check-syntax fails, any subsequent request will fail in current design. So we can't set a timeout.

But if we use the first method, it will keep 100% CPU before the programmer fix the macro.

dannypsnl commented 1 year ago

I think we can have timeout for collecting, if any request is querying and cannot get data, then spawn another collecting.

6cdh commented 1 year ago

What the meaning of "collecting"? Do you mean the traversal in check-syntax?

If so, then what happens if the new collecting fails? My intuition is if the previous collecting failed, and there were no content changes afterward, then the next collecting can have a big chance to fail.

dannypsnl commented 1 year ago

Yes, collecting is traversal in check-syntax.

Yes, it will fail, but it avoid the CPU get fully hold. To solve the problem you say, I think we can add a cache to record a document is changed or not compare with last traversal, and do nothing if there has no vary.

6cdh commented 1 year ago

It makes sense. So it always do traversal once after a change.

How we define a change? For the current document, the documents that current document depends on, or all document?

The dependency documents looks like a good option. But the current document is easier.

What is the value of timeout? Maybe we can look the common implementation of lsp client to determine. I believe the client has a timeout for a request.

dannypsnl commented 1 year ago

File cache

Tracking dependencies can be a huge computation, even in my powerful company sponsored M2 Mac it's still not fast enough to be fluent.

However, the good part is normally we can only modify one file at once, and a good habit makes us save it after changed. This means we can actually quite ignore the case that people changes others files when do current one, by this assumption, we can use hash value or modified date to say a file is changed or not?

Timeout

I think we can measure big project like typed-racket to get a 99th traversal time?

6cdh commented 1 year ago

I added a monitor function in langserver, and see how much it takes to run check-syntax for this single file: https://github.com/racket/typed-racket/blob/master/typed-racket-test/unit-tests/all-tests.rkt

This is the result:

1688211171s: (check-syntax path doc-text #f)
cpu time: 38672 real time: 39211 gc time: 8549

It can't even perform check-syntax once in a reasonable amount of time! Finally, I think we can do nothing without make the traversal reuse multiple file analysis result.

dannypsnl commented 1 year ago

It seems like traversal need some code changes to be more modular?

6cdh commented 1 year ago

No. I was wrong. After some study, I found these performance keypoint:

Most of the time is spent on expand in that case. I made a script - link to github gist to try to understand the time of read-syntax, expand and traversal. Here is the result:

file: /home/lcdh/work/typed-racket/typed-racket-test/unit-tests/all-tests.rkt
read-syntax
cpu time: 2 real time: 2 gc time: 0
read-syntax 2nd
cpu time: 0 real time: 0 gc time: 0
expand
cpu time: 33495 real time: 33904 gc time: 6974
expand 2nd
cpu time: 654 real time: 663 gc time: 231
traversal
cpu time: 407 real time: 412 gc time: 26
traversal 2nd
cpu time: 319 real time: 324 gc time: 5

2nd means the piece of code runs at the second time on the same input.

We can see only the first time expand is heavy. The file looks small (121 lines), and it expands to 1323 lines, but it still takes a lot of time to expand.

For another file text-document.rkt in racket-langserver, it's 662 lines, but it expands to 19707 lines. This is the result:

file: /home/lcdh/work/racket-langserver/text-document.rkt
read-syntax
cpu time: 4 real time: 4 gc time: 0
read-syntax 2nd
cpu time: 2 real time: 2 gc time: 0
expand
cpu time: 1191 real time: 1206 gc time: 390
expand 2nd
cpu time: 567 real time: 574 gc time: 130
traversal
cpu time: 3364 real time: 3404 gc time: 325
traversal 2nd
cpu time: 2219 real time: 2249 gc time: 33

The time is relative small compared the smaller file all-tests.rkt. But the traversal time dominates in this case.

The performance problem of traversal is not from module. Its most time spent in document-variable > get-index-entry-info > xref-binding-definition-tag.

If we make get-index-entry-info always return false, and ship this customized check-syntax. The result would become:

file: /home/lcdh/work/racket-langserver/text-document.rkt
read-syntax
cpu time: 4 real time: 4 gc time: 0
read-syntax 2nd
cpu time: 2 real time: 2 gc time: 0
expand
cpu time: 1189 real time: 1206 gc time: 387
expand 2nd
cpu time: 553 real time: 559 gc time: 127
traversal
cpu time: 1075 real time: 1084 gc time: 215
traversal 2nd
cpu time: 1121 real time: 1133 gc time: 219

The document-variable only useful for syncheck:add-text-type and syncheck:add-docs-menu.

In the remaining 1000ms time of traversal, 90% time spent in current-module-name-resolver in this call: ((current-module-name-resolver) "check-syntax.rkt" #f #f #t).

If we reuse the call of current-module-name-resolver, the second traversal time would become 100+ms. This would be quite fast.

If someone has time to solve the performance problem then these information would be helpful.

6cdh commented 1 year ago

After experimenting with typed/racket, I discovered that the file with the longest expand time is likely to be this file:

typed-racket/typed-racket-lib/typed-racket/core.rkt
cpu time: 58110 real time: 58921 gc time: 15838

I think we can limit the expand time to 90 seconds?

dannypsnl commented 1 year ago

I think so.

jeapostrophe / racket-langserver

100% CPU and never terminate for macro with infinite recursion #109

File cache

Timeout