jeapostrophe / racket-langserver


An Experimental Refactor #103

Open 6cdh opened 1 year ago

6cdh commented 1 year ago

I'd like to introduce an experimental refactor of langserver. It aims to fix some issues and simplify development of new features.

Motivation

  1. Some issues (#101, #102) relate to type errors. The untyped code can be difficult to reason about, especially since it was written years ago and lacks comments.
  2. Document data (e.g., language info, lexer, tokens, semantic and module information) is not clearly shared between functions. This hinders the development of new features, as each new function must recreate this data for its own use.
  3. The check-syntax function has the side effect of sending diagnostic responses directly to the client, which is not ideal. If it throws an error, subsequent calls also throw because of hash-ref errors.
  4. The JSON-RPC parsing procedures are incomplete and coupled with the real logic. Isolating them would improve testability and allow external use as a library.
  5. There are diagnostic bugs when working with multiple documents, but I'm not sure how they happen.
  6. The replacement of \r\n with \n in the read-message function may cause bugs.
  7. The data structure definitions do not match the LSP specification and are incomplete.
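
As a rough illustration of items 4 and 6, here is a minimal sketch of message framing kept separate from any server logic. It is written in Python purely for illustration (the project itself is Racket), and the names `read_message` and `write_message` are hypothetical, not the langserver's actual functions. The LSP base protocol ends the header part with an empty `\r\n` line and gives the body size in bytes via `Content-Length`, so one way to avoid rewriting line endings at all is to read exactly that many bytes:

```python
import io
import json


def read_message(stream):
    """Read one framed message from a binary stream; return None at EOF."""
    content_length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # stream ended before a complete header part
        if line == b"\r\n":
            break  # an empty line terminates the header part
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    # Read the body by byte count; line endings inside it are left untouched.
    body = stream.read(content_length)
    if len(body) != content_length:
        raise ValueError("truncated message body")
    return json.loads(body.decode("utf-8"))


def write_message(stream, payload):
    """Write one framed message to a binary stream."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(b"Content-Length: %d\r\n\r\n" % len(body))
    stream.write(body)


# Usage: a round trip through an in-memory stream.
buf = io.BytesIO()
write_message(buf, {"jsonrpc": "2.0", "id": 1, "params": {"text": "(+ 1 2)\r\n"}})
buf.seek(0)
print(read_message(buf))
```

Because these two functions only touch the stream and the JSON payload, they can be tested without starting a server, which is the kind of isolation item 4 asks for.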

Design

  1. Some or all modules will be written in Typed Racket.

    Problem: The boundary between typed and untyped code and its impact on performance need to be considered.

  2. The core logic will be made available as a library for external use, facilitating the development of new features and plugins.

  3. When a document change is sent to the server, the document will undergo a full (or incremental) parse and analysis. The resulting data can be shared with other functions.

  4. During analysis, documents will be assigned guarantee levels based on their content:

    1. base - an error occurs when running a lexer on it;
    2. lexical - no error occurs when running a lexer on it, but it may contain unmatched parentheses or other issues;
    3. syntax - no error occurs when running a parser on it;
    4. semantic - no error occurs when analyzing it, meaning no undefined references, etc.
  5. LSP features will operate according to the assigned guarantee level of the document.

    For example, semantic tokens may work at the lexical level but perform better at the semantic level. Renaming a local variable may work at the syntax level, while renaming a global variable may require the semantic level. Code completion may work at the lexical level but work better at the syntax level.
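
The guarantee levels in items 4 and 5 can be sketched as an ordered enumeration plus a per-feature minimum level. This is only an illustration in Python (the project itself is Racket). The names `Level`, `MIN_LEVEL`, `classify`, and `available_features` are hypothetical, and the feature table just encodes the examples above as assumptions:

```python
from enum import IntEnum


class Level(IntEnum):
    BASE = 0      # the lexer reported an error
    LEXICAL = 1   # lexes cleanly; parentheses may still be unmatched
    SYNTAX = 2    # parses cleanly
    SEMANTIC = 3  # analysis found no errors (e.g. no undefined references)


# Hypothetical minimum level at which each feature is offered.
MIN_LEVEL = {
    "semantic-tokens": Level.LEXICAL,
    "completion": Level.LEXICAL,
    "rename-local": Level.SYNTAX,
    "rename-global": Level.SEMANTIC,
}


def classify(lex_ok, parse_ok, analysis_ok):
    """Return the highest level reached, given the outcome of each stage."""
    if not lex_ok:
        return Level.BASE
    if not parse_ok:
        return Level.LEXICAL
    if not analysis_ok:
        return Level.SYNTAX
    return Level.SEMANTIC


def available_features(level):
    """List the features whose minimum level the document satisfies."""
    return sorted(name for name, need in MIN_LEVEL.items() if level >= need)


# Usage: a document that lexes but does not parse.
level = classify(lex_ok=True, parse_ok=False, analysis_ok=False)
print(level.name, available_features(level))
```

An ordered enumeration keeps the check to a single comparison per feature. A feature that merely performs better at a higher level, such as semantic tokens, could read the level and adjust its behavior instead of being switched off.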

Dev

The project currently has 2702 lines of Racket code and basic tests. It should be possible to complete this refactor this year.

There are still some problems.

Let me know what you think.

jeapostrophe commented 1 year ago

These are decent ideas. My only caution is that I think a lot of what an LSP server is supposed to provide is already provided by DrRacket and its internal analysis, etc., so a very productive way to think about the LSP server is as an interface to those DrRacket libraries. I think none of your Motivation points conflict with this, but the Design points seem to suggest "rewrite DrRacket" to me, and I think that is a big thing to bite off.

6cdh commented 1 year ago

I want to clarify that I don't intend to reinvent the wheel, and I definitely don't want to. The Design section describes an ideal design, which may make it look that way. I'm not familiar with the DrRacket interfaces at the moment, but I will try to coordinate DrRacket functions with LSP features. I would consider contributing changes to the DrRacket repo if necessary. If I find it's not feasible, I will give up.