eclipse-langium / langium

Next-gen language engineering / DSL framework
https://langium.org/
MIT License
754 stars 68 forks source link

Registered Document Paths on Windows #1725

Closed georg-schwarz closed 1 month ago

georg-schwarz commented 1 month ago

Langium version: 3.1.0, but should also concern 3.2.0 Package name: langium/utils

Steps To Reproduce

  1. Use a Windows machine
  2. Create a new custom workspace manager, extending the DefaultWorkspaceManager, overwriting the

    override async initializeWorkspace(
    folders: WorkspaceFolder[],
    cancelToken = CancellationToken.None,
    ): Promise<void> {
    const documents = await this.performStartup(folders);
    
    documents.forEach((document) => {
      console.log(`Loading document: ${document.uri.toString()}`);
    });
    // ...
  3. Register the custom workspace manager in the langium module
  4. Initialize the workspace manager with a directory
    await services.shared.workspace.WorkspaceManager.initializeWorkspace(
    workspaceFolders,
    );

In our case, this caused problems when getting documents based on a user-defined file path:

const document = services.shared.workspace.LangiumDocuments.getDocument( 
   URI.file(path.resolve(filePath)), 
 ); // resolved to undefined, even though file was in registered working directory

The current behavior

Logs sth like the following (incorrect Windows path, see last / as separator):

Loading document: C:%5CUsers%5Cjohan%5Cworkspaces%5Cjayvee%5Cexample/state-codes.jv

The expected behavior

Logs sth like the following (correct Windows path):

Loading document: C:%5CUsers%5Cjohan%5Cworkspaces%5Cjayvee%5Cexample%5Cstate-codes.jv

Further Notes

I drilled down into the Langium codebase and traced the method calls:

msujew commented 1 month ago

@georg-schwarz I believe your URIs are double encoded - they are missing the URI scheme. Running uri.toString() should not escape the / characters. I just tried your reproduction steps on Windows (I'm on Windows myself) and failed to reproduce:

grafik

georg-schwarz commented 1 month ago

How come you have forward slashes on Windows file paths?

msujew commented 1 month ago

@georg-schwarz Because URIs != file paths. We only process URIs in Langium. They are specced to use / as separators for everything, even the paths.

I'm not even sure how you produced documents with \ in their URIs in the first place.

msujew commented 1 month ago

@georg-schwarz Looking at your implementation in your CLI (see here), you're using URI.parse to parse a file path into a URI - however, you are only supposed to put valid URIs into that function. For transforming file paths into URIs, please use URI.file instead.

Closing this, since everything works as expected in Langium.

georg-schwarz commented 1 month ago

@msujew Thank you for your fast response and the support beyond! We definitely mixed sth up there (see your explanations on uris vs. file paths). I'm positive we can make it work from here.

msujew commented 1 month ago

@georg-schwarz You're welcome. I was curious how you created those double encoded URIs. I forgot that using URI.parse on Windows file paths yields "valid" URIs, since the c: gets interpreted as the scheme of the URI, while everything else just ends up as the path and gets encoded on toString(). However, that usually isn't the intentation :D

georg-schwarz commented 1 month ago

Our original implementation (before some weird work around) was something like this:

const workingDir = process.cwd();
const workspaceFolders = [{
  name: 'projectRoot',
  uri: path.resolve(workDir)
}];
await services.shared.workspace.WorkspaceManager.initializeWorkspace(
    workspaceFolders,
);

I think we also tried some variations with URI.file(workingDir), but I'll have to try it again to find out where the issue was with that approach :) I think we just need a proper way to turn the output of process.cwd() into a valid URI, and then we should be good to go :)