azdavis / millet

A language server for Standard ML.
https://azdavis.net/posts/millet
Apache License 2.0
210 stars 12 forks source link

Relax constraint that CM file must be `sources.cm` #3

Closed jez closed 2 years ago

jez commented 2 years ago

Environment

Steps to reproduce

git clone https://github.com/jez/multi-grep
cd multi-grep
~/github/millet/target/debgug/cli

Expected behavior

no error

Actual behavior

❯ target/debug/cli
Error: /Users/jez/github/millet/sources.cm: couldn't canonicalize: No such file or directory (os error 2)

Caused by:
    No such file or directory (os error 2)

There is a single, top-level CM file in this repo called multi-grep.cm, not sources.cm.

azdavis commented 2 years ago

what do you think about a config option in vscode settings.json for the root cm file? defaulting to sources.cm in the workspace root. though i’d have to look into how to pass client config to the server. fairly sure it’s possible.

workaround in the meantime might be rename the top level cm file or make a symlink from sources.cm to it.

azdavis commented 2 years ago

i guess that’d work for the language server; for the cli we could just take in a command line option which should be easier.

jez commented 2 years ago

In my opinion, having to pass a command line flag in each project is annoying. I’m using the lang-srv binary directly, not the supporting VS Code extension, so any solution involving VS Code-specific config files would be less than ideal

In the past, I have solved this by just scanning for the single CM file at or above the current file in the directory tree. Here is the PR where I implemented this into ALE, a Vim plugin that can show SML/NJ errors in the file:

https://github.com/dense-analysis/ale/pull/884/files

It implements an optional override config variable that must be set in the case where there are two or more CM files, but by default does not need to be set if there is only a single CM file.

The above solution of requiring a config variable when using more than one CM file makes sense in that context because it is not a language server, but a Vim plugin. In this case, I would suggest that you consider creating a config file for declaring options to Millet, so that options can be read by Millet per-project from the config file after start up, instead of needing one editor-specific config file per project for Millet (VS Code config, Vim config, Emacs config, etc)

azdavis commented 2 years ago

hm, i’d like to see if i can support your use case (which is what, btw?) but i have a slight desire to offload the complexity of managing config to the editor, as e.g. rust analyzer does: https://rust-analyzer.github.io/manual.html#configuration

also, if there is a config option to set the root cm file to something non standard (ie not the default sources.cm), then i feel like we could skip the logic to search the directory for a .cm file, and instead always do either sources.cm (the default) or whatever was set by the config.

jez commented 2 years ago

hm, i’d like to see if i can support your use case (which is what, btw?)

sources.cm is not always the name of the CM file in a project, and SML/NJ projects can have more than one CM file

~/Dropbox/Classes/15312 (master)
❯ find . -type file -name '*.cm' | wc -l
44

~/Dropbox/Classes/15312 (master)
❯ find . -type file -name '*.cm' | grep 'sources.cm' | wc -l
10
Click to expand complete list ``` ./hw06/sources.cm ./hw06/cmlib/cmlib.cm ./hw06/cmlib/basis.cm ./hw01/sources.cm ./hw01/sandbox.cm ./hw04/top/top.cm ./hw04/parser/parser.cm ./hw04/dyn/sources.cm ./hw04/cmlib/cmlib.cm ./hw04/cmlib/tests/testing.cm ./hw04/cmlib/tests/qchecktest.cm ./hw04/cmlib/tests/test.cm ./hw04/cmlib/basis.cm ./hw04/dpcf-abt/abt.cm ./hw04/hpcf-abt/abt.cm ./hw04/hybrid/sources.cm ./hw03/pcf/abt.cm ./hw03/top/top.cm ./hw03/var/var.cm ./hw03/parser/parser.cm ./hw03/sec3/sources.cm ./hw03/cmlib/cmlib.cm ./hw03/cmlib/tests/testing.cm ./hw03/cmlib/tests/qchecktest.cm ./hw03/cmlib/tests/test.cm ./hw03/cmlib/basis.cm ./hw03/sec2/sources.cm ./hw03/systemf/abt.cm ./hw02/top/top.cm ./hw02/labT/abt.cm ./hw02/var/var.cm ./hw02/parser/parser.cm ./hw02/sec3/sources.cm ./hw02/cmlib/cmlib.cm ./hw02/cmlib/tests/testing.cm ./hw02/cmlib/tests/qchecktest.cm ./hw02/cmlib/tests/test.cm ./hw02/cmlib/basis.cm ./hw02/sec2/sources.cm ./hw02/trans/sources.cm ./hw02/matchT/abt.cm ./hw05/sources.cm ./hw05/cmlib/cmlib.cm ./hw05/cmlib/basis.cm ```

There are also very few CM files called "sources.cm" in the SML/NJ repo (consider: the basis library is $/basis.cm not $/basis/sources.cm). There are any number of reasons for this, some of it is just naming, some of it has to do with which files you'd like to expose as a public API vs APIs for an executable, vs APIs visible to tests, etc.

e.g. rust analyzer does

It looks like every config option listed on that page is a user preference, not a project-level config. rust-analyzer does, in fact, have project level config, though built into the language via Cargo.toml files. You don't have to specify per-project build settings, like which libraries you have and what features they're using, because rust-analyzer reads them out of your editor-agnostic, project-level config after it starts up.

we could skip the logic to search the directory for a .cm file

I would also point out that I think if you want your target audience to be first-time programmers let alone first-time SML users, having as little config as possible and having millet do as much as possible without needing any sort of config (project-local or otherwise), the target audience will have more success.

Anyways, if you don't like config, ¯\_(ツ)_/¯ it's your project.

azdavis commented 2 years ago

ok, I think I'm convinced on both counts.

your argument about the distinction between user prefs and project config makes sense; i think whatever we come up with here will be more like rust-project.json (also mentioned on that rust analyzer doc page).

however that leads me to a reason for my hesitation: i think remember reading somewhere (can't remember where) that rust-project.json evolved from a format that @.matklad hacked together into a de-facto standard, and there are some things he might re-do that are harder to re-do now that it's a de-facto standard.

so at the very least i think whatever config file we come up here should have a version field, starting at 1. i'm thinking millet.toml, which will for now have the following format:

version = 1

[workspace]
root = "whatever.cm"

the second point, automatically looking for a root .cm file, also makes sense, especially given sources.cm is less of a "standard" root cm file name than i thought, as you have provided evidence for. i hesitate slightly because of worries about what to do in the case of multiple root cm files - i don't want to do something like implicitly choose the "first" one (for some measure of "first"). so i'd like to just loudly fail and demand the user disambiguate with config.

another concern is just the perf/code complexity impact of traversing all the entries in the root, but that's probably negligible.