janet-lang / spork

Various Janet utility modules - the official "Contrib" library.
MIT License
122 stars 35 forks source link

Add fmt configuration #135

Closed greenfork closed 1 year ago

greenfork commented 1 year ago

This is a configuration that will be helpful for DSLs, there's already one used in project.janet that uses phony and friends. There's a peg (peg/compile ~(+ "with-" "def" "if-" "when-") that tries to solve this problem but it is not enough in some situations. The current version is an initial draft, other improvements that may follow:

Example config file .janet-config.jdn

["phony"]

I formatted the files I touched in a separate commit, I hope it's alright, let me know if not.

The configuration option probably needs a description but I'm unsure where to put the documentation for it. It should either be:

  1. Transform the configuration file into a map, then the map key could be :user-indent-2-forms so it gives some context to the user.
  2. Embed the documentation into the help message from argparse, so there's less incentive to expand the list of possible options.

Example before/after:

(phony "server" []
       (os/shell "janet main.janet"))

(phony "server" []
  (os/shell "janet main.janet"))
sogaiu commented 1 year ago

The ability for a user to specify some forms does seem like it could be nice.

Not sure about the specific method.

If there is going to be a convention for a file, I think choosing .jdn like you did has at least one benefit over .janet. If you want to put such files in a project's test subdirectory, .jdn files interoperate with jpm test better than .janet files because IIUC jpm test will try to execute contained .janet files.

sogaiu commented 1 year ago

Re:

More intelligent discovery of the configuration file, for example when inside a project tree, the search could be traversed to every parent directory until the project.janet file is found, then it stops.

I guess the idea here might be to "merge" information from all located files?

I don't know if it's a good idea, but I suppose there could be a user-level (e.g. in some XDG-ish location) configuration file. That kind of thing might be able to handle the mentioned example:

(phony "server" []
       (os/shell "janet main.janet"))

(phony "server" []
  (os/shell "janet main.janet"))

in one place?

Though I do wonder whether that rule should only hold for project.janet files.


Re:

The current version is an initial draft, other improvements that may follow

You could use the draft feature to indicate this too :)

greenfork commented 1 year ago

Not sure about the specific method.

I assume you mean about the way to supply the list of forms? There are generally two options: pass it on a command line and using a file. Passing it on a command line is just not convenient, so a file it is. jdn is chosen because it is just data. Other linters/formatters use json and yaml, so jdn seems only natural here. It is also very convenient to have a default configuration file name convention so you can just put it into the root of your project.

Recent formatters like for Gleam, Go or Zig don't allow any file configuration, but some allow changing their behavior with comments in the source code, like // zig-fmt: off or something like that to avoid formatting for a range of lines. This is kind of a not so explored territory for me because I don't have any examples where I would want to format lisp forms differently.

So I believe that it is beneficial to account for a simple model of a no-config formatter. For lisps with macros it is a bit different though, so I think it would be wise to have a very constraint set of options. The currently proposed one is the only one I can think of at the moment and I wouldn't be surprised if we ever only add 1 or 2 more.

I guess the idea here might be to "merge" information from all located files?

I wouldn't go as far as to "merge" the configuration, just use the first one. The primary use case is to be able to run a formatter from a project subdirectory.

Merging is a separate topic, some ideas that I have:

I don't know if it's a good idea, but I suppose there could be a user-level (e.g. in some XDG-ish location) configuration file.

I would avoid a standardized global configuration file for now and just move up the directory, so in a degenerate case you can put a file into ~/.janet-format.jdn so it will be the top-most config file for you. I don't know how that would work on Windows -- probably another reason to avoid having some special rule for a global configuration, just use local configuration per-project.

sogaiu commented 1 year ago

Thanks for the explanations.

Overall I think keeping the initial implementation as simple as possible (which is how I read your response) seems like a good idea to me.


It is also very convenient to have a default configuration file name convention so you can just put it into the root of your project.

Yes, if implemented, perhaps this will be a typical way to use a configuration file.

I have some projects that have multiple "source" subdirectories, so for those, I imagine I would put a configuration file in each code-containing direct subdirectory of the project root directory that has code (but not necessarily at the root of the project):

|
-- project.janet
|
-- one-sudir
|  |
|  -- .janet-format.jdn
|  |
|  -- my-source.janet
|
-- another-subdir
|  |
|  -- .janet-format.jdn
|  |
|  -- my-other-source.janet
.
.
.

I wonder if it's worth considering project.janet as a special case and have janet-format handle it separately. janet-format could be taught about jpm-specific constructs. If jpm is updated to have additional constructs, then only janet-format needs to be updated to get the benefits and people won't have to update their copies of .janet-format.jdn. Though may be there is some angle I am missing.

Possibly this approach is already what you were thinking?


I think it would be wise to have a very constraint set of options.

Yes I think so too especially at the start.


I wouldn't go as far as to "merge" the configuration, just use the first one.

Ok. [2]


I would avoid a standardized global configuration file for now and just move up the directory, so in a degenerate case you can put a file into ~/.janet-format.jdn so it will be the top-most config file for you.

Ok about avoiding the standardized global configuration file.

I suggest not officially supporting what happens above a project directory (like it can happen to work, but make note that the feature may not stay around) -- that way it can be abandoned later if necessary and not have to have backward-compatibility-related code hanging around.


Have you thought about how things might be handled for the test subdirectory?

I think it is likely to be a place that houses files that use callables from various projects. I don't have a good idea about how code within it would get formatted appropriately. I don't know that I'd be keen to be duplicating info from other projects' .janet-format.jdn files into a project-level .janet-format.jdn.

In general, if one uses callables from other projects, I wonder how these are supposed to get formatted appropriately...

Possibly this is a point in favor of having formatting info within files (like your // zig-fmt: off example, though more likely what would be useful is something like # comment: 2 that applies to the whole file). This makes the info travel with the file. However, I don't think this kind of thing is good to aim for initially [1] because it seems to me that something will need to analyze the file that is about to be formatted.


[1] Personally, I'm the type to want this sort of thing, but I would not try to do it in early stages.

[2] To clarify, "Ok" just means "I'm fine with it" -- ultimate decision power lies elsewhere :)

sogaiu commented 1 year ago

Suppose there were an acceptable general naming convention to make statically determining whether a form should be considered "indent-2". Perhaps that could work?

The vague idea here is to use something like the (peg/compile ~(+ "with-" "def" "if-" "when-") construct, but somehow less constraining.

As to what such a convention could be, I don't have any good ideas.

Some random problematic ideas (possibly could trigger other more useful ones?) include:

Not really getting anywhere, but thought I'd give it a try :)

greenfork commented 1 year ago

I imagine I would put a configuration file in each code-containing direct subdirectory of the project root directory that has code

Sure, whatever makes sense for your case, this seems alright.

I wonder if it's worth considering project.janet as a special case and have janet-format handle it separately.

I think we can avoid special-casing project.janet here. project.janet file is strictly a part of a "jpm ecosystem" and if, for example, we have a different build system with a different project file format, that would make it a second level build system due to such exceptions.

The solution is to create a default .janet-format.jdn file as the part of a jpm new-project template.

I suggest not officially supporting what happens above a project directory (like it can happen to work, but make note that the feature may not stay around) -- that way it can be abandoned later if necessary and not have to have backward-compatibility-related code hanging around.

This is a very good idea to make it really clear, I agree.

I think it is likely to be a place that houses files that use callables from various projects.

It is a good problem. Also I should account for symbols being imported with a PEG, probably just start with a simple mod/mymacro scheme without accounting for custom prefixes like (import mod :prefix "mod--").

I think we can copy-paste configs at first. Probably later we can make it more convenient: if the project has a public API that uses special indentation rules, it can advertise these rules in a README somewhere; so later just as a convenience it can advertise these symbols in the project.janet file that will be handled by jpm as a part of a dependency resolution.

Have you thought about how things might be handled for the test subdirectory?

I think it applies to the whole project. If we strive for 100% correctness at least.

Possibly this is a point in favor of having formatting info within files (like your // zig-fmt: off example

You probably want to not only mark the next line, but mark the whole symbol if it is going to be used several times. I don't think it is more convenient than adding it to a config so it applies globally. A more convenient variant might be a per-file configuration in comments like # janet-format-2-indent: ["phony"] -- this approach is used in Emacs Lisp and in VimScript, and as a configuration for test files in Nim's testament. This has a use case for scripting because a script usually is a single file, so I guess it will be worth it to support this feature.

Suppose there were an acceptable general naming convention to make statically determining whether a form should be considered "indent-2". Perhaps that could work?

I think a part of an appeal to introduce new functions or macros into code is to make this code more natural to express. Janet is also in a special position as a Lisp because it allows using unicode and a more extended set of symbols than in other languages. Putting a constraint on naming goes counter to this initiative. Constraints in such cases seem to better work as conventions, for example boolean functions can by convention end on a question mark integer?. But conventions are good for humans, for formatting programs conventions are too lax, programs need restrictions.

greenfork commented 1 year ago

@bakpakin just a heads-up on this: there are more ideas from discussion with @sogaiu , but so far I had very little time to work on any of it, I will try to improve this work later too.