pascalkuthe commented 8 months ago

We currently don't have an RFC system, this is an experiment of starting a more RFC-like process before a larger change. I am also fairly busy right now so I am also creating this issue in the hope that somebody else may pick this issue up

Problem Statement

Currently, configuration in helix is closely coupled to serde/toml. Specifically, configuration is handled by derive(Desirilaize) on a struct. This approach was easy to get started with but doesn't scale well:

Every config option requires a handwritten struct field.
- If we want to overwrite a config option in the language config we need to add that config to the language config aswell. This doesn't scale, its impractical to duplicate the entire config
- This won't work for additional configuration added by plugins
Similarly if we want to overwrite a config option per buffer that requires yet another handwritten field on Document (which already polluted by too many fields).
The current config is not well suited for mutation. The :set and :toggle commands essentially serialize the current config to json, edit them and desiralize again. Serde however (especially with our many custom deserilaizer) doesn't necessarily roundtrip correctly through json.
- This becomes especially relevant with scheme plugins/scripts where the config should be both easily (and efficiently) readable and writable
Overwrite/Inheritance/merging is inconsistent (we have an open issue about that) with regard to depth
inspecting config options (and including documentation like we do for commands) and auto-generating docs is not possible/very hard

Prior work

The solution I would like to see is something that mirrors kakoune. I quite liked how simple yet flexible the config is. Especially the concept of scopes works quite well IMO. However, our needs are slightly more complex since kakoune doesn't have language config/LSP (which I would like to stay effortless). I cameup with a way to also use the scope system for these use-cases.

Proposed Solution

completely separate our config model from serde
- we plan to eventually move to scheme and abandon the toml config.
- until then we will have to implement (a backwards compatible) compatibility layer.
All options will have a simple canonical in-memory representation that is easy to read and modify. Convenience syntax should be handled during parsing.
- example: instead of auto-pairs accepting either a map of autopairs (like { "(" = ")", "{" = "}") or a bool to indicate whether they are enabled simply add two separate options: auto-pairs.enable and auto-pairs.pairs. This allows consistent merging/inheritance/defaulting and doesn't require weird special cases in the config system.
config options are represented as a flat called OptionManager.
- Example: softwrap.enable and softwrap.max-wrap are two entirely separate config options. The only thing that links them is a naming convention.
- Grouping can be handled by scheme macros/serde compatibility layer
- This makes merging trivial (simply iterate all OptionManagers and stop at the first one that contains the required config option (some cases may also separate from value merging, this is separate, more on that later)
Each scope where you may wish to set a config option has an OptionManager with the following priority: `split (do we want that?) -> buffer -> language -> global
- This allows for very flexible :set (and similar) commands: :set buffer line-ending crlf, :set lang:rust softwrap.enable true
All of these options managers have identical capabilities, so any config option can be set anywhere (what was plaintext language config in the past is now simply in the global config)
There is a single OptionRegistry that stores metadata like documentation and validators (and ofcoruse which options even exist)
Language servers are the only configuration that does not work in this system. We could make them dict values but that feels second class to me. Instead, I propose to simply create a special kind of scope for language servers with its own OptionRegistry that would be manipulated with :set lsp:rust-analyzer command rust-analyzer nightly. LSP options would be essentially an opaque value field (so still no merging there but I still think that is correct)
To address some config options that we may still want to merge I again took inspiration from kakoune: Kakoune has :--append and --remove flags for :set. This allows merging collections (I think we should maybe call them --mege and --subtract instead and also allow specifying a depth): set --mege=2 lsp:rust-analyzer options { "check" = { "command" = "clippy" }}.
- Something similar should be made available in the declarative scheme config.

Roughly the documentation I came up would look as follows:

pub enum Value {
    List(Box<[Value]>),
    Dict(Box<IndexMap<String, Value, ahash::RandomState>>),
    Int(usize),
    Bool(bool),
    String(Box<str>),

}

impl Value{
    fn from_str(&str) -> Self { .. }
    fn to_str(&self) -> String { .. }
}

pub struct OptionInfo {
    name: String
    description: String,
    validate: fn(&Value) -> anyhow::Result<()>,
        completer: CompletionFn
}

pub struct OptionRegistry {
    options: HashMap<String, OptionInfo>,
}

impl OptionRegistry {
    fn register(&self, name: String, opt: Option) -> bool {
        if self.options.contains(&name) {
            return false
        }else{
            self.options.insert(name, opt)
        }
    }
    fn get(&self, name: &str) -> Option<&Option>
}

pub struct OptionManager {
    vals: HashMap<String, Value>,
}

impl OptionManager {
    pub fn get(&self, option: &str) -> Option<Value>{
        self.vals.get(option)
    }

    pub fn set_unchecked(&mut self, option: &str, val: Value) -> Option<Value>{
        self.vals.insert(option, val)
    }

    pub fn set(&mut self, option: &str, val: Value, registry: &OptionRegistry) -> anyhow::Result<Option<Value>>{
        let Some(opt) = self.registry.get(name) else { bail!("unknown option {option:?}") };
        opt.validate(&val)?;
        Ok(self.set_unchecked(option, val))
    }
}

Unresolved Questions

what should the syntax for :set and friends look like (from_str and to_str above)
- In the exmaple above I used json syntax. I don't think that is ideal, as configuration will be done in scheme in the future and should look somewhat similar to not be surprising
- Maybe its better to just handle this with a scheme :eval to avoid introducing a duplicate syntax and remove the commands.
- the latter would feel clunky (these commands are quite useful) so perhaps a compromise could be to simply evaluate the last argument (the value) as a scheme value?

A-Walrus commented 8 months ago

Very much in favor of an overhaul.

We might also want OptionInfo to have some way to have an (optional) list of possible values. This would allow for autocompletion in :set and a better :toggle. Obviously this would be relevant for simple Enums and Booleans.

One drawback of the proposed approach is that we are switching to a "stringly" typed API. Instead of using rust enums we would have to compare strings

pascalkuthe commented 8 months ago

Completions are a good point. I think a much simpler implementation would be to simply provide a dedicated completer function for each option. Some strings may also be Paths where you would want path completion for example.

Regarding "stringly" typed I think that is fine. Ultimately the config contains strings anyway that are just deserialized to an enum (this only applies to fieldless enums. I am trying to avoid complex configuration that involves enums with fields). So froma configuration point of view enums are really just strings with a restricted set of allowed values. The validator would check these values.

When we actually read the co fig option we can simply use Enum::parse(value).unwrap() (we could even have a convenience function for that).

A-Walrus commented 8 months ago

Another question is how to reduce verbosity. Picking a random bit of code that uses the config: https://github.com/helix-editor/helix/blob/b306b25e82a33b4704846ab4fce9b10b2ee7f67c/helix-view/src/gutter.rs#L165 This is approximately how it might look with a bare-bones implementation of the above:

let line_number = if let Value::Int(x) = editor.config().get("line_number").unwrap() { 
    x
} else {
    unreachable!()
}

Adding some convenience functions to value (as_int, as_bool ... ) would look like

let line_number = editor.config().get("line_number").unwrap().as_int(); // If as_int can panic
let line_number = editor.config().get("line_number").unwrap().as_int().unwrap(); // If as_int returns Option
// unwraps can be relapced with "?" but we become fallible were the old code wasn't.
let line_number = editor.config().get("line_number")?.as_int()?;

Another similar option is using try_into

let line_number = editor.config().get("line_number")?.try_into()? // assuming the type can be inferred.

combining these we could have get_as<T> (name open to bike-shedding).

impl OptionManager {
    pub fn get_as<T>(&self, option: &str) -> Option<T> {
        ...
    }
}

pascalkuthe commented 8 months ago

I think get_as makes sense but the name is string I would just call it get and the normal variant get_value. I think we can use TryFrom here as a trait bound (so we don't need a new trait) and unwrap the try_into call. Any type checking will/should be done by the validators (and if they aren't I consider that a bug).

One more thing that came to mind: We would not be using these methods directly, as we also need to handle inheritance. For example:

impl Document {
    fn get_option<T: FryFrom<helix_config::Option>>(&self, opt: &str) -> Option<T> {
        self.option_manager.get(opt).or_else(|| self.language.config.get(opt)).or_else(|| self.editor_config.get(opt))
    }
}

I think this is a bit verbose and still requires storing three different fields. Kakoune handles that in a more Opp/C++ style by having a pointer to the parent OptionManager inside the option manager. I think that may make sense here too since we realistically need to refcount (and synchronize) the global configs anyway (we already do that). So for example:

struct OptionManager {
    parent: Option<Arc<RwLock<OptionManager>>>,
    ...
}

gabydd commented 8 months ago

thanks for making this RFC Pascal, one question I had is how do we imagine the key map config working with this? This may just be me missing something but I am not sure how we would be able to implement the KeyTrie with this architecture. Other then that this the behavior of this looks great to me, really easily allowing language specific config and the like.

pascalkuthe commented 8 months ago

In theory you could have KeyTrie ad a special kind of value. But I never planned for this system to handle KeyMap.

In kakoune keymapping is just entirely separate (and I think in vim too) and that's exactly the right call IMO. We already have a dedicated keymap data structure this is quite flexible and could be exposed to plugins as-is.

Keymap deserialization is actually also already separate from the rest of config deserialization (and doesn't suffer from any of the problems I described with merging) and adding something like :normap in vim should be possible already for the most part.

pascalkuthe commented 8 months ago

I suppose there is some overlap with scoping. We could allow keymaps to be scoped and merged the same as option manager (or we could even make them part of the optionsmanager, just as a seperate field instead of as part of the normal option map) but I think that is a lot more nieche/less critical and can be left to future work.

TornaxO7 commented 7 months ago

May I suggest ron as the config file-type? It's also very simple and very rust-friendly.

pascalkuthe commented 7 months ago

we will write our own here that is very simple there is no need for an external library. This rfc only concerned with the internal representation, not the actual user facing syntax (which is currently toml and will become scheme in the future)

alexjago commented 7 months ago

I'm experiencing a lot of ... I don't want to call it FUD, but those words, around the TOML/Scheme switch.

I appreciate that there's internal-technical reasons why the current approach has problem, but from a user's perspective the simple, friendly, declarative config is a really strong point in favour of Helix today.

yak shaving over thousands of lines of config files is precisely why the average user has switched to helix! -- archseer, a few months ago

Perhaps I'm misunderstanding what Scheme config would look like, and it would actually be totally fine. I guess I'd like to see some sample syntax.

the-mikedavis commented 7 months ago

To reiterate: this issue is strictly concerned with the internal representation changes and discussions of the front-end of the config system ("why is it X?", "couldn't it be Y?", etc.) are off-topic.

A-Walrus commented 7 months ago

I think being able to work on the developer side with strongly typed data and not dynamic Values is very important. I propose. We take the above, but instead of having the validator return a Result<()> it will return a Result with a meaningful, typed value as the Ok. The type of this value could be simple types like usize, String, and bool or more complex types, enums like for Absolute/Relative, Vecs etc... That way the code that wants to do something according to configuration can be strongly typed.

Implemented a basic POC version of this: https://gist.github.com/A-Walrus/46ba820f59a562105c33010bb0c9ff33

Setting values from the user side looks something like:

    manager.set("mouse", Value::Bool(true)).unwrap();
    manager
        .set("shell", serde_json::from_str("[\"cmd\", \"-C\"]").unwrap())
        .unwrap();

Mapping to and from serde_json is optional and can be used for backwords compatibilty / replaced by scheme...

Getting the typed values:

let scrolloff: &isize = manager.get("scrolloff").unwrap();
let shell: &Vec<String> = manager.get("shell").unwrap();

Trying to access a value using the incorrect type will result in a panic.

How does it work?

    /// # Invariants:
    /// converter must always return value of type specified by TypeId
    pub struct OptionInfo {
        pub name: &'static str, 
        pub description: &'static str,
        pub default: Value,
        type_id: TypeId,
        type_name: &'static str,
        converter: Box<dyn Fn(Value) -> anyhow::Result<Box<dyn Any>> + Sync>,
    }

Defining some options:

Defining all of the basic editor options

```rust impl Default for OptionRegistry { fn default() -> Self { OptionRegistry::new([ OptionInfo::new_with_tryfrom::( "scrolloff", "Number of lines of padding around the edge of the screen when scrolling", Value::Int(5), ), OptionInfo::new_with_tryfrom::( "mouse", "Enable mouse mode", Value::Bool(true), ), OptionInfo::new_with_tryfrom::( "middle-click-paste", "Middle click paste support", Value::Bool(true), ), OptionInfo::new_with_tryfrom::( "scroll-lines", "Number of lines to scroll per scroll wheel step", Value::Int(3), ), OptionInfo::new_with_tryfrom::>( "shell", "Shell to use when running external commands", Value::List(vec!["sh".into(), "-c".into()]), ), OptionInfo::new::( "line-number", "Line number display: `absolute` simply shows each line's number, while \ `relative` shows the distance from the current line. When unfocused or in \ insert mode, `relative` will still show absolute line numbers", Value::String("absolute".to_owned()), |v| { let s: String = v.try_into()?; match s.as_str() { "absolute" | "abs" => Ok(LineNumber::Absolute), "relative" | "rel" => Ok(LineNumber::Relative), _ => Err(anyhow!( "line-number must be either `absolute` or `relative`" )), } }, ), OptionInfo::new_with_tryfrom::( "cursorline", "Highlight all lines with a cursor", Value::Bool(false), ), OptionInfo::new_with_tryfrom::( "cursorcolumn", "Highlight all columns with a cursor", Value::Bool(false), ), OptionInfo::new::>( "gutters", "Gutters to display: Available are `diagnostics` and `diff` and \ `line-numbers` and `spacer`, note that `diagnostics` also includes other \ features like breakpoints, 1-width padding will be inserted if gutters is \ non-empty", Value::List(vec![ "diagnostics".into(), "spacer".into(), "line-numbers".into(), "spacer".into(), "diff".into(), ]), |v| { let v: Vec = v.try_into()?; v.into_iter() .map(|s| { Ok(match s.as_str() { "diagnostics" => GutterType::Diagnostics, "line-numbers" => GutterType::LineNumbers, "diff" => GutterType::Diff, "spacer" => GutterType::Spacer, _ => anyhow::bail!( "Items in gutter must be `diagnostics`, `line-numbers`, \ `diff` or `spacer`" ), }) }) .collect() }, ), OptionInfo::new_with_tryfrom::( "auto-completion", "Enabe automatic pop up of auto-completion", Value::Bool(true), ), OptionInfo::new_with_tryfrom::( "auto-format", "Enable automatic formatting on save", Value::Bool(true), ), OptionInfo::new::( "idle-timeout", "Time in milliseconds since last keypress before idle timers trigger. Used \ for autompletion, set to 0 for instant", Value::Int(400), |v| { let millis: usize = v.try_into()?; Ok(Duration::from_millis(millis as u64)) }, ), OptionInfo::new_with_tryfrom::( "preview-completion-insert", "Whether to apply commpletion item instantly when selected", Value::Bool(true), ), OptionInfo::new_with_tryfrom::( "completion-trigger-len", "The min-length of word under cursor to trigger autocompletion", Value::Int(2), ), OptionInfo::new_with_tryfrom::( "completion-replace", "Set to `true` to make completions always replace the entire word and not \ just the part before the cursor", Value::Bool(false), ), OptionInfo::new_with_tryfrom::( "auto-info", "Whether to display info boxes", Value::Bool(true), ), OptionInfo::new_with_tryfrom::( "true-color", "Set to `true` to override automatic detection of terminal truecolor support \ in the event of a false negative", Value::Bool(false), ), OptionInfo::new_with_tryfrom::( "undercurl", "Set to `true` to override automatic detection of terminal undercurl support \ in the event of a false negative", Value::Bool(false), ), OptionInfo::new_with_tryfrom::>( "rulers", "List of column positions at which to display the rulers. Can be overridden \ by language specific `rulers` in `languages.toml` file", Value::List(Vec::new()), ), OptionInfo::new::( "bufferline", "Renders a line at the top of the editor displaying open buffers. Can be \ `always`, `never` or `multiple` (only shown if more than one buffer is in \ use)", Value::String("never".to_owned()), |v| { let s: String = v.try_into()?; match s.as_str() { "never" => Ok(BufferLine::Never), "always" => Ok(BufferLine::Always), "multiple" => Ok(BufferLine::Multiple), _ => Err(anyhow!( "bufferline must be either `never`, `always`, or `multiple`" )), } }, ), OptionInfo::new_with_tryfrom::( "color-modes", "Whether to color the mode indicator with different colors depending on the \ mode itself", Value::Bool(false), ), OptionInfo::new_with_tryfrom::( "text-width", "Maximum line length. Used for the `:reflow` command and soft-wrapping if \ `soft-wrap.wrap-at-text-width` is set", Value::Int(80), ), OptionInfo::new_with_tryfrom::>( "workspace-lsp-roots", "Directories relative to the workspace root that are treated as LSP roots. \ Should only be set in `.helix/config.toml`", Value::List(Vec::new()), ), OptionInfo::new::( "default-line-ending", "The line ending to use for new documents. Can be `native`, `lf`, `crlf`, \ `ff`, `cr` or `nel`. `native` uses the platform's native line ending (`crlf` \ on Windows, otherwise `lf`).", Value::String("native".to_owned()), |v| { let s: String = v.try_into()?; match s.as_str() { "native" => Ok(LineEnding::Native), "lf" => Ok(LineEnding::Lf), "crLf" => Ok(LineEnding::CrLf), "ff" => Ok(LineEnding::Ff), "cr" => Ok(LineEnding::Cr), "nel" => Ok(LineEnding::Nel), _ => Err(anyhow!( "default-line-ending must be `native`, `lf`, `crLf`, `ff`, `cr`, \ or `nel`" )), } }, ), OptionInfo::new_with_tryfrom::( "insert-final-newline", "Whether to automatically insert a trailing line-ending on write if missing", Value::Bool(true), ), ]) } } ```

@pascalkuthe @archseer, would love to hear your thoughts!

archseer commented 7 months ago

I appreciate that there's internal-technical reasons why the current approach has problem, but from a user's perspective the simple, friendly, declarative config is a really strong point in favour of Helix today.

I do agree that the strong typing of the current system is a big plus, because the config is resolved at load time. The proposed syntax has to do a lot more type checking at fetch time to validate the typing.

bradclawsie commented 7 months ago

Chiming in just as a user, not a developer of Helix...I agree somewhat with @alexjago that many of us are here seeking refuge from complexity elsewhere...personally I declared bankruptcy on my tree of neovim config files.

I do see the benefit of a more-capable conf system and it would be nice for Helix to get even better scheme support as a byproduct. A nice happy medium for complexity seems to be emacs - you have a single .emacs file where your complexity is centralized...much better than neovim.

pascalkuthe commented 7 months ago

@archseer @A-Walrus sorry for taking a while here I was working on a longer response here (including implementing some of it since discussing these details without concrete implementation is pretty hard). I haven't been able to finish that yet as I am very busy atm but I havent' forgotten about this and will respond once I find the time to finish what I started

pascalkuthe commented 6 months ago

I started hashing this out a bit more and ended up changing some details. I pushed what I have as a draft PR in #9318. I am pretty happy with the config system but the hard part of actually porting the editor to use it is still missing.

I moved away form using an enum and instead went the dynamic dispatch route (the enum still exists and every type must support lossesly roundtripping trough them). With some tricks I was able to make that quite efficient (reading most config fields only has a single branch/assertion as overhead).

This helps a lot with types that we want to initialize when the config is update (like Regex or globset) and generally helped a lot with smoothing the API over. I also put in a lot of effort to make the API feel static (even tough its dynamic internally). I added a convenience macro that allows defining config structs that can look quite similar to the current serde configuration. The macro turns these structs into a trait (implementation) that allows accessing the various config fields using accessors.

I already ported the language server configuration to the new system in 27b8257fc33753e406db42161b618e2aec704971. That commit shows off the convenience macro well. We don't have to use this macro everywhere (in some places it may be easier to just define a single config option) but I find that it provides a very nice API.

Another thing that it enables is a sort of config encapsulation in the codebase. We can declare config options anywhere so for example the filepicker can declare its options in its own filepicker module and the rest of the codebase doesn't need to know about it. This also helps with moving stuff out of helix_view::editor (and `helix_core::syntax)

helix-editor / helix

RFC: Revamp Config System #8853

Problem Statement

Prior work

Proposed Solution

Unresolved Questions