Open SkuldNorniern opened 12 months ago
We need to settle 2 matters before dealing with these wacky woohoo formats.
What should and should not be included as a supported language?
We've narrowed this down to "text-based code, document, data, and DSL" (albeit some ambiguity remains).
To what extent of complexity are we willing to implement detection methods?
My suggested policy for limiting the scope of detection methods is no regex, no heuristics.
More on this decision later, but for now, I did come up with a way to detect the above formats without involving regex. I present to you the "trigger-determiner" method:
.ssh/config
-> { filename = "config", parent = ".ssh" }
nginx/**/*.conf
-> { extension = "conf", ancestor = "nginx" }
fs::canonicalize() the path, or fall back to the original path.
Compared to regex, which is run on every given path and tests each pattern one by one, this method is only triggered on certain filenames and extensions and is O(1).
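To make the idea concrete, here is a minimal sketch of how a trigger-determiner lookup could work. It is an illustration under assumptions, not the project's actual implementation: the type names (`Detector`, `Rule`, `Determiner`), the two hash maps, and the language labels `"ssh-config"` and `"nginx"` are all hypothetical. The filename and extension act as triggers (hash lookups), and only the rules attached to a matched trigger have their determiner checked.

```rust
use std::collections::HashMap;
use std::ffi::OsStr;
use std::path::Path;

/// Extra condition checked only after a trigger matched (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum Determiner {
    /// The directory directly containing the file must have this name.
    Parent(&'static str),
    /// Some directory above the file must have this name.
    Ancestor(&'static str),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Rule {
    pub determiner: Determiner,
    pub language: &'static str, // hypothetical label
}

impl Rule {
    fn matches(&self, path: &Path) -> bool {
        match self.determiner {
            Determiner::Parent(name) => {
                path.parent().and_then(Path::file_name) == Some(OsStr::new(name))
            }
            // `ancestors()` yields the path itself first, so skip it.
            Determiner::Ancestor(name) => path
                .ancestors()
                .skip(1)
                .any(|dir| dir.file_name() == Some(OsStr::new(name))),
        }
    }
}

/// Triggers are plain hash-map keys: exact filenames and extensions.
#[derive(Default)]
pub struct Detector {
    by_filename: HashMap<&'static str, Vec<Rule>>,
    by_extension: HashMap<&'static str, Vec<Rule>>,
}

impl Detector {
    /// Registers the two examples from the comment above.
    pub fn example() -> Self {
        let mut d = Detector::default();
        d.by_filename.entry("config").or_default().push(Rule {
            determiner: Determiner::Parent(".ssh"),
            language: "ssh-config",
        });
        d.by_extension.entry("conf").or_default().push(Rule {
            determiner: Determiner::Ancestor("nginx"),
            language: "nginx",
        });
        d
    }

    pub fn detect(&self, path: &Path) -> Option<&'static str> {
        // Canonicalize the path, or fall back to the original path.
        let resolved = std::fs::canonicalize(path).unwrap_or_else(|_| path.to_path_buf());
        let path = resolved.as_path();

        // Trigger step: one hash lookup per key, independent of rule count.
        let by_name = path
            .file_name()
            .and_then(OsStr::to_str)
            .and_then(|n| self.by_filename.get(n));
        let by_ext = path
            .extension()
            .and_then(OsStr::to_str)
            .and_then(|e| self.by_extension.get(e));

        // Determiner step: only rules behind a matched trigger are tested.
        by_name
            .into_iter()
            .chain(by_ext)
            .flatten()
            .find(|rule| rule.matches(path))
            .map(|rule| rule.language)
    }
}
```

With this sketch, a path such as `project/config` hits the `config` trigger but fails the `.ssh` parent check, so nothing is reported, while a path with an unregistered filename and extension never reaches a determiner at all.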
As you can see from above, this pattern translates nicely to bash glob syntax. We can use this as a notation for these detection methods.
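As a rough illustration of using glob syntax as the notation, the sketch below parses a small subset of it into a trigger and a determiner. Everything here is an assumption made for the example: the function name `parse_notation`, the `Trigger` and `Determiner` types, and the decision to accept only the three shapes `name`, `dir/name`, and `dir/**/name` (where `name` is either an exact filename or `*.ext`). Anything else is rejected rather than guessed at.

```rust
/// What triggers a lookup: an exact filename or an extension (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Trigger {
    Filename(String),
    Extension(String),
}

/// What is checked once a trigger fired (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Determiner {
    /// No directory constraint.
    Anywhere,
    /// `dir/name`: the immediate parent directory must be `dir`.
    Parent(String),
    /// `dir/**/name`: some ancestor directory must be `dir`.
    Ancestor(String),
}

/// Parses a restricted glob notation; returns `None` for unsupported shapes.
pub fn parse_notation(pattern: &str) -> Option<(Trigger, Determiner)> {
    let mut parts: Vec<&str> = pattern.split('/').collect();
    let last = parts.pop()?;

    // The final component is the trigger: `*.ext` or an exact filename.
    let trigger = match last.strip_prefix("*.") {
        Some(ext) if !ext.is_empty() && !ext.contains('*') => {
            Trigger::Extension(ext.to_string())
        }
        None if !last.is_empty() && !last.contains('*') => {
            Trigger::Filename(last.to_string())
        }
        _ => return None,
    };

    // The leading components, if any, become the determiner.
    let determiner = match parts.as_slice() {
        [] => Determiner::Anywhere,
        [dir] if !dir.is_empty() && !dir.contains('*') => {
            Determiner::Parent(dir.to_string())
        }
        [dir, "**"] if !dir.is_empty() && !dir.contains('*') => {
            Determiner::Ancestor(dir.to_string())
        }
        _ => return None,
    };

    Some((trigger, determiner))
}
```

Under these assumptions, `.ssh/config` parses to a filename trigger `config` with parent `.ssh`, and `nginx/**/*.conf` parses to an extension trigger `conf` with ancestor `nginx`, matching the two examples given earlier.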
@SkuldNorniern If you can translate all of the above regex patterns to glob syntax, that would be great. Keep collecting patterns like these from the resources listed in #1 and report back if you find something that cannot be handled with this method.
Also, if you have time, please look into the docs of each format and verify where the files are actually located rather than just staring at those strange regexes. Some of them seem malformed to my eyes.
Here's the collection of WieeRd file name schemes/formats that are quite odd to handle. The list will be updated regularly.
etc/crontab
nginx/**/*.conf
mpd\\.conf$
git(config|modules)$|\\.git/config$
^(.*[\\/])?git\\-rebase\\-todo$
\\.(ini|desktop|lfl|override|tscn|tres)$|(mimeapps\\.list|pinforc|setup\\.cfg|project\\.godot)$