Closed henriquecbuss closed 2 months ago
This is a very interesting proposal because I think it would allow elm-review
to get even more information than what it currently does (and more information: less false positives and negatives, and more potential rules).
I think that is this system works out well, it could replace the README visitors which will end up being a special case for this kind of visitor.
As you say and as discussed outside of the thread, it's quite easy (with the withExtraFiles
visitor) for the rule to access the data, but I don't yet know exactly how the rule would be able to tell the tool which files it want.
An API I'm thinking of could be as follows:
-- Definition
withExtraFiles :
(List { path : String, content : String } -> projectContext -> ( List (Error { useErrorForModule : () }), projectContext ))
-> ProjectRuleSchema schemaState projectContext moduleContext
-> ProjectRuleSchema { schemaState | hasAtLeastOneVisitor : () } projectContext moduleContext
-- Usage
ruleSchema
|> Rule.withExtraVisitor <files to fetch> extraFilesVisitor
-- ...
extraFilesVisitor : List { path : String, content : String } -> ProjectContext -> ( List Error, ProjectContext )
extraFilesVisitor files context =
-- ...
For the "files to fetch" part, I think there are many ways we could want things:
To choose these, we could have an API like
withExtraVisitor [ ExtraFiles.file "translations.json", ExtraFiles.glob "translations.*.json" ] extraFilesVisitor
I feel like this should be something that is sufficient. But maybe this is actually too complex, and [ "translations.json", "translations.*.json" ]
is just as good? :thinking:
You asked about fixes. I think it could be valuable to have the tool offer fixes for those files in order to automate more things.
That said, I worry about the security aspect. For instance, should a rule that you install from a package be able to write to /etc/ssh/ssh_config
? Even if there is a prompt, this can easily go wrong and be disastrous.
I don't know if there are any security risks even for allowing you to read data.
A few options for this. I don't know whether they should help prevent fixes in those files, or even getting access to that data (but I'll just say "look at" to cover either case).
The first option, which would not work, is to have a builder function that sets an allowlist on what files the rule can access
config =
[ SomeRule.rule
|> Rule.allowLookingAt [ "translations.json" ]
]
The reason why this wouldn't work is because rules would be able to use this function in their definition
rule =
Rule.newProjectRuleSchema "HarmfulRule" ...
-- ...
|> Rule.fromProjectRuleSchema
|> Rule.allowLookingAt [ "/etc/ssh/ssh_config" ]
So if we have an allowlist, it would have to be defined outside of a rule level (because, that's a bit outside of the reach of the rule)
So here are options that I think could work:
elm.json
ReviewConfig.elm
file, where you'd have a top-level declaration "files" (or some better name) next to "config", which would be an allowlist of all the files to access.
files
list". If people don't, then they get false positives/negatives or global errorsMain
file would have to import it, but not import it if it's not there)elm.json
file. elm-review
to be able to be run as is. Requiring more would be annoying to everybody, and would make editor integration not workDoes someone have other ideas?
I think option 6 and 7 are the simplest ones, and option 4 is probably the least intrusive one for the allowlist idea.
If we do go with this option of an allowlist, then maybe in a later major version of elm-review
we could have a config like config = { rules = [ ... ], allowList = [ ... ] }
to ease the compilation work.
With an allowlist, it would give the annoyance where rules would not work as expected, and I think it would still be somewhat be easy for the rule to circumvent things, such as in a following way:
module SomeRule exposing (rule, allowList)
{-| This rule does awesome things.
It does require access to a bunch of files, so
don't forget to import the allowList and use it in your configuration
-}
allowList = [ "/etc/ssh/ssh_config" ]
Please pitch in if you think I'm overthinking this, have some idea or any comment.
Perhaps just allow stuff in the project, but follow symlinks?
This makes the most common case just work, and gives a reasonably straightforward workaround for the other cases: just symlink your directories...
Another idea to consider: use crazy API design!
type FilePath = Opaque
type FileKey = Opaque
concreteFile : FileKey -> String -> FilePath
glob : FileKey -> String -> FilePath
Rule.withExtraVisitor : List FilePath -> visitorDefinition -> ProjectRuleSchema ...
then config changes from List Rule
to FileKey -> List Rule
.
Then it can be easily documented that rules should take FilePath
s as inputs and definitely not ask for a FileKey
...
This is cool! I think this will open the door to adding more elm-review rules to help make library maintenance even less painful in Elm.
Personally I would be fine with solution # 1 - it seems reasonable to me that elm-review is scoped to the elm apps directory.
I feel like this should be something that is sufficient. But maybe this is actually too complex, and [ "translations.json", "translations.*.json" ] is just as good?
I feel like just plain strings are enough. However, if we want an API like you described, perhaps we could get some inspiration from elm-page's DataSource.Glob.
For allow lists, maybe we could consider a mix between your ideas:
Cmd
s, is there an issue in just reading files?)Then it's just a matter of defining what a project is, and how to define the allow list.
For example, to support monorepos, elm-review
could look up the directory tree until it finds a .git
folder (I'm not sure how good this is, as it forces people to use git - but at the same time, elm packages are forced to use github, so is this really an issue?). In case we don't find a .git
folder, default to the folder with an elm.json
.
For the allow list, I like the idea of it being a separate file (and most importantly, being optional). I think a user should only define an allow list if they need one, as to have the least amount of friction possible when setting up rules. If a rule requires specific files in the allow list for it to work, the rule author can emit an error, telling the user which files the rule needs.
@gampleman
Perhaps just allow stuff in the project, but follow symlinks? This makes the most common case just work, and gives a reasonably straightforward workaround for the other cases: just symlink your directories...
I think that would work for some cases, but the case that I think will come most often is when you want to access the data in the parent folder (README.md in example below) or a sibling from the current folder (translations/ for instance).
root
How would symlinks help here? Would you do something like this?
root
Or would you do something like the following? (this is the part where I may have finally understood you @gampleman :sweat_smile: )
root
I guess that could work? :thinking:
It would be kind of surprising to tell users that they need to create a symlink for this. But hey, maybe why not :thinking: It would definitely ease the problem for now!
Additional use-case that came up: In some cases we don't want to know the contents, but we just want to know which files exist. For instance, if we want to know whether a reference to static/some-image.png
is okay, we want to know whether that file exists.
This is supported as of 2.14.0! I think it covers all of the use cases outlined here.
Amazing!!! I'm out of the Elm ecosystem nowadays, but it's great seeing this progress! Great work as always @jfmengels ❤️
It would be nice if we could interact with files other than
.elm
files.Example use case
Many apps get their translation texts from JSON files. If we could interact with these files, we could check for some things:
For 1. and 2., errors are related to the JSON files, so we could either report errors with
globalError
, or have some variation likeerrorForFile
anderrorForFileWithFix
(I'm not sure if reporting errors or offering fixes for arbitrary files is in scope forelm-review
).Item 3. only needs to have the content of the translation files in the context. Other than that, it would look like a regular rule (just check if the key string is a valid entry in the context).
I think the main point of discussion is how to get information from these files into a rule. I really liked the additional visitor idea:
This visitor allows any file format (not only JSON) to be visited, so it's a 👍 from me!