jfmengels / elm-review

Analyzes Elm projects, to help find mistakes before your users find them.
https://package.elm-lang.org/packages/jfmengels/elm-review/latest/
Other
255 stars 13 forks source link

Disallow referencing functions and types that are not yet defined #55

Closed MartinSStewart closed 3 years ago

MartinSStewart commented 4 years ago

What the rule should do: Disallow referencing functions and types that haven't been defined yet when reading the document top to bottom. This is inspired by how F# works. One difference though is that F# has a rec keyword to allow for defining functions and types that have mutual references. Since there is no such keyword in Elm, any functions or types with mutual references would be exempt from needing to be placed before/after eachother.

Example of things the rule would report:

Example 1:

type alias A = { b : B }

type alias B = Int 

A has a reference to B which is defined later in this module. Move the type declaration for B to somewhere before A. Automatic fix:

type alias B = Int 

type alias A = { b : B }

Example 2:

a = b + 1

b = 1

a has a reference to b which is defined later in this module. Move the function declaration for b to somewhere before a. Automatic fix:

b = 1

a = b + 1

Example of things the rule would not report:

a value = b value + 2

b value = a value + 1

These functions both have a reference to eachother so their relative order doesn't matter

When (not) to enable this rule:

People who find this restriction annoying or don't find that it makes it easier to navigate code. I'm personally unsure if I like it or not but I think it would be a good experiment to see how well this works in Elm.

I am looking for:

jfmengels commented 4 years ago

From what I understand, (in JavaScript terms) this boils down to not make use of hoisting. An equivalent of this in ESLint would be no-use-before-define.

Is this feasible to do in elm-review?

This looks doable to me.

What other people think about this restriction. Has anyone work with F# and found this restriction useful/annoying there?

I personally dislike this idea. Here is why:

If you enforce this, you might need to move (a lot of?) code around for unrelated changes. This is not a deal-breaker, but something to keep in mind.

I used to write JavaScript like this (with the no-use-before-define rule enabled), but nowadays I tend to write the code almost the exact opposite way. With the proposed rule, you'd need to put the details of your implementation at the top of the file and the higher-level things (update, view, ...) at the end (of their respective sections), with main being the last one there. But when you open a file, you want to see the higher-level information (what is exposed, what does the update/view/... functions "globally" do) at a glance.

If we had this same kind of architecture for the project's file hierarchy, the Main file would be nested really deeply instead of being near the root of the proejct, which sounds ridiculous to me.

If I would enable this kind of rule (and I guess I kind of subconsciously do), I would almost do it the other way around. If you wish to define a type, you need to first have defined the other types they are used. Same for functions. I am not sure I would want this rule, but would be more open to personally.

A more concrete representation of these

-- A
type Foo = Bar | Baz
type Model = { foo : Foo }

updateFoo : Foo -> Foo
update : Msg -> Model -> Model -- uses updateFoo

-- B
type Model = { foo : Foo }
type Foo = Bar | Baz

update : Msg -> Model -> Model -- uses updateFoo
updateFoo : Foo -> Foo

A is what you'd end up with with the proposed rule, B is what I personally do. @MartinSStewart how are you usually structuring your files? A, B or something else?

MartinSStewart commented 4 years ago

If you enforce this, you might need to move (a lot of?) code around for unrelated changes. This is not a deal-breaker, but something to keep in mind.

Yeah, there's a risk of large merge conflicts if it's applied to existing code. Also any grouping comments like --- UPDATE CODE --- won't end up in the right place.

If I would enable this kind of rule (and I guess I kind of subconsciously do), I would almost do it the other way around.

It could be configured in the rule so that people can choose the direction they like.

One thing I realized I wasn't clear on though, the rule as I had imagined it, would also check for references between types and functions. So in your example, Model and Foo would need to come before update and updateFoo. Effectively your types would end up towards the bottom of the file (or at the top of the file depending on the dependency direction).

I guess that restriction could be removed so that you can have both high level functions and types at the top of your file though.

how are you usually structuring your files? A, B or something else?

Right now I tend to structure my files in an adhoc manner. That said, I do tend to put type definitions at the top of the module.

sparksp commented 4 years ago

With the example of a rule module, I would personally want the rule first and everything else under it, I want to follow this with any visitors near the top, and anything they call underneath, and so on. So the bottom of the file usually ends up with an error type function. I'm not strict in following this, but I would be happy to be more strict about it.

So, I could support this rule, but it would need to have a configurable direction so that I can have all my references after they're called instead of before 😉

MartinSStewart commented 4 years ago

@sparksp With that ordering, all your types would end up near the bottom of the module. Is that something you prefer or do you think an exception should be made for types (in other words, a type doesn't need to come after any functions that reference it).

sparksp commented 4 years ago

Hmm, that's a good point. I do often like to put my types near the top. Maybe that can be configurable too?

MartinSStewart commented 4 years ago

So far it seems like we're all in favor of types at the top and definitions coming after they are referenced. Maybe it makes more sense to have the rule only do that, no configuration.

sparksp commented 4 years ago

Maybe - I'm thinking there may be an extra level here too... Should we consider what's exported from the module? I probably want what's exposed up first, before anything private that it uses (types or functions).

module NoRule exposes (rule)

import Review.Rule as Rule exposing (Rule)

-- public interface
rule : Rule

-- private types
type Context

-- private functions
initialContext : Context

declarationListVisitor : List (Node Declaration) -> Context -> (List (Rule.Error {}), Context)

Then if you consider a typical "Page" module, do you want to group your private functions near the public function that depends on them? Here's an example of total separation:

module Page.Blog exposing (Flags, Model, Msg, init, update, subscriptions, view)

import Data.Author exposing (Author)
import Data.Post exposing (Post)

-- public interface (in order of exposing)
type Flags

type Model

type Msg

init : Flags -> (Model, Cmd Msg)

update : Msg -> Model -> (Model, Cmd Msg)

subscriptions : Model -> Sub Msg

view : Model -> Html Msg

-- private types
type RemoteData e a

-- private functions (in order of use)
updatePostBody : String -> Post -> Post

viewPost : Post -> Html Msg

viewAuthor : Author -> Html Msg

Compared to grouped...

module Page.Blog exposing (Flags, Model, Msg, init, update, subscriptions, view)

import Data.Author exposing (Author)
import Data.Post exposing (Post)

--- FLAGS
type Flags

--- MODEL
type Model

type RemoteData e a

--- MSG
type Msg

--- INIT
init : Flags -> (Model, Cmd Msg)

--- UPDATE
update : Msg -> Model -> (Model, Cmd Msg)

updatePostBody : String -> Post -> Post

--- SUBSCRIPTIONS
subscriptions : Model -> Sub Msg

--- VIEW
view : Model -> Html Msg

viewPost : Post -> Html Msg

viewAuthor : Author -> Html Msg
jfmengels commented 4 years ago

do you think an exception should be made for types (in other words, a type doesn't need to come after any functions that reference it).

Yes.

The structure I tend to have is one that gives me the most overview at the top, or in other words from the most general information to the most specific one. So the main is usually at the top, and then for each section (view, update, etc...) I define the types then the function, then its helpers. It basically looks like the grouped version that @sparksp mentioned.

- main
# Model section
- type Model
  - Types for the sub things in Model
- init
# Update
- type Msg
  - Types for the sub things in Msg
- update
  - update helpers
# View
- view
  - view helpers, with for each one the types it will use right above it

The problem with having types be defined before their use above their use in functions, is that you'd need to define some types (at least Model, Msg and Flags) before you can see the main function. I'm noticing that my init also uses Msg, and this wouldn't work. These are the two problems I would personally have if I had to define types before they are used as far as I can think of for now.

I do not think that all types should be at the top before any functions (I'm not clear on whether that was what you had in mind), because you'd increase the distance between the definition and where it is really used. Also, some types are used only in details of the implementation and we don't really want to care about them at the top of the file (nor have to go back there to edit them).

My personal preference would be to make sure that the definition of a type happens before it appears in another type definition (unless mutual use), and that the definition of a function appears before its use (unless mutual use). But not make types and values interdependent.

Maybe it makes more sense to have the rule only do that, no configuration.

I agree that no configuration would be nicer. I do not see the value in allowing to change the order of definition (definitions before uses vs uses before definitions) at least.

MartinSStewart commented 4 years ago

Should we consider what's exported from the module?

Maybe this could make a good tie breaker if two functions don't have a preferred ordering relative to eachother.

Otherwise, I prefer that a definition always comes after it's reference (with a possible exception for types vs functions). The value I see with this rule is that it makes it easier to navigate code. That's made more difficult if there's an exception based on if the function is exposed. In practice I think exposed functions will tend to be near the top anyway.

As for whether types should always come before functions or try to be as close as possible to the highest level function that references them, I'm not sure. I think the rule will need to be implemented and tried out to see which organization is most useful.