fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features

Syntactically describe dependencies between files (by using '#requires', '#load' or extending 'open' syntax) #309

Open baronfel opened 8 years ago

baronfel commented 8 years ago

Submitted by Daniel Bradley on 8/20/2014 12:00:00 AM
7 votes on UserVoice prior to migration

With F# becoming more and more multi-editor and cross-platform, it is becoming increasingly difficult to teach all build/edit tools about F#'s file order. The F# community are currently struggling to "update" each new build/edit tool to understand that F# actually needs a file order.

Part of the problem is that there is no standard textual way to specify this file order except as command line arguments, and these are not stored in an editable form. We need an (optional) solution to this problem that is closer to home and doesn't involve modifying build/edit tools.

This proposal is one of three alternatives to deal with this problem in the F# language/compiler itself. The specific proposal covered by this UV entry is to use #require declarations within files to specify a file order:

#require "Helpers.fs"

Then, the compiler could automatically infer the dependencies between the files and not require files to be passed in pre-ordered by least dependent first.

Related alternative

Original UserVoice Submission Archived Uservoice Comments

nelak commented 7 years ago

I recall once using the compiler to build an .fsx script that used the #load directive to reference other files, and it produced a working .exe without problems. I found it quite intuitive after having worked with Node.js. I believe that pushing this forward would improve the xplat story, remove the dependency on fsproj files and also eliminate the file ordering issues.

nelak commented 7 years ago

Also I'm all in on aliasing #load with #require, which could also be extended with the functionality of FS-1027.

JeroMiya commented 2 years ago

Of the three proposals, I think this one has the most potential, but there is an issue that needs to be addressed. Note that #load only works in interactive mode. This makes sense because of the way interactive mode works. It's just loading and executing the file as if you went and typed the file out by hand. What happens if two .fsx files have a #load statement for the same file? Well since it's interactive mode, the file gets executed again. If any symbols are defined in the loaded file, they're defined again. Interactive mode lets you do that.

But in non-interactive mode, it's an error to redefine a symbol, so we know that just implementing something like #load directly is not an option. Next, you might be thinking, well what if we fix it so that we don't recompile the same file if it's already been loaded? In other words, say the compiler scans for #load statements, builds a dependency graph, and calculates a file order based on the dependencies between files? Then it would just use that file order as if the developer explicitly chose that file order in the .fsproj file. There's a problem with this too! And the problem stems from there being potentially more than one ordering that would satisfy all the dependencies of a given set of files.
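The "scan, build a graph, calculate a file order" step described above can be sketched roughly as follows. This is only an illustration: the `deps` map is hypothetical hand-written data standing in for whatever a compiler would collect from #load or #require lines, and `compileOrder` is an invented helper, not an existing compiler API. It also assumes the graph has no cycles.

```fsharp
// Hypothetical data: each file mapped to the files it requires.
let deps =
    Map.ofList [
        "program.fs", [ "a.fs"; "b.fs" ]
        "a.fs", []
        "b.fs", []
    ]

/// Depth-first post-order walk: every file is emitted after the files it requires.
/// Assumes the graph is acyclic (no cycle detection in this sketch).
let compileOrder (deps: Map<string, string list>) : string list =
    let visited = System.Collections.Generic.HashSet<string>()
    let order = ResizeArray<string>()
    let rec visit (file: string) =
        if visited.Add(file) then
            for d in Map.tryFind file deps |> Option.defaultValue [] do
                visit d
            order.Add(file)
    for KeyValue (file, _) in deps do
        visit file
    List.ofSeq order

printfn "%s" (String.concat " -> " (compileOrder deps))
// prints: a.fs -> b.fs -> program.fs
```

Note that the result depends on the iteration order of the map, which is exactly the kind of arbitrary tie-break the rest of this comment is worried about.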

Why is that a problem? Well, remember that F# is a language with global type inference. So, just as one example, let's say you define a record with fields named A and B. Later, you define a value with fields A and B, but you don't annotate the value's type. Well, since you've previously defined a type that matches the value, F# will infer the type you defined earlier. If there are TWO different records with the same fields named A and B however:

type RecordA = { A: float; B: float }
type RecordB = { A: float; B: float }
let value = { A = 1.1; B = 2.2 } // value is of type RecordB

The value is inferred to be the last matching type, in this case RecordB. Now, imagine that RecordA and RecordB are defined in two different files. That means that whichever file is last in the final compile order gets chosen as the inferred type of the value here. But, now that we know there are multiple potential file orders satisfying a given set of file order dependencies, we come to the realization that whichever one of the "potentially correct" file orders the tooling picked could potentially change the meaning (in this case the inferred type of a value) of the final output. That's a problem, and it's solved today by giving the wheel to the developer and requiring them to pick a specific order. But if we try to implement automated tooling, they don't have that option, with this proposal as is.
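To make the order sensitivity concrete, here is a small script sketch. It assumes that two modules and the order of their `open` declarations are a fair stand-in for two files and their compile order; the names `FileA`, `FileB`, `Order1` and `Order2` are made up for the illustration.

```fsharp
// Two modules stand in for two separate files (hypothetical names).
module FileA =
    type RecordA = { A: float; B: float }

module FileB =
    type RecordB = { A: float; B: float }

// "FileA before FileB": the labels from FileB are the most recent ones in scope.
module Order1 =
    open FileA
    open FileB
    let value = { A = 1.1; B = 2.2 }

// "FileB before FileA": the same expression now resolves to RecordA.
module Order2 =
    open FileB
    open FileA
    let value = { A = 1.1; B = 2.2 }

printfn "%s" (Order1.value.GetType().Name) // prints: RecordB
printfn "%s" (Order2.value.GetType().Name) // prints: RecordA
```

Both variants compile without warnings, which is the point: the difference only shows up in the inferred type.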

I don't have a good solution to this problem, or even know if there is a solution. The only thing I can think of would be to maybe make it possible to partially isolate some files from other files (and again, likely this would just be impossible). I'll try to explain what I mean. Let's say you have three files: a.fs, b.fs, and program.fs. Then program.fs declares both a.fs and b.fs as dependencies. So, we have two potential orders that solve this dependency graph: a.fs -> b.fs -> program.fs and b.fs -> a.fs -> program.fs. And let's say for the sake of argument that one of these orders is "correct", and the other results in a logic error resulting from incorrect type inference, but still compiles.
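The claim that this graph has exactly two valid orders can be checked by enumerating every ordering that respects the declared dependencies. The sketch below is illustrative only: `deps` is hypothetical data and `allOrders` is an invented helper that assumes every file appears as a key in the map.

```fsharp
// Hypothetical data: program.fs requires both a.fs and b.fs.
let deps =
    Map.ofList [
        "program.fs", [ "a.fs"; "b.fs" ]
        "a.fs", []
        "b.fs", []
    ]

/// Enumerates every file order in which each file comes after its dependencies.
let rec allOrders (deps: Map<string, string list>) (remaining: Set<string>) : string list list =
    if Set.isEmpty remaining then
        [ [] ]
    else
        [ for file in remaining do
              // A file is ready once none of its dependencies are still waiting.
              let ready =
                  deps.[file] |> List.forall (fun d -> not (Set.contains d remaining))
              if ready then
                  for rest in allOrders deps (Set.remove file remaining) do
                      yield file :: rest ]

let files = deps |> Map.toSeq |> Seq.map fst |> Set.ofSeq
for order in allOrders deps files do
    printfn "%s" (String.concat " -> " order)
// prints:
// a.fs -> b.fs -> program.fs
// b.fs -> a.fs -> program.fs
```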

Ideally, the behavior we desire is that it should not matter what order, in program.fs, I declare a.fs and b.fs as dependencies. Say we had a naive ordering algorithm that just used the order you write your #import or #requires statements, when both orders are valid. Well, if I'm the one writing program.fs but not the one writing either a.fs or b.fs, then I would be required to dig into the implementation details of those two files to figure out which order is "correct". Who knows, maybe a change in a.fs or b.fs causes the "correct" order to change, or even makes both orders result in a logic error. Here, assume the authors of a.fs and b.fs don't know anything about either their other peer dependency or program.fs and don't want to know. They're ostensibly independent files.

So what do we do? The only thing that could possibly work is if, regardless of which of a.fs or b.fs gets compiled first, the second file must compile as if nothing from the first file was compiled or run. Almost like each file (to be more specific, each dependency tree defined by that file and its own dependencies only, and theirs and so on) was in its own assembly. Only once both files are built, would the order matter, and only in program.fs - the file that imported the other two.

I doubt this would be possible, but that's the only thing I could think of that might address the problem.

smoothdeveloper commented 2 years ago

I am personally thinking about "shortcuts to slash through implementation" that involve:

and once we hacked something, we figure out a way we would like the F# ecosystem (the one that also works outside of dotnet) to define a top level language approach, that fits either in the code files themselves, in a separate "code file list" format, or sorts itself out for some use cases, and possibly, by also integrating other IL compilers in the pipeline, in this "code file list gets compiled to IL" business (because dotnet IL business is still a big big deal for F#).

We should explore the different module systems, as they exist in languages such as Haskell, Rust, JavaScript, Swift, etc., and figure out how we'd like to surface this in F#.

dsyme commented 2 years ago

Talking with @nojaf, there's interest in looking at this again. There are real tooling advantages in optionally getting a strong graph of file dependencies within a project: parallelization becomes possible, and intra-project incremental checking becomes more minimal.

The proposal would be roughly as follows

  1. Optionally allow globbing in project files
  2. Optionally allow explicit file-to-file references using import * from "relative-file-path", e.g.
import * from "../AbstractIL/il" 
import * from "../Utilities/lib" 
import * from "../AbstractIL/ilx" 
import * from "./CompilerGlobalState"
import * from "../Facilties/LanguageFeatures" 

open FSharp.Compiler.AbstractIL
open Internal.Utilities.Library
open FSharp.Compiler.AbstractIL.ILX
open FSharp.Compiler.CompilerGlobalState
open FSharp.Compiler.Features

The file extension would not need to be given, and if there's a signature file that would be implicitly used.

No cycles would be allowed (or, allowing cycles would be a later possibility).
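A tool enforcing the "no cycles" rule would need to find and report an offending chain of imports. A minimal sketch, assuming the imports have already been collected into a map (the `deps` data and the `findCycle` helper are hypothetical, and the search is deliberately naive rather than efficient):

```fsharp
/// Returns Some chain such as [a; b; c; a] if the imports contain a cycle, else None.
let findCycle (deps: Map<string, string list>) : string list option =
    // `path` holds the files currently being explored, most recent first.
    let rec visit (path: string list) (file: string) : string list option =
        if List.contains file path then
            let chain = List.rev path |> List.skipWhile (fun f -> f <> file)
            Some (chain @ [ file ])
        else
            Map.tryFind file deps
            |> Option.defaultValue []
            |> List.tryPick (visit (file :: path))
    deps |> Map.toList |> List.map fst |> List.tryPick (visit [])

// Hypothetical import graphs.
let cyclic = Map.ofList [ "a", [ "b" ]; "b", [ "c" ]; "c", [ "a" ] ]
let acyclic = Map.ofList [ "a", [ "b" ]; "b", [ "c" ]; "c", [] ]

printfn "%A" (findCycle cyclic)  // prints: Some ["a"; "b"; "c"; "a"]
printfn "%A" (findCycle acyclic) // prints: None
```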

Note that TypeScript uses file-to-file references, and the ergonomics are actually really nice, e.g. this, combined in the import directive:

import * as options from "./options";
import * as octoresponses from "./octoresponses";

Aside: We could later add https://github.com/fsharp/fslang-suggestions/issues/492 as a separate thing.

That is, we would extend import to cover limited open as well as in TypeScript. If we did it might look something like this:

Unselective (rough equivalent of open on the imported contents):

import FSharp.Compiler.AbstractIL.* from "../AbstractIL/il"
import FSharp.Compiler.* from "../TypedTree/TcGlobals"
import FSharp.Compiler.Features.* from "../Facilities/Features"
import * from "./SomeOtherFile"

or selective

import FSharp.Compiler.AbstractIL.ILTypeDef from "../AbstractIL/il"
import FSharp.Compiler.TcGlobals from "../TypedTree/TcGlobals"
import FSharp.Compiler.Features.{LanguageFeature, SomethingElse} from "../Facilities/Features"

There would however be a lot of details to work out in this, especially whether aliases were allowed.

vzarytovskii commented 2 years ago

@dsyme have you considered extending open syntax, instead of introducing new one, for example:

open FSharp.Compiler.AbstractIL
// or
open FSharp.Compiler.AbstractIL from "../AbstractIL/il"
// or
open FSharp.Compiler.AbstractIL.* from "../AbstractIL/il"
// or
open FSharp.Compiler.AbstractIL.{ILProp,ILMeth} from "../AbstractIL/il"

Smaug123 commented 2 years ago

A possible improvement might be to make the import relative to the fsproj, not the current file (on the assumption that each file is present in only one fsproj). That way you can freely restructure by e.g. moving files into subfolders.

Smaug123 commented 2 years ago

A completely different approach to the same thing, to flesh out the fileorder.txt idea: in the fsproj.

<Compile Include="thing.fs">
  <Reference Include="bar.fs" />
  <Reference Include="other/bar.fs" />
</Compile>

It is one of my least favourite ideas ever, but might be the least surprising proposal I've seen so far here?

Smaug123 commented 2 years ago

Aside: any design which ties modules to files, by the way, has to cope with things like "a type in the namespace that is not in a module"; I don't know about other people, but I do this all the time, it's rare for me to put a type in a module.

dsyme commented 2 years ago

@dsyme have you considered extending open syntax, instead of introducing new one, for example....

The problem with open is that it is inherently non-selective. It says "take this thing and include * from it", so

    open A 

==

    import A.*

It's really hard to make open selective. Perhaps

    open {Thing, OtherThing} from A

or

    open A.{Thing, OtherThing}

but honestly both of those look to me like you're opening all the contents of both A.Thing and A.OtherThing (rather than selectively importing Thing and OtherThing from A).

For this reason I'd honestly be more inclined to have an entirely new declaration.

dsyme commented 2 years ago

A possible improvement might be to make the import relative to the fsproj, not the current file (on the assumption that each file is present in only one fsproj). That way you can freely restructure by e.g. moving files into subfolders.

Possibly, and it's worth considering, though there are pros and cons both ways (e.g. same-directory references would be very common, and file-relative references don't need to be adjusted when you rename directories). I'd also be loath to diverge from TypeScript here, because the Fable existence of F# would really benefit from non-divergence. And we'd also want to consider having the above work as a replacement for #load in scripts.

Also, we should be considering future manifestations of F# (again, Fable based, F#-to-Python etc) where there are no project files at all (TypeScript, Python and most other languages barely have them at all these days).

dsyme commented 2 years ago

Note whenever this has been suggested before the idea has been received very negatively (I know @KevinRansom hates the idea :)). However I've contributed some TypeScript recently and I've been surprised how relatively pleasant file-to-file references are.

@nojaf's current thinking is that this would be an entirely optional thing which, when adopted in very large enterprise codebases (like those of the company employing him to work on F#) could unlock better compiler performance and type-checking parallelization. Some tool would presumably automatically add the file-to-file references in the first place.

Please let me know what you think about it from that perspective. That is, the feature would be entirely opt-in, and linear order still fully normal, and the only purpose of using the feature initially would be to give an intra-project dependency structure.

TheJayMann commented 2 years ago

This is something I believe would be quite useful for those willing to opt in, even if only for not having to arbitrarily decide which file should go first between two unrelated source files.

Also, on the tooling side, this has the potential for modifying autocomplete suggestions, either reducing them based on explicitly declared dependencies, or expanding them to make introducing new dependencies easier (similar to how autocomplete today will add an open statement if not present).

Tarmil commented 2 years ago

Regarding the possibility of reusing open, I think Haskell's import is an interesting comparison. Its syntax to import everything from a module is similar to F#'s open, and its syntax to import specific items from a module feels clear enough to me.

-- Import everything inside Foo.Bar:
import Foo.Bar

-- Import only Foo.Bar.a and Foo.Bar.b:
import Foo.Bar (a, b)

sergey-tihon commented 2 years ago

Rust has a pretty similar syntax with use directive

This is how it may look in real life, i.e. importing all required types in one statement:

use bevy::{
    ecs::{
        event::{Events, ManualEventReader},
        schedule::SystemSet,
    },
    input::mouse::{MouseMotion, MouseWheel},
    prelude::*,
    render::{
        camera::Camera,
        camera::CameraProjection,
        camera::{ActiveCamera, Camera3d, PerspectiveProjection},
    },
    window::Windows,
};

we cannot use use but UseTree looks reasonable.

import A.{Thing, OtherThing}
import A.Thing as OtherThing
import A.{Thing as OtherThing, OtherThing as Thing}
import A.{Thing, OtherThing.SubThing}

Lenne231 commented 2 years ago

Maybe it is possible to reuse open as a keyword if the syntax starts with the path to the file. Starting with the path to the file is also important for tooling like source completion.

Also the word "selectively" has been used a few times in this thread. Maybe we could just use the select keyword in combination with open.

open "./Types" select { A; B }
open "../Helpers" select { transformValue }

Maybe this can also be used with the current open statements.

open Namespace1.ModuleX select { a; b } 

AngelMunoz commented 2 years ago

My two cents on this: if a version of this goes in, it can also help with the Simple F# Theme. If the dependency graph can be computed from explicitly stated dependencies, we could somehow ditch file ordering, which is one of the most common complaints I've heard about F# from people getting started, and which (from anecdotal experience) is sometimes seen as unnecessary and old.

Now whether the above happens or not I don't think it is entirely related to this suggestion but I find it as one motivation to add this into the language

laenas commented 2 years ago

Adding a new keyword seems to add complexity relative to simply enriching the grammar around open - as @Lenne231 notes, there seems to be a way to disambiguate: open (string literal) from the existing syntax of open (namespace) or even add the additional disambiguating grammar of open (token) from (string literal)

which would minimize keyword bloat.

All of that aside, in reference to @AngelMunoz - I'd be curious about how this helps clarify top-down ordering, which while perhaps confusing to newcomers, is also one of the stronger more stringent points of maintaining a conceptual strength of the language. If we're going to allow mixed open/import declarations, it would seem that we end up with worst-of-both-worlds, rather than attempting to identify ways to improve incremental checking.

AngelMunoz commented 2 years ago

I would expect that type checks are still top-down as they are today. Regardless of the experience of the user, error messages derived from a feature like this should reinforce the idea of the top-down compile approach

I guess how the type checking is performed should be stated as well if this advances further

I can think of a scenario right away:

Current file ordering would type check FileA -> FileB -> FileC -> FileD

With explicit imports:

Similar scenarios may occur if the feature is used often in the wild so I think @laenas makes a great point where this could lead to a worse experience

JeroMiya commented 2 years ago

@dsyme Does your proposal imply any new form of isolation for included files, or is the end result the same as a manually ordered list of files - with just the ordering calculated from the file-to-file references?

Even if you don't allow cycles, there could be ambiguities that result from some file dependency trees:

Consider the following dependency tree:

A.fs
 import B.fs
 import C.fs

B.fs
  import D.fs
  import E.fs

C.fs
  import E.fs
  import D.fs

If we are just auto-calculating a traditional "top-down" file order based on import statements, and then building as we do today, then this dependency tree has a problem: if you're not allowed to re-order import statements to satisfy dependencies, then there are no solutions, because B.fs and C.fs conflict with each other. On the other hand, if you allow import statements to be re-ordered, then there are at least two solutions (more if you count all potential reorderings of imports), but each solution violates the import order declared in one of the files:

D.fs -> E.fs -> B.fs -> C.fs -> A.fs (violates the order in C.fs, E.fs type inference affected by contents of D.fs despite no `import`)
E.fs -> D.fs -> B.fs -> C.fs -> A.fs (violates the order in B.fs, D.fs type inference affected by contents of E.fs despite no `import`)

Further, I'm not sure how just picking a file order and compiling as normal makes it possible to parallelize the build more than you already can?

On the other hand, it would make more sense if your proposal implied that each file becomes isolated w.r.t. type inference from any other files that it doesn't explicitly import. If that's the case, then you should be able to parallelize the build as far as the tree allows you to, so for the above example:

E.fs and D.fs can be built in parallel
Then B.fs and C.fs can be built in parallel
Then A.fs can be built
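The three build stages listed above fall out of a simple layering of the dependency graph. The following is a sketch only: `deps` mirrors the example tree, `buildLevels` is an invented helper rather than anything in the compiler, and it assumes every file appears as a key in the map.

```fsharp
// The example dependency tree from above, as hypothetical data.
let deps =
    Map.ofList [
        "A.fs", [ "B.fs"; "C.fs" ]
        "B.fs", [ "D.fs"; "E.fs" ]
        "C.fs", [ "E.fs"; "D.fs" ]
        "D.fs", []
        "E.fs", []
    ]

/// Groups files into levels; all files in one level could be checked in parallel.
let buildLevels (deps: Map<string, string list>) : string list list =
    let rec loop (built: Set<string>) (remaining: Set<string>) (acc: string list list) =
        if Set.isEmpty remaining then
            List.rev acc
        else
            // Ready files are those whose dependencies have all been built already.
            let ready =
                remaining
                |> Set.filter (fun f -> deps.[f] |> List.forall (fun d -> Set.contains d built))
            if Set.isEmpty ready then
                failwith "cycle detected: no file is ready to build"
            loop (Set.union built ready) (Set.difference remaining ready) (Set.toList ready :: acc)
    let allFiles = deps |> Map.toSeq |> Seq.map fst |> Set.ofSeq
    loop Set.empty allFiles []

for level in buildLevels deps do
    printfn "%s" (String.concat ", " level)
// prints:
// D.fs, E.fs
// B.fs, C.fs
// A.fs
```

This layering is only meaningful if files in the same level really cannot see each other, which is the isolation question raised in this comment.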

This would be similar to moving every .fs file into its own .fsproj and replacing import statements with assembly references at the project level. That seems like a much bigger change to the language than originally discussed.

ghost commented 2 years ago

Regarding the possibility of reusing open, I think Haskell's import is an interesting comparison. Its syntax to import everything from a module is similar to F#'s open, and its syntax to import specific items from a module feels clear enough to me.


-- Import everything inside Foo.Bar:
import Foo.Bar

-- Import only Foo.Bar.a and Foo.Bar.b:
import Foo.Bar (a, b)

I really like this idea from @Tarmil, but my 2 cents can go even further (I'm new here, so I don't know about existing implications that would prevent this): open [a;b] from "../foo/bar" or open "../foo/bar" with/select [a;b]. Here we can still use the open syntax, but provide a better understanding of the path we need to open from. Tooling can then analyze the path and provide completion for us. I prefer the latter, because it is more appealing to me to first point at a file/folder and then select what you want from it. The former is used in TypeScript, and it makes me constantly switch from the end of the line back to import just to add {something} in the middle of a statement.

Lenne231 commented 2 years ago

All of that aside, in reference to @AngelMunoz - I'd be curious about how this helps clarify top-down ordering, which while perhaps confusing to newcomers, is also one of the stronger more stringent points of maintaining a conceptual strength of the language. If we're going to allow mixed open/import declarations, it would seem that we end up with worst-of-both-worlds, rather than attempting to identify ways to improve incremental checking.

This is a valid point. Beside the proposal to define the dependency graph in the code files, there should also be a proposal how this could be done in the project files. Something like this has already been proposed in another discussion I can't find right now, but it was something like

<Parallel>
  <Compile Include="TestCase1.fs" />
  <Compile Include="TestCase2.fs" />
  <Compile Include="TestCase3.fs" />
</Parallel>

Also

<Parallel>
  <Compile Include="TestCase1.fs" />
  <Compile Include="TestCase2.fs" />
  <Sequential>
    <Compile Include="Utilities.fs" />
    <Compile Include="TestCase3.fs" />
  </Sequential>
</Parallel>

should be possible.
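One way to read the nested Parallel/Sequential elements is as a small tree. The sketch below models that tree with a discriminated union and derives two things from it: a linear file order that is still valid for today's compiler, and the number of sequential steps needed if parallel branches really ran side by side. The `Plan` type and both functions are hypothetical names for illustration, not part of MSBuild or the compiler.

```fsharp
// Hypothetical model of the nested project-file elements.
type Plan =
    | Compile of string
    | Parallel of Plan list
    | Sequential of Plan list

/// Flattens the plan into one linear order that respects every Sequential block.
let rec linearize (plan: Plan) : string list =
    match plan with
    | Compile file -> [ file ]
    | Parallel items
    | Sequential items -> List.collect linearize items

/// Number of sequential steps if all Parallel branches run at the same time.
let rec steps (plan: Plan) : int =
    match plan with
    | Compile _ -> 1
    | Parallel items -> items |> List.map steps |> List.fold max 0
    | Sequential items -> items |> List.sumBy steps

// The second example from above.
let plan =
    Parallel [
        Compile "TestCase1.fs"
        Compile "TestCase2.fs"
        Sequential [ Compile "Utilities.fs"; Compile "TestCase3.fs" ]
    ]

printfn "%s" (String.concat " -> " (linearize plan))
// prints: TestCase1.fs -> TestCase2.fs -> Utilities.fs -> TestCase3.fs
printfn "%d" (steps plan) // prints: 2
```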

Maybe the logic with the file extension could also be applied, so that

<Compile Include="Utilities\FileSystem" />

instead of

<Compile Include="Utilities\FileSystem.fsi" />
<Compile Include="Utilities\FileSystem.fs" />

is allowed as well.

dsyme commented 2 years ago

@dsyme Does your proposal imply any new form of isolation for included files, or is the end result the same as a manually ordered list of files - with just the ordering calculated from the file-to-file references?

If explicit imports are used, then the only content in scope would be that explicitly imported.

kalekseev commented 2 years ago

import { a, b } from "../AbstractIL/il"

This import syntax is considered a design mistake in JS because autocompletion doesn't work after import { so devs have to type import {} from "../AbstractIL/il" and then go back to fill in the {}. There's no such problem in Python, which uses from ..AbstractIL.il import a, b

T-Gro commented 2 years ago

That is a great catch, analogous to the design mistake of SQL select statement.

What I would see as a nice UX flow:

from ( autocomplete for folder & file navigation ) ../Utils/Common
open ( autocomplete for namespaces and modules in that file immediately and after each semicolon, likely trimmed off the file's top-level module/namespace statement to keep them short )

That would also eliminate the need to have highly duplicated open statements right below it.

dsyme commented 2 years ago

@kalekseev @T-Gro Thanks, yes, @nojaf said the same thing in-person.

That said, I'm still ambivalent. Firstly, there's surely huge value for Fable in using the same syntax as JS.

Secondly I find this visually hard to read - partly simply due to the lack of alignment for the open - I know it seems like a small thing but this seems to affect me more than a lack of alignment in the file paths. And that's before making the open construct even more powerful through selective opening or aliasing.

from "../AbstractIL/il" open FSharp.Compiler.AbstractIL
from "../Utilities/lib" open FSharp.Compiler.Utilities
from "../AbstractIL/ilx" open FSharp.Compiler.AbstractIL.ILX

Somehow when I look at this I feel like I need to run through the open declarations as if they are a program, or else it takes me time to look through them. Also it's unclear if they would be naturally ordered by namespace name or file name. Overall, is the above code really clearer than separated lines?

from "../AbstractIL/il"
from "../Utilities/lib" 
from "../AbstractIL/ilx" 

open FSharp.Compiler.AbstractIL
open FSharp.Compiler.Utilities
open FSharp.Compiler.AbstractIL.ILX

or

#require "../AbstractIL/il"
#require "../Utilities/lib" 
#require "../AbstractIL/ilx" 

open FSharp.Compiler.AbstractIL
open FSharp.Compiler.Utilities
open FSharp.Compiler.AbstractIL.ILX

Every time in the history of F# that we've crammed constructs on to a single line we've usually regretted it and in the end have fantomas format to multiple lines for clarity.

Anyway, that's just to say why I feel a bit queasy about any single-line form that puts the filename first.

dsyme commented 2 years ago

Also please note my comments on open here - for me the word is fundamentally associated with non-aliasing, non-selective import of everything in a namespace, module or type. I find it aesthetically difficult to think of any variation of it as a selective or aliasing construct.

Tarmil commented 2 years ago

That being said, putting the file first doesn't force everything onto the same line. Maybe from could start an indented block in which you list what you want to make available from that file:

from "../AbstractIL/il"
    import FSharp.Compiler.AbstractIL.*
    // or equivalently:
    open FSharp.Compiler.AbstractIL
from "../Utilities/lib"
    import FSharp.Compiler.Utilities.[ debug; verbose ]
    import FSharp.Compiler.Utilities.Bool.order
    import FSharp.Compiler.Utilities.AsyncUtil.*

auduchinok commented 2 years ago

That said, I'm still ambivalent. Firstly, there's surely huge value for Fable in using the same syntax as JS.

Why making F# mimic JavaScript is a good thing, in general?

There are many differences between languages that a person needs to take into account when using multiple languages in a project anyway, and removing just one or two wouldn't change much for the actual development process, so it's probably not worth it. And if we tried to copy another language in more ways, then it could make it more difficult to learn and use the language, as F#'s own identity would be lost in more ways as well. Unless the copied things fit naturally into the language.

auduchinok commented 2 years ago

What if we use logical paths, as defined in fsproj, instead of paths relative to the files themselves?

This feature seems to be needed the most in bigger projects with many files, and such projects tend to use folders and sometimes even add files via links. In bigger projects it's going to be crucial to see the connection between a reference to a file and the file in the project tree in the IDE easily. Unlike when working with scripts where relative file system paths are needed more.

So instead of

#require "../AbstractIL/il"
#require "../Utilities/lib" 
#require "../AbstractIL/ilx" 

I propose the paths look more similar to how they look in IDEs:

#require "AbstractIL/il"
#require "Utilities/lib"
#require "AbstractIL/ilx"

The keyword or the whole syntax can be different in the final design, of course, it's only about the paths. 🙂

dsyme commented 2 years ago

@auduchinok I put some thoughts about file-relative v. project-relative here: https://github.com/fsharp/fslang-suggestions/issues/309#issuecomment-1290885964

I guess you could imagine some symbol for "project root"

#require "{project}/AbstractIL/il"
#require "{project}/Utilities/lib"
#require "{project}/AbstractIL/ilx"

Why making F# mimic JavaScript is a good thing, in general?

I agree, it's more a point of not doing unnecessary divergence to industry standards without decent reasons. JS is one of the few languages that has file-to-file references within projects so it's an interesting comparison point for us.

dsyme commented 2 years ago

@Tarmil It's certainly true that a single file could have many corresponding imports.

Tarmil commented 2 years ago

For the project root, using an absolute path should work, no? eg "/AbstractIL/il". I don't think it's ambiguous, I can't imagine any other reasonable meaning for an absolute path.

TheJayMann commented 2 years ago

While absolute paths do make sense, I was also thinking that "home" paths would as well.

~/AbstractIL/il

Can't really think of either one having any real advantages over the other, with the exception that absolute paths would require one fewer character.

seanamos commented 1 year ago

@Tarmil

For the project root, using an absolute path should work, no? eg "/AbstractIL/il". I don't think it's ambiguous, I can't imagine any other reasonable meaning for an absolute path.

My initial impression seeing that would be that it references the absolute path /AbstractIL/il, so absolute from the drive root.

In the same mindset ~/AbstractIL/il could be confused as referencing the user's home folder. I'm aware ~ has been a convention in ASP.NET for referencing the web root, but when referring to the file system, it has long had other meaning.