dotnet / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License

Split FSharp.Core in a separate repository #861

Closed enricosada closed 8 years ago

enricosada commented 8 years ago

That's my big question from the last standup. If it's not possible, np, but I think it needs a discussion at least, because I think it's really a good idea for this repo.

Why

Make it easier to maintain, contribute to, and evolve FSharp.Core as a nuget package. And this repo too (compiler/VS), because less code is good.

about the nuget package

The FSharp.Core nuget package is going to become much more important than before. The GAC'd assembly is from a world without nuget, where that assembly was special. It is still special, but not that much (see the corefx libraries).

It's easier to let nuget resolve FSharp.Core like every other dependency than to rely on the magic of assembly binding. Also, coreclr is going to reference FSharp.Core as a normal nuget package in projects.

this repo status

This repo is really big: FSharp.Core + compiler + fsi + VS extension. That means it's more difficult to make changes and more difficult to contribute. Build and test times suffer too. Developing everything together helps, yes, but now there is a clear separation of responsibilities. And the .NET ecosystem has changed a lot: nuget, GAC, xplat, multiple targets (full, portable, coreclr, mono), GitHub, etc.

The compiler, fsi, compiler service and VS extension are entangled, OK, but FSharp.Core is not.

Right now, for example, it is difficult to test FSharp.Core with multiple versions of the compiler. If the repo builds only the package, it's easy to do a build matrix on compilers (HEAD, stable, dotnetci, etc.) and easier to do perf testing. Lots of pros.
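The build matrix idea above could be sketched roughly like this. It is only an illustration: the compiler labels come from the comment, the target names reuse the platforms listed earlier in this thread, and `build_matrix` is a hypothetical helper, not a script that exists in this repo.

```python
from itertools import product

# Compiler labels taken from the comment above; targets are the platforms
# named earlier in the thread. Both lists are illustrative assumptions.
COMPILERS = ["HEAD", "stable", "dotnetci"]
TARGETS = ["full", "portable", "coreclr", "mono"]


def build_matrix(compilers, targets):
    """Return one job per (compiler, target) pair to build and test the library with."""
    return [{"compiler": c, "target": t} for c, t in product(compilers, targets)]


if __name__ == "__main__":
    for job in build_matrix(COMPILERS, TARGETS):
        print(f"build FSharp.Core with fsc={job['compiler']} for target={job['target']}")
```

Each job would then run the same FSharp.Core build and test steps, so a regression against any one compiler shows up as a single failing cell.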

pro:

cons:

possible flow:

All of that means changing pretty much only the build script, because it already works. Once that's OK, we can:

All that without even speaking about fsharp/fsharp. I think the split is a good idea for Microsoft/visualfsharp.

Bonus track

That's another issue, I know, but someone will reply about that, so...

Winter is coming: the Microsoft team is going to publish a nuget package for FSharp.Core on coreclr, and there is already a community FSharp.Core (thx guys btw, I use it).

Is it possible to have a single nuget package with desktopclr + coreclr + mono? nuget supports that, and the package name matters.

There is already an FSharp.Core package from fsharp/fsharp, and it's possible to use different packages based on platform (coreclr, desktop, mono), but it's going to be a nightmare of conditionals and versions for users (and library authors).

If we can split the repo and work together on FSharp.Core to make a single nuget package, it's going to be awesome. The compiler can continue to use the same flow with the fsharp/fsharp and Microsoft/visualfsharp repos, no need to change that.

Maybe it's possible to fix the politics with shared ownership (community/Microsoft) of the FSharp.Core repo? Just asking. I know it's a difficult topic (strong naming, ownership, maintenance), but F# was the first open source project of MS, and last time the bet paid off well.

Without shared ownership, it's possible to put both versions of the assemblies (open for mono, etc. and desktop/coreclr) in the same package (it's a zip), but it's going to be a lot of coordination and sync work. Maybe different nugets are better, but then if one project uses the community package and another the Microsoft one, nuget can maybe restore all references, but it's a pain for users (the compiler can ignore one of the assemblies).

mexx commented 8 years ago

:+1: Especially for faster evolution and independent release cycle.

kolektiv commented 8 years ago

I'd be in favour, just because... well, it's good. I know that sounds ludicrous and subjective, but I do think that drawing good boundaries around things here will lead to better systems over time. The repo is more daunting than it probably needs to be right now - if it was less scary to get started in terms of compiler work, it might get more contributions, and being able to bring build times down, reduce complexity, coupling, etc. seems like it would help with that.

Things like independent release cycles etc. would be nice bonuses but I suspect there's more than technical reasons which would make that a long conversation. Those might be nice, but even if we didn't get any of the bonuses, having a better factored and more maintainable src base seems like it would be a plus.

7sharp9 commented 8 years ago

Note: The FSharp.Core package from fsharp/fsharp uses the signed versions of FSharp.Core from this repo and the unsigned ones from the fsharp/fsharp repo.

7sharp9 commented 8 years ago

The compiler is also somewhat tied to FSharp.Core. It is special-cased in the compiler, as it's able to use static optimisation conditionals; see: https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/FSharp.Core/prim-types.fs#L3897-L3914

enricosada commented 8 years ago

@7sharp9 I understand that, but it's not the same compiled code: there is no coreclr, for example, and the code can be different (one is the latest compiled version, the unsigned one is compiled from source). For example, let's add a function to FSharp.Core

7sharp9 commented 8 years ago

To add a little more to the pot, FSharp.Core also ships with Xamarin.iOS, Xamarin.Android and Xamarin.Mac.

enricosada commented 8 years ago

The compiler is also somewhat tied to FSharp.Core. It is special cased in the compiler as its able to get static optimisation conditionals see: https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/FSharp.Core/prim-types.fs#L3897-L3914

@7sharp9 I don't understand that. I can pass FSharp.Core as a reference to the compiler. The compiler makes a special case for an assembly named FSharp.Core, but it doesn't need to be the same version used by the compiler, right?

7sharp9 commented 8 years ago

I just meant FSharp.Core gets special treatment when compiled by the compiler: it's the only assembly that can use the static optimisation conditionals.

The compiler project also references FSharp.Core. If this was a nuget package, then the build of the compiler would be a lot quicker too :-)

enricosada commented 8 years ago

Ah OK, thx @7sharp9 for the clarification.

The compiler build can initially use the LKG like now, or the nuget package later, so it would be really fast. Same speedup for the FSharp.Core build: a solution without the compiler is faster (no rebuild after an FSharp.Core change).

dsyme commented 8 years ago

This repo = FSharp.Core + compiler + fsi + vs extension

@enricosada I personally agree that both `fsi` and `vs extension` should be split out into Microsoft projects parallel to this repo. However it's not trivial since there are still strong InternalsVisibleTo dependencies in both cases. So there will need to be a forcing function for that to happen though and I don't think it's there yet.

For FSharp.Core, we've traditionally kept it with the compiler since it's closely related. For example, changes in quotations.fs (the quotation format serializer/deserializer) have to be thought through very carefully and possibly with matching changes in the compiler, carefully guarded by version. For this reason I don't feel the same sense of urgency to split FSharp.Core out and I think it would expose us to other bugs.

enricosada commented 8 years ago

Yes, I discussed splitting out VS vs fsi/fsc, but it was a dead end for the reasons you wrote (internals, compiler service, too much work, etc.). But I think FSharp.Core is a reasonable possibility.

The FSharp.Core assembly can benefit from having a codebase separate from the compiler, because once it's separated, it's really easy to test multiple compilers for regressions: just do a test matrix on fsc (like mono HEAD and mono stable in fsharp/fsharp).

It's also easier for xplat mono/coreclr, because it's only a library.

If we want to align compiler drops and FSharp.Core drops, that's a problem for users anyway, because maybe I am using an old compiler with a newer FSharp.Core, or more probably I am using a newer compiler with an old/specific FSharp.Core from nuget.

Making it easier to evolve FSharp.Core (I mean compile, build, fix issues) has a lot of pros, because two repos mean two smaller codebases to contribute to (with or without a nuget package, but a nuget package helps a lot). For example, we can easily compile FSharp.Core with a coreclr compiler and with the current compiler (again, a build matrix).

It's not urgent, only fixes are, but it helps make both codebases (compiler and FSharp.Core) easier to contribute to, and that's a real medium-term gain. There is a possibility of more bugs, OK, but if separated, it's easier to send fixes, like the current FSharp.Core package does with x.y.z releases. That doesn't mean lower quality, only faster feedback if needed. Nobody wants a lower-quality codebase. A change in FSharp.Core will be easily tested with multiple compilers, not only with the currently built version or the LKG.

It's also easier to do, because we can clone the visualfsharp repo and at the beginning change only the build script, with no real cleanup at first, so fewer regressions.

Just compare the split-out FSharp.Data.TypeProviders with having that inside this repo. It's a lot easier now to approach the library.

dsyme commented 8 years ago

Yes, i discussed split out vs vs fsi/fsc, but was a dead end for the reason you wrote ( internals, compiler service, too much work, etc),

These things take time :) And we have to remember there is a significant cost to splitting repos too: among other things it forces a clean split with an API something like FSharp.Compiler.Service.

But I personally agree that any serious work on the language service and project system will be very difficult to do in the same repo as compiler+core-library. It's obvious that people prefer to coalesce around single-purpose repos (e.g. like Visual F# Power Tools and most other repos).

However FSharp.Core is a different matter: for some reason it makes me very queasy to think of splitting that out. It's hard for me to rationalize why though. Perhaps it's because I'm looking at making fixes for quotations right now, which is a particularly sensitive area requiring potential compiler/runtime changes.

or more probably i am using a newer compiler with an old/specific FSharp.Core from nuget.

OK. FWIW we do do some back-testing of using the compiler against previous versions of FSharp.Core and we can always add more.

For example we can compile easy the fsharp.core with a coreclr compiler, and with current compiler, easy (again build matrix).

I think we cover compiling FSharp.Core in multiple ways already just by having the different branches. As far as I see being in different repos would only make that harder right now (e.g. you would need a coreclr branch in both repos, and work between them may need to be coordinated with an endless stream of "use this latest coreclr compiler please" updates).

enricosada commented 8 years ago

@dsyme a coreclr branch is not needed, only a master (or development, depending on the workflow) branch of FSharp.Core and nuget prerelease packages.

It's a library, you don't need a portable branch.

I understand that testing everything together is nice, but you can do that too: with both repos checked out locally, the FSharp.Core repo uses the fsc compiled by the other repository. It's only an env var, like the back-testing now.
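As a rough sketch of the "it's only an env var" idea: a build script in a separate FSharp.Core repo could pick its compiler as below. The variable name `FSC_PATH` and the fallback to `fsc` on the PATH are assumptions made for this illustration, not names used by the actual build scripts.

```python
import os


def resolve_fsc(env=None, default="fsc"):
    """Pick the compiler used to build FSharp.Core.

    FSC_PATH is a hypothetical variable name used only in this sketch. When it
    is set, it points at an fsc built from a local checkout of the compiler
    repo; otherwise the default compiler on the PATH is used.
    """
    env = os.environ if env is None else env
    override = env.get("FSC_PATH", "").strip()
    return override if override else default


if __name__ == "__main__":
    print("building FSharp.Core with:", resolve_fsc())
```

With both repos checked out side by side, setting the variable to the freshly built compiler gives the "test everything together" workflow, and leaving it unset builds against a stable compiler.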

I know we already test multiple compilers; that's not the reason for the split (I list lots of pros and possibilities to show it's OK to do it).

The real reason is "easier to contribute"; the other reasons are nice to have.

dsyme commented 8 years ago

OK thanks.

Yes, some ways it makes it easier to contribute are:

That said I really don't see this split happening any time soon, it's simply too disruptive and we've bugs to fix and features to implement. But perhaps we can work on DEVGUIDE.md and TESTGUIDE.md to make it much clearer how to develop and test the different subsystems.

enricosada commented 8 years ago

Sorry @dsyme, I am really trying to find something acceptable to do to make development/contribution easier. Your PR about the solution reorg is awesome; it's not only bugfixes we need, but also a curated repo. But if the repo is complicated, the guides are complicated too.

OK, physically (whatever that means with bits) splitting the repo is harder to do.

Maybe we can reorganize so that, within this repo, we add more solutions? And split the scripts/tests so we can run parts (like FSharp.Core only).

Obviously, CI and the development rules require you to run the full suites before sending a PR, but contributing is easier. You can develop in FSharp.Core.sln and, when ready, run all the tests. If they pass, no additional work. Otherwise, you need to open the big solution.

I think we can split the test suites (split means creating additional fsprojs and linking files).

For example `tests\fsharp\FSharp.Tests.fsproj` (the Cambridge suite) becomes:

Different file lists inside each project. No need to change the test files, only the projects.

FSharp.Tests.FSharpCore.fsproj + FSharp.Tests.Compiler.fsproj = current FSharp.Tests.fsproj
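The equation above amounts to partitioning one file list into two without losing or duplicating anything. A minimal sketch of that check follows; the file names and the `core/` prefix rule are made up for illustration and are not the real contents of FSharp.Tests.fsproj.

```python
def split_tests(all_files, is_core_test):
    """Partition one file list into (FSharp.Core tests, compiler tests)."""
    core = [f for f in all_files if is_core_test(f)]
    compiler = [f for f in all_files if not is_core_test(f)]
    return core, compiler


def is_lossless(all_files, core, compiler):
    """True when the two lists are disjoint and together equal the original list."""
    disjoint = not (set(core) & set(compiler))
    complete = sorted(core + compiler) == sorted(all_files)
    return disjoint and complete


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the functions.
    files = ["core/arrays.fsx", "core/seq.fsx", "typecheck/sigs.fs"]
    core, compiler = split_tests(files, lambda f: f.startswith("core/"))
    print("FSharp.Tests.FSharpCore:", core)
    print("FSharp.Tests.Compiler:", compiler)
    print("lossless split:", is_lossless(files, core, compiler))
```

A check like this could run in CI so the two smaller projects never drift from the full suite.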

Same for fsharpqa (we can do that now with tags in tests\fsharpqa\Source\test.lst), no need to wait for the fsharp migration.

So the contributing guide for FSharp.Core is easier.

Building everything together is nice, but it should be possible to reference paths directly, instead of only projects.

Is that a possible way?

dsyme commented 8 years ago

Ok. For now do these please:

And fully review and update DEVGUIDE.md :)

That will do for now. It's zero churn and makes things much clearer. Don't split the unittests - I don't think it's at all easy to split FSharp.Tests cleanly.

dsyme commented 8 years ago

Closing this old discussion for now