Hi @DunetsNM That's really odd, I don't see why simply switching to net7/8 would increase compile time. Some questions: Do you use SRTP in codecs? Last time I saw your codebase most codecs were explicit, except a few cases. Would it be possible to create a small repro?
@vzarytovskii do you have any idea of what could slow down in net7/8 ?
Yes, the introduction of static abstracts (both the compiler feature and their presence in the BCL) slows it down significantly, especially in trait-heavy code. It now needs to check many more things to see whether they satisfy traits.
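(For illustration, a minimal sketch that is not from this thread: the inferred SRTP constraints on a tiny inline function like the one below are solved again for every concrete call site, and on net7+ types such as int and double additionally implement the System.Numerics static-abstract interfaces, which, per the comment above, gives the constraint solver far more candidates to examine.)

// Simplified sketch, not Fleece code: the compiler infers SRTP constraints for (+)
// and Zero here and re-solves them for each concrete instantiation below.
let inline sumAll xs =
    List.fold (+) LanguagePrimitives.GenericZero xs

let intSum   = sumAll [ 1; 2; 3 ]       // one constraint solution
let floatSum = sumAll [ 1.0; 2.0; 3.0 ] // another one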
I see, I knew about that, but I didn't know the slowdown was so bad (about 3x).
@DunetsNM if codecs in your codebase don't heavily use SRTPs (implicit codecs), it could be Fleece internals. If so, I can create a branch hardcoding some internal stuff, and therefore removing static resolution in the library. Last time I tried that, the improvement was really marginal, but maybe the situation is different now.
Please create an issue over at the compiler repo, with build instructions; I will extract profiles for us to see what exactly changed.
Thanks @vzarytovskii
@DunetsNM can you contact me via FSSF Slack so we can discuss what's the best approach to follow?
thanks for the quick replies guys, I got sick today, will create a smaller repro when I get better / soon.
Do you use SRTP in codecs ? Last time I saw your codebase most codecs were explicit except a few cases.
Correct, nearly everything is explicit
Extra compilation time is pretty obvious even in a simple flat list of types with codecs - although only 2x; perhaps it gets worse if there's a lot of nesting?
Here's a dummy example:
For better readability, here are the contents of the files:
Types.fs
module Example.Types
[<RequireQualifiedAccess>]
type DummyUnionType1 =
| CaseA of string
| CaseB of string
| CaseC of string
[<RequireQualifiedAccess>]
type DummyUnionType2 =
| CaseA of string
| CaseB of string
| CaseC of string
[<RequireQualifiedAccess>]
type DummyUnionType3 =
| CaseA of string
| CaseB of string
| CaseC of string
...
[<RequireQualifiedAccess>]
type DummyUnionType10 =
| CaseA of string
| CaseB of string
| CaseC of string
type DummyRecordType = {
    X1: DummyUnionType1
    X2: DummyUnionType2
    X3: DummyUnionType3
    X4: DummyUnionType4
    X5: DummyUnionType5
    X6: DummyUnionType6
    X7: DummyUnionType7
    X8: DummyUnionType8
    X9: DummyUnionType9
    X10: DummyUnionType10
}
open Fleece
open System.Reflection
open Microsoft.FSharp.Reflection
open FSharpPlus.Data
open FSharpPlus
// Builds one placeholder value per union case via reflection, applies the per-case
// codec function and merges the resulting codecs with `choice`.
let inline mergeUnionCases (fn: 'Union -> '``Codec/Decoder<PropertyList<'Encoding>, 'Union>``) : '``Codec/Decoder<PropertyList<'Encoding>, 'Union>`` =
    FSharpType.GetUnionCases (typeof<'Union>, BindingFlags.NonPublic ||| BindingFlags.Public)
    |> NonEmptySeq.ofArray
    |> NonEmptySeq.map (fun c -> FSharpValue.MakeUnion (c, Array.zeroCreate (Array.length (c.GetFields ())), BindingFlags.NonPublic ||| BindingFlags.Public) :?> 'Union)
    |> NonEmptySeq.map fn
    |> choice
type DummyUnionType1 with
    static member get_Codec () =
        (function
         | CaseA _ -> CaseA <!> jreqWith Codecs.string "CaseA" (function (CaseA x) -> Some x | _ -> None)
         | CaseB _ -> CaseB <!> jreqWith Codecs.string "CaseB" (function (CaseB x) -> Some x | _ -> None)
         | CaseC _ -> CaseC <!> jreqWith Codecs.string "CaseC" (function (CaseC x) -> Some x | _ -> None))
        |> mergeUnionCases
        |> ofObjCodec
type DummyUnionType2 with
    static member get_Codec () =
        (function
         | CaseA _ -> CaseA <!> jreqWith Codecs.string "CaseA" (function (CaseA x) -> Some x | _ -> None)
         | CaseB _ -> CaseB <!> jreqWith Codecs.string "CaseB" (function (CaseB x) -> Some x | _ -> None)
         | CaseC _ -> CaseC <!> jreqWith Codecs.string "CaseC" (function (CaseC x) -> Some x | _ -> None))
        |> mergeUnionCases
        |> ofObjCodec
...
type DummyUnionType10 with
    static member get_Codec () =
        (function
         | CaseA _ -> CaseA <!> jreqWith Codecs.string "CaseA" (function (CaseA x) -> Some x | _ -> None)
         | CaseB _ -> CaseB <!> jreqWith Codecs.string "CaseB" (function (CaseB x) -> Some x | _ -> None)
         | CaseC _ -> CaseC <!> jreqWith Codecs.string "CaseC" (function (CaseC x) -> Some x | _ -> None))
        |> mergeUnionCases
        |> ofObjCodec
type DummyRecordType with
    static member get_Codec () =
        codec {
            let! x1 = jreqWith (DummyUnionType1.get_Codec()) "X1" (fun x -> Some x.X1)
            and! x2 = jreqWith (DummyUnionType2.get_Codec()) "X2" (fun x -> Some x.X2)
            and! x3 = jreqWith (DummyUnionType3.get_Codec()) "X3" (fun x -> Some x.X3)
            and! x4 = jreqWith (DummyUnionType4.get_Codec()) "X4" (fun x -> Some x.X4)
            and! x5 = jreqWith (DummyUnionType5.get_Codec()) "X5" (fun x -> Some x.X5)
            and! x6 = jreqWith (DummyUnionType6.get_Codec()) "X6" (fun x -> Some x.X6)
            and! x7 = jreqWith (DummyUnionType7.get_Codec()) "X7" (fun x -> Some x.X7)
            and! x8 = jreqWith (DummyUnionType8.get_Codec()) "X8" (fun x -> Some x.X8)
            and! x9 = jreqWith (DummyUnionType9.get_Codec()) "X9" (fun x -> Some x.X9)
            and! x10 = jreqWith (DummyUnionType10.get_Codec()) "X10" (fun x -> Some x.X10)
            return {
                X1 = x1
                X2 = x2
                X3 = x3
                X4 = x4
                X5 = x5
                X6 = x6
                X7 = x7
                X8 = x8
                X9 = x9
                X10 = x10
            }
        }
        |> ofObjCodec
Net6Client.fsproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net6.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Fleece.SystemTextJson" Version="0.10.0" />
  </ItemGroup>
  <ItemGroup>
    <Compile Include="Types.fs" />
  </ItemGroup>
</Project>
Net7Client.fsproj (one line diff)
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net7.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Fleece.SystemTextJson" Version="0.10.0" />
  </ItemGroup>
  <ItemGroup>
    <Compile Include="Types.fs" />
  </ItemGroup>
</Project>
global.json (SDK 7.0.300 performs similarly)
{
  "sdk": {
    "version": "8.0.300",
    "rollForward": "latestMinor"
  }
}
If you run
dotnet build Net6Client.fsproj
dotnet build Net7Client.fsproj
it takes ~3.5 sec for net6 vs ~7 sec for net7 on my machine (YMMV). This example uses only explicit codec passing, but implicit codecs didn't make any difference here (perhaps because the hierarchy is too simple / flat).
Also, if you try to target Fleece.fsproj itself to net7.0 you will see a proportional slowdown in compilation time.
So, if any SRTP-heavy code is affected, is there anything that can be done about it, apart from pinning to net6? If, in theory, we made a fork of Fleece that uses IWSAMs as the main vehicle for compile-time codec resolution, would it make the library and client code any faster to compile?
I mean something like this:
type IHaveDefaultCodec<'ImplType when 'ImplType :> IHaveDefaultCodec<'ImplType>> =
    static abstract member Codec<'Encoding when 'Encoding :> Fleece.IEncoding and 'Encoding : (new: unit -> 'Encoding)> : unit -> Fleece.Codec<'Encoding, 'ImplType>
type OrderDirection =
    | Ascending
    | Descending
    with
        static member get_Codec () =
            (function
             | Ascending -> konst Ascending <!> jreqWith Codecs.unit "Ascending" (function Ascending -> Some () | _ -> None)
             | Descending -> konst Descending <!> jreqWith Codecs.unit "Descending" (function Descending -> Some () | _ -> None))
            |> mergeUnionCases
            |> ofObjCodec
        interface IHaveDefaultCodec<OrderDirection> with
            static member Codec() = OrderDirection.get_Codec ()
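(For illustration, a hypothetical consumer of that interface; getDefaultCodec is not part of Fleece. With an IWSAM constraint the codec comes from the static abstract member, so no inline code and no SRTP member-constraint solving is needed at the call site:)

// Hypothetical helper, assuming the IHaveDefaultCodec sketch above.
let getDefaultCodec<'Encoding, 'T
        when 'Encoding :> Fleece.IEncoding
        and 'Encoding : (new: unit -> 'Encoding)
        and 'T :> IHaveDefaultCodec<'T>> () : Fleece.Codec<'Encoding, 'T> =
    // resolved through the interface constraint, no inlining required
    'T.Codec ()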
I guess it depends on where the bottleneck is. Is it the emitting of inlined code (which can't be avoided in the case of SRTP, but isn't required for IWSAMs), or computationally heavy type resolution?
@DunetsNM hope you feel better today. Thanks for the repro, I will have a look to find out where the bottleneck is on the code side.
Regarding IWSAMs, there's already #132, have you tried it? And there's even a nuget for it.
I see now, in your last code snippet you mean using IWSAMs for implicit codecs. One important limitation with that approach is retrofitting primitive types: that's not possible with IWSAMs unless you control the source code.
@gusty yes I've played with #132 before (didn't switch because our code still requires AdHocEncoding), just double-checked that the compilation speed difference is about the same. But it uses IWSAMs only for IEncoding, which is a relatively small part of Fleece.
Also, you don't really need my examples, the Fleece library itself is affected:
FleeceNet6.fsproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>net6.0</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="Fleece.fs" />
  </ItemGroup>
  <ItemGroup>
    <PackageReference Include="FSharpPlus" Version="1.2.4" />
  </ItemGroup>
</Project>
FleeceNet7.fsproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>net7.0</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="Fleece.fs" />
  </ItemGroup>
  <ItemGroup>
    <PackageReference Include="FSharpPlus" Version="1.2.4" />
  </ItemGroup>
</Project>
Here dotnet build FleeceNet6.fsproj and dotnet build FleeceNet7.fsproj take ~11 sec and ~25 sec respectively for me.
Let me try the same with FSharpPlus, which Fleece uses a lot ...
So for current FSharpPlus master, build times are:
net6.0: 1:46
net7.0: 2:43
net8.0: 2:47
@vzarytovskii I had no luck reproducing the SRTP / inline slowdown with dummy vanilla code; it's only the Fleece and/or FSharpPlus projects where the slowdown in build time is obvious. Dissecting it further to find the bottleneck is not an obvious task for me. Should that still be reported to the F# compiler repo, or does it rather belong to those projects (as they take SRTP usage to its extremes, not your typical F# compiler usage)?
Just in case, here's my dumb inline-heavy SRTP code that compiles in about the same time for net6, net7 and net8 (~7 seconds in all cases).
Contents of the source file:
module Example
type Record =
    { Number: int }
    static member Zero() = { Number = 0 }
let inline fmap1 (f: ^a -> ^b) (a: ^a list) =
    printfn "fmap1"
    List.map f a

let inline fmap2 f a =
    printfn "fmap2"
    fmap1 f a

let inline fmap3 f a =
    printfn "fmap3"
    fmap2 f a

...

let inline fmap99 f a =
    printfn "fmap99"
    fmap98 f a

let inline fmap100 f a =
    printfn "fmap100"
    fmap99 f a
let main () =
    fmap1 (fun x -> System.Int32.Parse(x)) ["1";"2";"3";"4"] |> ignore
    fmap1 (fun x -> System.Int32.Parse(x)) ["1";"2";"3";"4"] |> ignore
    fmap2 (fun x -> x.ToString()) [1;2;3;4] |> ignore
    fmap2 (fun x -> System.Int32.Parse(x)) ["1";"2";"3";"4"] |> ignore
    ...
    fmap100 (fun x -> x.ToString()) [1;2;3;4] |> ignore
    fmap100 (fun x -> System.Int32.Parse(x)) ["1";"2";"3";"4"] |> ignore
    0
Sure, still report it, but with project examples.
@DunetsNM thanks for all the repros. On my side I can confirm that heavy use of SRTPs definitely becomes almost 2x slower, or more, as of .net7.
We can help @vzarytovskii investigate and hopefully fix whatever is causing that bottleneck in the compiler, but as I pointed out at the beginning I am a bit surprised, since I know your codebase and AFAIK it doesn't rely that much on heavy SRTP.
However, after seeing your repro I found a small detail that causes an unnecessary level of genericity, but good news: it's fixable.
The problem is this line:
open FSharpPlus
Well, not that line alone, but the issue is that codecs use the <!> operator, and F#+ auto-opens its generic operators when you open the FSharpPlus namespace.
This is something we're planning to change in v2 of the library, to avoid precisely situations like this. It is still not clear what the new namespace design will be (suggestions are welcome).
So it's a bit tricky right now, since the code in the end routes to the same implementation but takes longer to compile, so you don't notice anything breaking. But making sure FSharpPlus is not open at the codec definition should fix the long compilation issue.
If just a few functions are needed from FSharpPlus you can simply fully qualify them; for instance in your repro you can do:
let inline mergeUnionCases (fn: 'Union -> '``Codec/Decoder<PropertyList<'Encoding>, 'Union>``) : '``Codec/Decoder<PropertyList<'Encoding>, 'Union>`` =
    FSharpType.GetUnionCases (typeof<'Union>, BindingFlags.NonPublic ||| BindingFlags.Public)
    |> NonEmptySeq.ofArray
    |> NonEmptySeq.map (fun c -> FSharpValue.MakeUnion (c, Array.zeroCreate (Array.length (c.GetFields ())), BindingFlags.NonPublic ||| BindingFlags.Public) :?> 'Union)
    |> NonEmptySeq.map fn
    |> FSharpPlus.Operators.choice
... or create a small namespace in your project aliasing these functions.
Another alternative could be doing exactly the opposite, I mean, keep opening FSharpPlus but open later another namespace which shadows the <!> operator.
As said, we'll fix this in v2 of F#+, as it will focus more on less generic code by default. Most likely you'll have to explicitly open a "generic operators" namespace.
The fact that changing the target framework alone changes it leads me to believe that the change in the interface hierarchy causes it.
@vzarytovskii I'll try to isolate the compiler issue so you have a smaller repro, but it doesn't seem to be easy. Still I'll give it a try.
@DunetsNM I had a typo in my previous response, I meant the <!> operator. The <|> operator could also be affected, but it's not in your repro code. I edited my answer.
leads me to believe that change in the interface hierarchy causes it
@dsyme just made this commit to the IWSAM RFC (https://github.com/fsharp/fslang-design/commit/71cd2e04bc181264a6a68c731ff018369848d1f9).
For the specific case of IWSAMs, the presence of enormous generic interface lists can cause compiler slow-down. No one ever expected such lists of interfaces in .NET; this is a volcano of hidden complexity lying under the simple type double or decimal. [3/6/2024: as prophesized, an example of how this feature led to this in real life can be seen here]
This suggests that @vzarytovskii's comment is probably spot on indeed.
This suggests that @vzarytovskii's comment is probably spot on indeed.
My comment was based on Vlad's analysis, and doesn't add new information.
Oh, oops 😆
@DunetsNM Could you confirm whether my suggestion fixed the issue or not?
@gusty somehow I missed that, will try shortly. I suppose you mean these suggestions?
making sure FSharpPlus is not open at the codec definition should fix the long compilation issue.
and
Another alternative could be doing exactly the opposite, I mean, keep opening FSharpPlus but open later another namespace which shadows the <!> operator.
@gusty it helped! The only change I had to make in my slower_net7.zip example was to remove open FSharpPlus and qualify choice as Fleece.Operators.choice. Now both net6 and net7 take ~3.5 sec to compile.
And in my prod project the problem was literally just one open FSharpPlus, which was used only for three occurrences of tryParse. Compilation time dropped instantly from ~2:30 to just ~0:30.
What else is interesting: this open FSharpPlus occurred very early in the compilation order, in a module with some reusable types. If I declare a similar type in the module which comes last in compilation order, then compilation speed deteriorates "only" to 1:40 (still very bad). Scratch that, I think it was just a bigger number of <!> and <* usages in the first file compared to the other one (see next comment).
I'm not sure if we can single out FSharpPlus as the main culprit just yet. My project still has plenty of open FSharpPlus in other source files that don't use codecs, and those still compile fast. Somehow it's the combination of Fleece usages and open FSharpPlus in the same module. I need to try to make as tiny a module as possible that still compiles ridiculously slowly.
It would also be interesting to know whether the compiler could help catch such a problem early and emit a warning if some symbol resolution takes a very long time.
Well, not that line alone, but the issue is that codecs use the <!> operator, and F#+ auto-opens its generic operators when you open the FSharpPlus namespace.
This is something we're planning to change in v2 of the library (https://github.com/fsprojects/FSharpPlus/issues/288), to avoid precisely situations like this. It is still not clear what the new namespace design will be (suggestions are welcome).
Ok, besides the <!> operator my code also uses the <* operator; they are both present in FSharpPlus and in Fleece, and both are responsible. To bring compilation speed back to normal it's enough to define an unambiguous operator such as let (<!!!>) f field = Fleece.Codec.map f field and use it instead of <!> (I was unable to quickly provide a uniquely named alternative for Fleece's <*, so I just commented out those few usages), and I can still keep open FSharpPlus.
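(Spelled out, the workaround described above looks something like this; the <!!!> name is arbitrary, anything that doesn't collide with the FSharpPlus operators works:)

// Uniquely named alias for Fleece's codec map, based on the definition quoted above,
// so the generic FSharpPlus <!> is never considered during resolution.
let (<!!!>) f field = Fleece.Codec.map f field

// then in codec definitions, instead of `CaseA <!> jreqWith ...`:
// CaseA <!!!> jreqWith Codecs.string "CaseA" (function CaseA x -> Some x | _ -> None)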
So it's definitely the naming clash between the two libraries; I can only guess what heavy lifting the compiler performs while deciding which operator to inline. Somehow my hunch is that it's still a Fleece issue, and that an operator name collision between FSharpPlus and a simpler library would not affect compilation time as much.
Maybe we should simply give Fleece operators unique names, or provide plain function alternatives?
We can summarize this issue as a combination of 4 things:
1 - F# ALWAYS treats let-bound operators with higher priority than static member operators (see the sketch below)
2 - open FSharpPlus opens generic operators
3 - Fleece has a less generic definition of those same operators
4 - SRTP compilation performance degraded as of .net7
It is enough to remove any one of the 4 facts and the issue is gone, but this is like aircraft accidents: it's not a single event but a chain of events together that causes the issue.
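(A minimal sketch of point 1, with made-up types rather than Fleece code:)

type Box =
    | Box of int
    // a static member operator, analogous to Fleece defining <!> on its codec types
    static member (<!>) (f: int -> int, b: Box) = let (Box x) = b in Box (f x)

module WithLetBoundOperator =
    // a generic let-bound <!>, similar to the one auto-opened by FSharpPlus
    let (<!>) f xs = List.map f xs

    // let r = ((+) 1) <!> Box 3
    // ^ does not compile here: the let-bound <!> always wins and expects a list;
    //   Box's static member (<!>) is never even considered.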
@DunetsNM I think as a "medium term" solution (see below) changing the Fleece operators is not that great; it is good that they match, since it reflects that they are effectively the same (only less generic). Moreover, many types in FSharpPlus.Data use the same approach: operators defined on the type which are the same as the generic ones, but of course less generic.
I can think of 3 actions:
Short term solution: As I suggested, maybe the last option is better than fully qualifying. What I mean is you can keep open FSharpPlus where you have it now and define somewhere something like:
module FleeceCodecOperators =
    open Fleece
    let (<!>) x y = (<!>) x y
    let (<|>) x y = (<|>) x y
    let (<*>) x y = (<*>) x y
    let ( *> ) x y = ( *> ) x y
... and so on
then, right before defining your codecs, you open FleeceCodecOperators, and the generic operators will be shadowed.
Medium term solution: Finish and release FSharpPlus v2, without auto-opening operators. This would solve this issue, but it's not that great for other uses of F#+, because you will have to do something like open FSharpPlus.Operators.Generic; but what if I want just a portion of the generic operators and not the rest?
Long term solution: Create a suggestion, RFC and implementation of something (maybe an attribute?) in the F# compiler which allows defining let-bound operators with lower priority than static members, as I can see this problem will affect code outside FSharpPlus in the long run. @vzarytovskii, @dsyme what do you think?
Besides the clash of the <!> and <* operators between Fleece and FSharpPlus, implicit codec resolution is also very expensive on net7 when nothing is known about the generic type at compile time. I.e. jreq, or the equivalent jreqWith Fleece.Operators.defaultOf<_, 't>, where 't is a truly generic type parameter whose argument can be anything (a primitive, unit, a tuple of other types, or a custom F# type with a suitable static member Codec), can cost a couple of seconds of extra compilation time for every invocation (!).
When 't is reasonably constrained (e.g. it implements some interface, which statically eliminates tuples, primitives etc.) then defaultCodec<_, 't> compiles fast - just like for a non-generic type.
Fortunately there are not that many generic types with codecs where a truly unconstrained type parameter is required; however, even when it's not required based on actual usages, sometimes it's simply not enforced in the type signature, which makes defaultCodec a potential footgun.
To work around that I have a custom function that is less universal than defaultCodec (it works only for types with a static member Codec) but at least reliably compiles fast:
let inline codecFor<'Encoding, 'T when 'T : (static member Codec: Codec< 'Encoding, 'T>) > =
    (^T : (static member Codec: Codec< 'Encoding, ^T>) ())
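(For illustration, hypothetical usage; MyEncoding and MyType are placeholders, and MyType is assumed to expose static member Codec : Codec<MyEncoding, MyType>:)

// let c : Codec<MyEncoding, MyType> = codecFor<MyEncoding, MyType>
// The member constraint rules out tuples, primitives and other shapes up front,
// which is what keeps resolution cheap compared to the fully generic defaultCodec.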
I guess that would still need generic resolution for primitive types inside 'T.
Here there's no way around it; so far the only mechanism available in F# to retrofit something interface-like is doing this kind of complex SRTP overloads, at least until FS-1043 is finished and released.
BTW: Can we close this issue, since the problem was identified and a solution (actually more than one) proposed?
We should continue this in FSharpPlus v2 and also open a new F# compiler issue (and maybe a suggestion for the operators scope), linking this one.
yes I guess we can close it with "workaround exists"
Yes, we can conclude the main issue is the F# compiler degradation; you were lucky you were not relying on generic code, only by accident.
I have a project with a lot of complex Fleece codecs.
Codecs have never been too fast to compile (due to heavy usage of SRTP), but after I tried to upgrade the <TargetFramework> value from net6.0 to net7.0 or net8.0 it got worse. The project used to take 35 to 45 seconds to compile; with net7.0 it deteriorated to circa 2:30, with a similar number for net8.0.
It doesn't look like it's related to some specific codecs, just the sheer amount of them. The more I delete/comment out, the faster compilation becomes (dropping to just 15 seconds when all codecs are removed - same number for any target, net6/net7/net8, i.e. the slowness is 100% codec-related).
There were many changed parts between the net6 and net7/net8 versions of my projects (e.g. different versions of FSharp.Core and FSharpPlus, newer SDK), but apparently the single cause of the compile time degradation is the TargetFramework value: if I change it back to net6.0 while keeping all other changes, compilation time improves / goes back to normal. I tried to fork Fleece, added net7.0 and net8.0 targets to it and consumed it from my project. It did not help. The main suspect is SRTP and inlining, which is heavily used in codecs. Did something change in newer F# versions that made it so much slower to compile on average?