haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal

Is the `CabalParsing` class necessary? #10080

Closed: FinleyMcIlwaine closed this 5 months ago

FinleyMcIlwaine commented 5 months ago

Describe the feature request

The usage of the CabalParsing class introduces a lot of overloaded calls in the .cabal file parser. Removing the class, and specializing references to it to ParsecParser (which, as far as I can tell, is the only instance of CabalParsing ever written), results in a 30% improvement in parse times and a 20% reduction in parser allocations. I have opened this PR with this change.

Additional context

I am not very familiar with how the Cabal-syntax package is consumed, so the change above may not be desirable or possible. In any case, I thought I would open the PR and ask the question here, since this could be a nice performance win.
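The change described above can be sketched in miniature. These are toy stand-ins, not the real Cabal-syntax types: `TinyParsing` mimics the role of CabalParsing, and `P` mimics ParsecParser, its only instance.

```haskell
import Data.Char (isDigit)

-- Toy class standing in for CabalParsing.
class Monad m => TinyParsing m where
  satisfy :: (Char -> Bool) -> m Char

-- Toy concrete parser standing in for ParsecParser.
newtype P a = P { runP :: String -> Maybe (a, String) }

instance Functor P where
  fmap f (P g) = P (fmap (\(a, r) -> (f a, r)) . g)

instance Applicative P where
  pure a = P (\s -> Just (a, s))
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad P where
  P g >>= k = P (\s -> g s >>= \(a, r) -> runP (k a) r)

instance TinyParsing P where
  satisfy p = P (\s -> case s of
    (c:cs) | p c -> Just (c, cs)
    _            -> Nothing)

-- Before: overloaded, so every call goes through a class dictionary.
digitOverloaded :: TinyParsing m => m Char
digitOverloaded = satisfy isDigit

-- After: the constraint is replaced by the concrete parser type,
-- so GHC resolves every call statically and can inline freely.
digitConcrete :: P Char
digitConcrete = satisfy isDigit
```

The proposal is essentially to rewrite every `CabalParsing m => m a` signature in the second form, trading the abstraction for the specialization win.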

phadej commented 5 months ago

Yes.

This is a MAJOR breaking change.

If you do this, you could rip out the whole CharParsing machinery, which exists so that people who don't like parsec don't need to use parsec (e.g. they could use megaparsec to get better error messages; that requires work, but currently it's possible).

As an author of CharParsing, I think this change, while it has performance benefits, is not acceptable.

tbidne commented 5 months ago

If the problem is specialization, is it possible that INLINEABLE may yield performance wins? I.e. adding {-# INLINEABLE foo #-} to every function with a signature like foo :: CabalParsing m => m Int (I realize there are many).
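The suggestion above can be sketched as follows, with hypothetical names (the real overloaded functions live in Cabal-syntax). INLINEABLE keeps the unfolding of a class-polymorphic function in the interface file, so GHC can specialize it at monomorphic use sites, including those in other modules.

```haskell
-- Toy class and instance standing in for CabalParsing/ParsecParser.
class Monad m => TinyParsing m where
  item :: m Char

-- The pragma is the point: without it, the body of twoItems may not
-- be available for cross-module specialization.
{-# INLINEABLE twoItems #-}
twoItems :: TinyParsing m => m (Char, Char)
twoItems = do
  a <- item
  b <- item
  pure (a, b)

newtype L a = L { runL :: String -> Maybe (a, String) }

instance Functor L where
  fmap f (L g) = L (fmap (\(a, r) -> (f a, r)) . g)

instance Applicative L where
  pure a = L (\s -> Just (a, s))
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad L where
  L g >>= k = L (\s -> g s >>= \(a, r) -> runL (k a) r)

instance TinyParsing L where
  item = L (\s -> case s of
    (c:cs) -> Just (c, cs)
    []     -> Nothing)

-- Monomorphic use site where GHC can create a specialized copy
-- of twoItems for L, removing the dictionary passing.
firstTwo :: String -> Maybe ((Char, Char), String)
firstTwo = runL twoItems
```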

phadej commented 5 months ago

This patch demonstrates how you could begin adding SPECIALIZE pragmas to remove some of the overloading. However, the performance improvements aren't as good ... and the specialization is somewhat fragile, since in some cases we need to delay inlining to ensure the rewrite rules can fire before worker/wrapper.

Then fix GHC.

What you do in this patch is manual optimization. If specialization is fragile, that's a GHC bug. If specialization and worker/wrapper interfere with each other's phases, that's also a GHC bug.

Many have asked for 100% guaranteed monomorphisable abstraction (like C++ templates and Rust traits), and GHC should be able to deliver it. In C++ and Rust you can write code at a higher abstraction level (i.e. against an interface) with guaranteed optimisation behaviour. GHC should be able to do that too.


IMO 55 milliseconds per file is an acceptable absolute value (IIRC this value is also skewed by a few outliers, such as huge files like acme-everything; it would be better to compare medians). If there is an abstraction cost, it's fine. Hopefully GHC will do better in the future.
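The SPECIALIZE approach discussed above can be sketched like this (toy names, not the actual patch): the function keeps its polymorphic signature, a SPECIALIZE pragma asks GHC to compile a monomorphic copy, and a phased NOINLINE delays inlining so the specialization rewrite rule has a chance to fire before worker/wrapper runs.

```haskell
-- Toy class and instance standing in for CabalParsing/ParsecParser.
class Monad m => TinyParsing m where
  item :: m Char

newtype P a = P { runP :: String -> Maybe (a, String) }

instance Functor P where
  fmap f (P g) = P (fmap (\(a, r) -> (f a, r)) . g)

instance Applicative P where
  pure a = P (\s -> Just (a, s))
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad P where
  P g >>= k = P (\s -> g s >>= \(a, r) -> runP (k a) r)

instance TinyParsing P where
  item = L'
    where L' = P (\s -> case s of
            (c:cs) -> Just (c, cs)
            []     -> Nothing)

-- Don't inline before simplifier phase 2, so the SPECIALIZE
-- rewrite rule can replace overloaded calls with the monomorphic
-- copy first; this is the fragile phase interaction mentioned above.
{-# NOINLINE [2] pair #-}
{-# SPECIALIZE pair :: P (Char, Char) #-}
pair :: TinyParsing m => m (Char, Char)
pair = (,) <$> item <*> item
```

The fragility is that the pragmas must be maintained by hand on many functions, whereas removing the class (or better GHC specialization) would make the monomorphisation automatic.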

phadej commented 5 months ago

Recall that Generic LicenseId was also removed because GHC was "too slow" to compile it. And no-one is looking at making GHC.Generics viable for large types, partly because the problem is swept under the rug.

Now, you propose to sweep CabalParsing away as well, so that the specialization issues aren't right there for GHC developers to work on.

I don't like this trend.

FinleyMcIlwaine commented 5 months ago

Closing following the discussion in #10081.