Closed angerman closed 6 years ago
@angerman Awesome, thank you very much for recording your thoughts here.
@snowleopard Would you be okay with defining simple parsers for this using parsec
, given that it's now a dependency of Cabal
and is therefore already a (transitive) dependency ?
@alpmestan I guess there is no way back for us in terms of dependency on Cabal, so yes -- I see no reasons to avoid using Distribution.Parsec
.
One slight issue is that Cabal's codebase seems to be quite turbulent: Hadrian is often broken by changes in Cabal, and Distribution.Parsec
might end up being one more way to accidentally break Hadrian.
@snowleopard I was actually just speaking about parsec
itself, not Cabal's parsers. We don't need something terribly fancy, and we do have to map things to our own types anyway (Way
, Stage
, Package
, Context
and so on). So we could just define a few parsec parsers that extract bits and pieces from filepaths, not really exposing ourselves much to breakage.
Ah, I see. Well, depending on one more GHC library seems like an unnecessary high cost, both complexity wise and compilation time wise -- there are two dozen extra modules to compile!
If all we want is to parse ${root}/${stage}/lib/${pkgId}/libHS${pkgId}${waySuffix}.a
, then splitPath and stripPrefix are sufficient and will do the job at much lower cost.
Well, we already depend on Cabal, so we already build parsec whenever we build hadrian now. If you however still would like to avoid the direct dependency on parec, I'll do without it, using something like what you suggest.
Well, we already depend on Cabal, so we already build parsec whenever we build hadrian now
@alpmestan Ah, indeed -- that's a good point. So, you are saying that the cost of using parsec
is actually zero? We already transitively depend on it, and we already build it. In this case, let's give parsec
a try 👍
@snowleopard Exactly, the cost is zero, but the usability/readability of the code will be greater IMO. I'm preparing a proof-of-concept patch with parsec that pretty much parses the Context
out of filepaths, at least just for Rules.Library
for now.
@alpmestan Thanks, sounds good! Sorry for being slow today :)
I know the feeling all too well myself, don't worry :-)
@snowleopard @angerman I have a WIP branch at https://github.com/snowleopard/hadrian/compare/master...alpmestan:generic-library-rules that implements Moritz's suggestion. I still seem to be able to build a GHC just fine if I run the ./boot && ./configure
step myself explicitly, but not if I start from a clean source tree and run hadrian with -c
. The error is:
[hadrian/cfg/system.cfg, settings, mk/config.h]
about to need ["hadrian/cfg/system.config.in","settings.in","mk/config.h.in"]
about to run ./configure
in doWith
builder ready
shakeArgsWith 0.000s 0%
Function shake 0.272s 65% =========================
Database read 0.001s 0%
With database 0.000s 0%
Running rules 0.140s 33% ============
Pool finished (14 threads, 4 max) 0.000s 0%
Total 0.414s 100%
Build system error - indirect recursion detected:
Key value 1: OracleQ (KeyValue ("hadrian/cfg/system.config","host-os"))
Key value 2: hadrian/cfg/system.config
Key value 3: hadrian/cfg/system.config settings mk/config.h
Key value 4: OracleQ (KeyValue ("hadrian/cfg/system.config","project-version"))
Rules may not be recursive
(the first few lines come from debugging output I added -- see my branch)
Do any of you have any idea of where that cycle comes from and why my patch makes this happen? I won't have a lot of time to look into this for the end of this week, just a little bit, so any help would be appreciated =)
Note: I can already confirm we're not paying the boot-up cost anymore, since we're not reading cabal files just to generate the lib rules anymore.
@alpmestan regarding the -c
failure above, I noticed this when I was debugging the Circle CI error, and it appears even in the selftest
target. I don't know much else though...
Yeah, I was hoping my patch would fix it but I'm missing something obviously. If anyone's got some spare time to help me debug/fix this, that'd be lovely!
My patch got submitted and merged in #571. Shall we close this @angerman?
Sounds good!
As noted in this comment, we currently generate specific rules for all known libraries. We would be better off to just build generic rules and then parse the details from the path.
That resolves on the one hand the need to know everything before hand to generate all specific rules, and I would expect this to also speedup the startup process of hadrian.
So what do we need? We'll need functions to extract at least:
from a path.
Example:
right now we generate
where
root
is aFilePath
computed frombuildRootRules
, and we also have:As such we basically want to match something like:
And use a pattern like
here. And then extract the details from
a
andfail
if it's not a match.Additional notes
maybe we want to reuse some URL-matcher/router from one of the web-frameworks to extract the details easily and not write the logic by hand, while using an already performance tuned library?