Open harpocrates opened 6 years ago
I think we will want to fix this by explicitly serializing location information for these "special" Name
s. I've tried this out in https://github.com/harpocrates/ghc/commit/07f6785525d0bdfbd1b5350a89080db5e0ded111 and it definitely helps preserve locations for Haddock. The problem is that this is going to change error messages / GHCi output.
Before:
Prelude> instance Eq Int where {}
<interactive>:1:10: error:
Duplicate instance declarations:
instance Eq Int -- Defined at <interactive>:1:10
instance Eq Int -- Defined in ‘GHC.Classes’
Prelude>
After:
Prelude> instance Eq Int where {}
<interactive>:1:10: error:
Duplicate instance declarations:
instance Eq Int -- Defined at <interactive>:1:10
instance Eq Int -- Defined at GHC/Classes.hs:256:10
Prelude>
Maybe the fix is just going to be to patch pprDefinedAt
. Nothing I've tried has really left me satisfied though...
Actually, it looks like we might need even more SrcSpan
s: the Missing documentation for:
warnings are also suffering from missing locations.
Thanks for making a proper issue!
What do we use the SrcSpan in DocInstance for? If it's for some heuristic, maybe we can encode the necessary information in some other way?
Actually, it looks like we might need even more SrcSpans: the Missing documentation for: warnings are also suffering from missing locations.
Right. I had recorded that issue in https://github.com/sjakobi/haddock/issues/4. I don't think this is too grave though. We can still output in which module the docs are missing.
@alexbiehl's idea was to let only the extended .hie
files encode locations. We can probably make it so that when you use --hyperlinked-source
, warnings are output with source locations.
What do we use the SrcSpan in DocInstance for? If it's for some heuristic, maybe we can encode the necessary information in some other way?
The SrcSpan
is the key for InstMap
. The SrcSpan
uniquely identifies the instance. For this case at least, it seems like serializing instance location information in .hi
files is the easiest and fastest approach.
For pretty much everything else, putting locations in .hie
files sounds like a great idea.
Would you mind putting https://github.com/harpocrates/ghc/commit/07f6785525d0bdfbd1b5350a89080db5e0ded111 on Phabricator, separately from D5047, Alec?
It would cause so much churn in the testsuites that I'd rather not mix it with the rest of the Hi Haddock changes.
I'm gonna try fiddling around with pprDefinedAt
a bit more first, but then I'll put it up.
I'm sorry I've let this lag for so long. I've just got a hunch that perhaps this is all avoidable. I just need to find a couple hours to load back into memory the end-to-end story for why we need SrcSpan
to convince myself this really is necessary (or find an alternate path).
Do holler if this is blocking.
I finally got several consecutive hours to think this through properly. Apologies for the block of text: I want to avoid forgetting and re-discovering this again later.
Unlike other declarations, Haddock never looks at InstD
(or DerivD
). Instead, it queries GHC to get complete instance lists (via attachInstances
, md_fam_insts
, and md_insts
/modInfoInstances
) in the form of [FamInst]
and [ClsInst]
.
GHC's FamInst
and ClsInst
conveniently have unique names (that look like $fEqBool
), which gives Haddock names under which to key docs attached to instances. What is missing is a way to match up InstD
/DerivD
’s and FamInst
/ClsInst
's so that we can get docs (which are matched up to InstD
/DerivD
's) associated with the right FamInst
/ClsInst
.
The invariant around SrcSpan
(see Note [Haddock assumptions]
in GHC) is a way around that: it is something that FamInst
/ClsInst
promise to uphold so that they can be easily matched up with the InstD
/DerivD
’s from which they came.
FamInst
/ClsInst
to serialize location info (for Hi Haddock)We don’t need ClsInst
/FamInst
to actually serialize the location info - we just need the invariant to hold for ClsInst
/FamInst
from the module we are currently compiling.
We can get rid of the instanceMap
on the Haddock side completely (and Haddock-side calls to getInstLoc
in the process) - it is used only in mkVisibleNames
and I’m 90% sure that code path is unreachable. (I tried replacing instanceMap
with error “UNEEDED”
and nothing crashed - more testing/reasoning is needed here).
All this to say: my proposed GHC change from https://github.com/haskell/haddock/issues/921#issuecomment-416278563 is missing the point.
FamInst
/ClsInst
though...The code in the XHTML backend uses some locations to generate links to source pages which may or may not end up being generated by the hyperlinker backend (so we shouldn't count on .hie
files to save us).
I propose that we add one extra field into the GHC-side data Docs
type: a decl_locs :: Map Name RealSrcSpan
for putting the locations of (at least)
I think this will solve both the initial bug for which I opened this issue and the open TODO about splices.
How does all this sound?
This sounds good to me even though my understanding of the instance handling is rather fuzzy.
In fact we already used to serialize this data before we (mistakenly) decided that we'd only the location info from .hie
-files. On the GHC side I found this commit that added a bit more info than we now need.
EDIT: For the Haddock side I found https://github.com/sjakobi/haddock/commit/d3de1a5416b9251590d7d168622881b3064b9fe2. https://github.com/sjakobi/haddock/commit/811a57 might also be interesting.
If there is already a '.hi' file present for a module we are haddock-ing, none of the instances in that module will have any location information.
I initially pointed this out in https://github.com/haskell/haddock/pull/906#issuecomment-414787805. I'm making a full blown issue for this because it is turning out to be a bit nastier than I expected.
The third field of
DocInstance
(theLocated (IdP name)
) is where we have a disappearing location:https://github.com/haskell/haddock/blob/133e9c2c168db19c1135479f7ab144c4e33af2a4/haddock-api/src/Haddock/Types.hs#L392
This is supposed to glean its location off of
dfun_id
(orco_ax_name
for type families, although that path is not well documented). There is aNote [Haddock assumptions]
referenced from that field:However, when serializing
Name
s in interface files, GHC discards all locations. When reading the interface back out, the location holes get filled withnoSrcSpan
. All that to say that this note no longer holds if the class instance was read from an interface file.