haskell / haddock

Haskell Documentation Tool
www.haskell.org/haddock/
BSD 2-Clause "Simplified" License
361 stars 241 forks source link

LINE pragmas cause missing documentation #441

Open mjakob opened 9 years ago

mjakob commented 9 years ago

Dear Haddock developers,

I ran into a problem with Haddock while using c2hs, but I think this issue can be applicable for any pre-processor using LINE pragmas. I managed to simplify the issue to the following: say we have a source file LinePragmaIssue.chs and we run a pre-processor on it to produce LinePragmaIssue.hs:

{-# LINE 1 "LinePragmaIssue.chs" #-}
-- |Module LinePragmaIssue
module LinePragmaIssue ( B ) where
-- This and the following two lines were generated by the pre-processor:
data A = SomethingLongerThanTheAnnotationForB

{-# LINE 4 "LinePragmaIssue.chs" #-}
-- |Data type B.
data B = B

If Haddock is applied to this file, the output is:

$ haddock LinePragmaIssue.hs
Haddock coverage:
  50% (  1 /  2) in 'LinePragmaIssue'
  Missing documentation for:
    B (LinePragmaIssue.chs:5)

Since the (generated) line data A = SomethingLongerThanTheAnnotationForB has the same line number as the Haddock annotation for B when the LINE pragmas are taken into account and since the declaration is longer, it is sorted after the latter, i.e., the sorted document structure that Haddock uses is:

-- |Data type B.
data A = SomethingLongerThanTheAnnotationForB
data B = B

This results in B being undocumented, which was unexpected, at least to me.

If this is considered a bug, one solution I can think of is to ignore the LINE pragmas while matching comments to declarations, but this might be built into the GHC API and perhaps cause other problems. I couldn't find any related issues in this tracker, but I appologize if I missed them.

Specs according to the guidelines:

Regards, Mattias

Fuuzetsu commented 9 years ago

We don't do any processing of the files ourselves which means either c2hs has to be taught about using LINE better in presence of Haddock comments or GHC to do the thing you expect here. Either way, the ticket should probably go into an issue tracker for one of those projects. Personally I'd file it on c2hs and see what they say. If it is GHC that learns to do this and we need to adapt in how we call it, then it becomes our concern.

Let me know if you disagree and still think it's our ticket.

mjakob commented 9 years ago

My initial assumption was that this was a problem with c2hs, but when I looked into it I became convinced that this was the right place to create the issue. (Actually, my initial assumption was that there was something wrong with my source code, but then I remembered that I don't make mistakes ;))

The GHC User Guide hints that LINE pragmas are mainly for creating better error messages in generated code and hence I don't see anything wrong with the output of c2hs or my simplified example above. If there is a straight-forward way to generate code with LINE pragmas that solves this problem, I can definitely create an issue at c2hs, but right now I wouldn't know what to put in such an issue.

If I managed to identify the function that extracts and matches the declarations and Haddock annotations, it is this one in Haddock.Interface.Create:

-- | The top-level declarations of a module that we care about,
-- ordered by source location, with documentation attached if it exists.
topDecls :: HsGroup Name -> [(LHsDecl Name, [HsDocString])]
topDecls = filterClasses . filterDecls . collectDocs . sortByLoc . ungroup

The sorting appears to be done by:

-- | Sort by source location
sortByLoc :: [Located a] -> [Located a]
sortByLoc = sortBy (comparing getLoc)

which uses getLoc from the GHC API. I can't find any information that this function shouldn't return a LINE pragma aware location, but the documentation is a bit scarce here. If this is the case, however, this should be reported at GHC, but once again, without anything to point at, I don't know what I would put in that issue. I confirmed with some primitive debug output that the source locations returned indeed take into account the LINE pragmas.

Hence, to me, the problem seems to be that Haddock uses the LINE pragma aware source locations for matching declarations with comments, which may not be its intended use. I don't know if it's possible to get LINE pragma naive (I'm making these terms up as I go along!) source locations from GHC, otherwise the solution might be quite complicated.

If I'm mistaking somewhere in my reasoning, please enlighten me and I will stop wasting your time!

l29ah commented 4 years ago

I think i observe the same problem with hsc2hs: when generating documentation from the following piece of (pre-processed) haskell source

{-# LINE 22 "Graphics/Framebuffer/Internal.hsc" #-}

foreign import ccall "ioctl" c'ioctl
  :: Fd -> CInt -> Ptr C'fb_var_screeninfo -> IO CInt
foreign import ccall "&ioctl" p'ioctl
  :: FunPtr (Fd -> CInt -> Ptr C'fb_var_screeninfo -> IO CInt)

{-# LINE 24 "Graphics/Framebuffer/Internal.hsc" #-}

-- | Asks the properties of the supplied framebuffer device
getVarScreenInfo :: Handle -> IO C'fb_var_screeninfo
getVarScreenInfo h = do
    fd <- handleToFd h
    alloca $ \ptr -> c'ioctl fd c'FBIOGET_VSCREENINFO ptr >> peek ptr

Haddock attributes the -- | comment to p'ioctl, while it should have attributed it to getVarScreenInfo instead. Removing the

{-# LINE 24 "Graphics/Framebuffer/Internal.hsc" #-}

line generated by hsc2hs makes haddock attribute the comment correctly. Also, removing empty lines from the .hsc source /sometimes/ makes the comment assigned correctly, and sometimes to another entity, even though in both cases the LINE pragma points at the comment.

The example is on hackage: https://hackage.haskell.org/package/linux-framebuffer-0/docs/Graphics-Framebuffer-Internal.html#v:p-39-ioctl

l29ah commented 4 years ago

Located isn't even Show T_T