ndmitchell / shake

Shake build system
http://shakebuild.com
Other
772 stars 118 forks source link

Why does my rule behave different from an oracle? #241

Open nh2 opened 9 years ago

nh2 commented 9 years ago

I have the following, with question at the end:

{-# LANGUAGE DeriveDataTypeable, GeneralizedNewtypeDeriving, MultiParamTypeClasses #-}
{-# OPTIONS_GHC -Wall #-}

import           Control.Applicative
import           Development.Shake
import           Development.Shake.Classes
import           Development.Shake.FilePath
import           Development.Shake.Rule
import           System.Directory (createDirectoryIfMissing)
import           System.Process

newtype GetNonHiddenFilesQ = GetNonHiddenFilesQ (FilePath, String) deriving (Show, Typeable, Eq, Hashable, Binary, NFData)
newtype GetNonHiddenFilesA = GetNonHiddenFilesA [FilePath] deriving (Show, Typeable, Eq, Hashable, Binary, NFData)

instance Rule GetNonHiddenFilesQ GetNonHiddenFilesA where
  -- storedValue _ (GetNonHiddenFilesQ (topDir, findPattern)) = Just . GetNonHiddenFilesA <$> findNonHiddenFilesIO topDir findPattern
  storedValue _ _ = return Nothing

getNonHiddenFilesRule :: Rules ()
getNonHiddenFilesRule = rule $ \(GetNonHiddenFilesQ (topDir, findPattern)) ->
  Just (GetNonHiddenFilesA <$> liftIO (findNonHiddenFilesIO topDir findPattern))

getNonHiddenFiles :: FilePath -> String -> Action [FilePath]
getNonHiddenFiles topDir findPattern = do
  GetNonHiddenFilesA files <- apply1 $ GetNonHiddenFilesQ (topDir, findPattern)
  return files

-- | Similar to `getDirectoryFiles`, but does not include hidden files
-- (starting with dot) or files in hidden directories, and does not
-- call `need`. Uses @find@ internally.
findNonHiddenFilesIO
  :: FilePath -- ^ dir fom which to search ("." or "" for current dir)
  -> String   -- ^ pattern to pass to @find@ (used after @-name@)
  -> IO [FilePath]
findNonHiddenFilesIO topDir findPattern = do
  let pat = if findPattern == "" then "." else findPattern
  out <- readProcess
    "find"
    [ topDir, "-type", "d", "-path", "*/\\.*"
    , "-prune", "-o"
    , "-not", "-name", ".*", "-type", "f", "-path", pat, "-print"
    ]
    ""
  return (map normalise $ lines out)

-- For oracle.
newtype GetNonHiddenFiles = GetNonHiddenFiles (FilePath, String) deriving (Show, Typeable, Eq, Hashable, Binary, NFData)

main :: IO ()
main = do
  createDirectoryIfMissing True "oracle"
  createDirectoryIfMissing True "rule"

  shakeArgs shakeOptions
    { shakeLint = Just LintBasic
    } $ do

    getNonHiddenFilesRule

    -- Oracle for `getNonHiddenFiles`.
    askNonHiddenFilesOracle <- addOracle $ \(GetNonHiddenFiles (topDir, findPattern)) -> do
      liftIO $ findNonHiddenFilesIO topDir findPattern

    -- Similar to `getDirectoryFiles`, but ignores hidden files/dirs.
    -- Uses @find@ inside.
    let getNonHiddenFilesO :: FilePath -> String -> Action [FilePath]
        getNonHiddenFilesO topDir findPattern = do
          askNonHiddenFilesOracle $ GetNonHiddenFiles (topDir, findPattern)

    -- Running prints:
    --   Build completed in 0:01m
    phony "testOracle" $ do
      _ <- getNonHiddenFilesO "oracle" "*"
      writeFile' "oracle/myfile" "x"
      _ <- getNonHiddenFilesO "oracle" "*"
      return ()

    -- Running prints:
    --   Lint checking error - value has changed since being depended upon:
    --     Key:  GetNonHiddenFilesQ ("rule","*")
    --     Old:  GetNonHiddenFilesA []
    --     New:  <missing>
    phony "testRule" $ do
      _ <- getNonHiddenFiles "rule" "*"
      writeFile' "rule/myfile" "x"
      _ <- getNonHiddenFiles "rule" "*"
      return ()

Why is that?

It seems to me that what I do with GetNonHiddenFilesQ/A is exactly what Development.Shake.Rules.Oracle does (having storedValue _ _ = return Nothing).

What am I missing?

ndmitchell commented 9 years ago

Does the testRule phony have a problem if you skip the writeFile'? My guess is that is a red herring. I suspect the key line you are missing is https://github.com/ndmitchell/shake/blob/master/src/Development/Shake/Special.hs#L13 - there is a hack to say that oracles always rebuild, and that is fine. If that turns out to be the issue, I can add an extra method to Rule that returns whether a rule should be ignored for lint checking.

In general, if you depend on a value, and then something else in the build causes it to change value, that would be bad. In this case, I'd also suggest push the find call into storedValue, and not returning Nothing, which should fix it as well. Was there a reason for wanting to pull it out and do it later?

nh2 commented 9 years ago

In general, if you depend on a value, and then something else in the build causes it to change value, that would be bad.

I probably wasn't very clear: I precisely desire to get the Lint checking error - value has changed since being depended upon error.

When I switched from an Oracle to my own rule, it suddenly got me those lint errors that I hoped to see, because my build system was wrong.

Your link to Special indeed explains what the difference is.

Now the next question would be: Oracles are much less verbose than custom rules. Would it be possible to have some flavour of oracles that don't always rebuild, or otherwise achieve a conciseness similar to that of oracles for my problem?

ndmitchell commented 9 years ago

Oracles are a bit weird, they seem to be conflating many concepts (caching, always rebuilding, type-directed rules) into one. I do suspect that splitting the pieces up would make them more reusable and less magical. In many ways they are custom rules, and with GHC's new type level literals, it might be more feasible to give enough sugar so that defining real rules is so simple that oracles aren't required.

The lint check is defined to rerun storedValue and see if the value matches. For Oracles, that never matches, as there is no storedValue, and it would be hard to see how you could get one with the current API, unless you modified the approach and went from something other than storedValue.