DataHaskell / dh-core

Functional data science

analyze: generate and check random test data #37

Closed ocramz closed 5 years ago

ocramz commented 5 years ago

Test fixtures (e.g. analyze/test/Fixtures.hs) could be gradually replaced by property tests based on QuickCheck, hedgehog, or genvalidity.

Magalame commented 5 years ago

I think I'd like to try that one too. If I understand correctly, the idea is to replace every unit test with a property test? So in short: write the Arbitrary instances, then the actual tests?
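For reference, a minimal sketch of that workflow with QuickCheck. The Value type, keepCols, and both properties are hypothetical stand-ins written for illustration; they are not the definitions used in analyze.

```haskell
import Test.QuickCheck

-- Hypothetical cell type, loosely modelled on the values printed later in
-- this thread; NOT the definition from the analyze package.
data Value
  = ValueDouble Double
  | ValueText String
  deriving (Eq, Show)

-- Step 1: tell QuickCheck how to generate random cells.
instance Arbitrary Value where
  arbitrary = oneof
    [ ValueDouble <$> arbitrary
    , ValueText <$> listOf (choose ('a', 'z'))
    ]

-- Hypothetical operation under test: keep the columns whose key
-- satisfies a predicate.
keepCols :: (k -> Bool) -> [(k, v)] -> [(k, v)]
keepCols p = filter (p . fst)

-- Step 2: state properties that must hold for every generated input.
prop_keepIdempotent :: [(Int, Value)] -> Bool
prop_keepIdempotent row =
  keepCols even (keepCols even row) == keepCols even row

prop_keepOnlyMatching :: [(Int, Value)] -> Bool
prop_keepOnlyMatching row = all (even . fst) (keepCols even row)

main :: IO ()
main = do
  quickCheck prop_keepIdempotent
  quickCheck prop_keepOnlyMatching
```

Each quickCheck call runs the property against 100 random inputs by default, which matches the "passed 100 tests" lines in the output further down.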

ocramz commented 5 years ago

@Magalame yep!

Magalame commented 5 years ago

@ocramz So I started writing the tests (it's a bit messy, I'll organise it more cleanly later): https://github.com/Magalame/dh-core/blob/replace-fixtures/analyze/test/Spec2.hs

It's my first time really experimenting with QuickCheck and the like, so I was wondering if it looks good so far?

Magalame commented 5 years ago

Almost completely set up; only the oneHot part is missing. I think I might be doing something wrong, because some tests seem to perform poorly, with roughly a 15x slowdown.

ocramz commented 5 years ago

@Magalame um, that sounds strange. Which tests run so slowly?

Magalame commented 5 years ago

When running them, I get this:

Test suite
  Fixture:        OK (0.10s)
    +++ OK, passed 100 tests.
  Row Decode:     OK (0.15s)
    +++ OK, passed 100 tests.
  Drop:           OK (0.11s)
    +++ OK, passed 100 tests.
  Keep:           OK (0.10s)
    +++ OK, passed 100 tests.
  Update Empty:   OK (1.83s) # all tests from here down are slow
    +++ OK, passed 100 tests.
  Update Empty 2: OK (1.67s)
    +++ OK, passed 100 tests.
  Update Add:     OK (1.72s)
    +++ OK, passed 100 tests.
  Update Overlap: OK (1.79s)
    +++ OK, passed 100 tests.
  Take Rows:      OK (0.95s)
    +++ OK, passed 100 tests.
  Add Column:     OK (2.01s)
    +++ OK, passed 100 tests.

All 10 tests passed (10.43s)

I tried to profile it; here's the .prof: https://ufile.io/1lqia. However, it looks a bit cryptic to me.

Magalame commented 5 years ago

I think I'm having trouble dealing with oneHot. Here is an example of what I obtain with stack runghc Spec2.hs:

-----------------------
Original Update: RFrameUpdate {_rframeUpdateKeys = ["","zlh"], _rframeUpdateData = [[ValueDouble (-2.0136650956085296),ValueText "rs"],[ValueDouble 0.19461704868628563,ValueText "mx"]]}
-----
Original data: [[ValueDouble (-2.0136650956085296),ValueText "rs"],[ValueDouble 0.19461704868628563,ValueText "mx"]]
-----
Key to test: ""
-----
True/false value: ValueDouble 0.19461704868628563/ValueDouble (-2.0136650956085296)
-----
Expected result: RFrame {_rframeKeys = ["ValueDouble (-2.0136650956085296)","ValueDouble 0.19461704868628563"], _rframeLookup = fromList [("ValueDouble 0.19461704868628563",1),("ValueDouble (-2.0136650956085296)",0)], _rframeData = [[ValueDouble 0.19461704868628563,ValueDouble (-2.0136650956085296)],[ValueDouble (-2.0136650956085296),ValueDouble 0.19461704868628563]]}
-----
From A.oneHot: RFrame {_rframeKeys = ["zlh","ValueDouble (-2.0136650956085296)","ValueDouble 0.19461704868628563"], _rframeLookup = fromList [("ValueDouble 0.19461704868628563",2),("ValueDouble (-2.0136650956085296)",1),("zlh",0)], _rframeData = [[ValueText "rs",ValueDouble 0.19461704868628563,ValueDouble (-2.0136650956085296)],[ValueText "mx",ValueDouble (-2.0136650956085296),ValueDouble 0.19461704868628563]]}

There seems to be an extra "zlh" column.

Magalame commented 5 years ago

Bump

ocramz commented 5 years ago

Hi @Magalame, I don't have time to look at this right now. Could you see which test is broken and perhaps submit a patch for it?

Magalame commented 5 years ago

Sounds good!

Magalame commented 5 years ago

So, looking at the source code, the problem is the line "update hot cold" in https://github.com/DataHaskell/dh-core/blob/master/analyze/src/Analyze/Ops.hs. It makes oneHot return the one-hot encoding concatenated with the other columns. Is this the actual purpose of oneHot, or should we output the one-hot encoding only?
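To make the two possible behaviours concrete, here is a small sketch on association-list rows. Everything in it is an assumption made for illustration: the Row type, the function names hotOnly and hotWithRest, the "colour" key, and the "1"/"0" markers. It does not use the RFrame API from analyze.

```haskell
import Data.List (nub)

-- Simplified row: (column key, cell value). A stand-in, not analyze's RFrame.
type Row = [(String, String)]

-- Behaviour A: return only the one-hot columns, one per distinct value
-- found under the given key.
hotOnly :: String -> [Row] -> [Row]
hotOnly key rows =
  [ [ (v, if lookup key row == Just v then "1" else "0") | v <- values ]
  | row <- rows
  ]
  where
    values = nub [ v | row <- rows, Just v <- [lookup key row] ]

-- Behaviour B: drop the encoded column, keep every other column, and
-- append the one-hot columns (the extra "zlh" column seen above).
hotWithRest :: String -> [Row] -> [Row]
hotWithRest key rows = zipWith (++) rest (hotOnly key rows)
  where
    rest = map (filter ((/= key) . fst)) rows

example :: [Row]
example =
  [ [("colour", "red"),  ("zlh", "rs")]
  , [("colour", "blue"), ("zlh", "mx")]
  ]

main :: IO ()
main = do
  mapM_ print (hotOnly "colour" example)
  -- [("red","1"),("blue","0")]
  -- [("red","0"),("blue","1")]
  mapM_ print (hotWithRest "colour" example)
  -- [("zlh","rs"),("red","1"),("blue","0")]
  -- [("zlh","mx"),("red","0"),("blue","1")]
```

A property test written against behaviour A fails on an implementation that follows behaviour B, which is the mismatch shown in the "Expected result" and "From A.oneHot" output above.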

Magalame commented 5 years ago

Bump

ocramz commented 5 years ago

Addressed in #53. Thank you @Magalame!

Magalame commented 5 years ago

Thank you for your help!

Magalame commented 5 years ago

NB: The "speed issue" isn't actually an issue. Two things are at play: the inherent slowness of the generators, and laziness. Laziness meant that some tests only partially evaluated the frames. So the slow case is the "normal" case, and the faster tests are fast only because they never fully evaluate the frames. This is easy to check by playing with Mersenne generators and deepseq.
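A base-only sketch of the laziness effect, using plain lists instead of the Vector-based frames. The frame value and the helper names are made up for illustration, and a cell that calls error stands in for a cell that is expensive to generate.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A 3x3 "frame" in which one cell per row blows up if it is ever evaluated.
frame :: [[Double]]
frame = replicate 3 [1.0, error "cell evaluated", 3.0]

-- Looks only at the shape: walks the list spines, never the cells.
shapeOnly :: [[a]] -> (Int, Int)
shapeOnly rows =
  ( length rows
  , case rows of
      []      -> 0
      (r : _) -> length r
  )

-- Needs every cell, so every cell gets evaluated.
fullSum :: [[Double]] -> Double
fullSum = sum . map sum

main :: IO ()
main = do
  print (shapeOnly frame)
  -- prints (3,3): the bad cell was never touched
  r <- try (evaluate (fullSum frame)) :: IO (Either SomeException Double)
  putStrLn (either (const "summing forced a cell") show r)
  -- prints: summing forced a cell
```

A test that compares only keys and dimensions behaves like shapeOnly and skips the cost of producing the cells. A test that compares whole frames behaves like fullSum and pays for every cell.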

ocramz commented 5 years ago

Ah, that's good to know! Would you be able to add some benchmarks showing this? They would be very appreciated ^^


Magalame commented 5 years ago

I came up with this so far. I just took two tests from the property test suite. benchEmpty needs to fully evaluate the data, while benchFixture does not.

module Main where

import qualified Analyze                  as A
import Analyze.RFrame (RFrameUpdate (..))

import qualified Data.Text           as T
import           Data.Text           (Text)

import qualified Data.Vector         as V
import           Data.Vector         (Vector)

import System.Random
import Control.Monad
import Control.DeepSeq

import qualified System.Random.MWC as M

import qualified Criterion.Main as C

n :: Int
n = 1000

testKeys ::  IO (Vector Text)
testKeys =  V.replicateM n $ liftM (T.pack . take 10 . randomRs ('a','z')) newStdGen

testData ::  IO (Vector (Vector Double))
testData =  V.replicateM n $ liftM (V.fromList . take n . randomRs (-1,1)) newStdGen

testDataMersenne :: IO (Vector (Vector Double))
testDataMersenne = do 
    gen <- M.create
    V.replicateM n $ M.uniformVector gen n

benchEmpty :: IO (Vector Text) -> IO (Vector (Vector Double)) -> IO Bool
benchEmpty keysgen datagen = do 

    keysb <- keysgen
    datab <- datagen

    let 
      update = RFrameUpdate keysb datab

    expected <- A.fromUpdate update

    let
      lengthEmpty = length $ A._rframeUpdateKeys update
      emptyUpdate = RFrameUpdate V.empty (V.replicate lengthEmpty V.empty)

    empty <- A.fromUpdate emptyUpdate
    actual <- A.update update empty

    return $ actual == expected

benchFixture :: IO (Vector Text) -> IO (Vector (Vector Double)) -> IO Bool
benchFixture keysgen datagen = do

    keysb <- keysgen
    datab <- datagen

    let
      update = RFrameUpdate keysb datab

    frame <- A.fromUpdate update

    let
      -- keys from both the frame and the update
      keys = A._rframeKeys frame
      keysUp = A._rframeUpdateKeys update

      -- number of rows from both
      nbRows = A.numRows frame
      nbRowsUp = length $ A._rframeUpdateData update

      -- number of columns from both
      nbCols = A.numCols frame
      nbColsUp = length $ A._rframeUpdateKeys update

    -- only keys and dimensions are compared, so the cell values are never forced
    return $ (keys == keysUp) && (nbRows == nbRowsUp) && (nbCols == nbColsUp)

main :: IO ()
main = C.defaultMain
    [ C.bgroup "Tests"
        [ C.bench "empty"          $ C.whnfIO (benchEmpty testKeys testData)
        , C.bench "fixture"        $ C.whnfIO (benchFixture testKeys testData)
        , C.bench "forced fixture" $ C.whnfIO (benchFixture testKeys (fmap force testData))
        , C.bench "mersenne empty" $ C.whnfIO (benchEmpty testKeys testDataMersenne)
        ]
    ]

It gives:

benchmarking Tests/empty
time                 1.363 s    (1.060 s .. 1.800 s)
                     0.989 R²   (0.968 R² .. 1.000 R²)
mean                 1.227 s    (1.143 s .. 1.300 s)
std dev              103.1 ms   (53.77 ms .. 131.9 ms)
variance introduced by outliers: 22% (moderately inflated)

benchmarking Tests/fixture
time                 4.098 ms   (3.607 ms .. 4.527 ms)
                     0.899 R²   (0.844 R² .. 0.938 R²)
mean                 5.786 ms   (5.411 ms .. 6.160 ms)
std dev              1.137 ms   (958.6 μs .. 1.407 ms)
variance introduced by outliers: 87% (severely inflated)

benchmarking Tests/forced fixture
time                 1.182 s    (1.129 s .. 1.259 s)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 1.221 s    (1.201 s .. 1.233 s)
std dev              20.13 ms   (7.737 ms .. 27.70 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Tests/mersenne empty
time                 86.12 ms   (83.85 ms .. 88.79 ms)
                     0.998 R²   (0.995 R² .. 1.000 R²)
mean                 93.49 ms   (90.28 ms .. 99.31 ms)
std dev              6.993 ms   (2.581 ms .. 9.141 ms)
variance introduced by outliers: 19% (moderately inflated)
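
As a rough reading of these numbers, the sketch below computes two ratios from the reported means. Comparing "forced fixture" with "fixture" isolates the effect of laziness (roughly 210x). Comparing "empty" with "mersenne empty" isolates the effect of the generator (roughly 13x). The means table and the ratio helper are written here for illustration only.

```haskell
-- Mean times copied from the Criterion output above, in milliseconds.
means :: [(String, Double)]
means =
  [ ("empty", 1227)
  , ("fixture", 5.786)
  , ("forced fixture", 1221)
  , ("mersenne empty", 93.49)
  ]

-- How many times slower benchmark a is than benchmark b.
ratio :: String -> String -> Maybe Double
ratio a b = (/) <$> lookup a means <*> lookup b means

main :: IO ()
main = do
  -- Same test, same generator; only the forcing differs.
  print (ratio "forced fixture" "fixture")
  -- Same test, fully evaluated in both cases; only the generator differs.
  print (ratio "empty" "mersenne empty")
```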