plow-technologies / inferno

A statically-typed functional scripting language
MIT License
4 stars 1 forks source link

Fix VC history operation and add VC tests #49

Closed siddharth-krishna closed 1 year ago

siddharth-krishna commented 1 year ago

This PR fixes three bugs in the fetchVCObjectHistory operation:

The first is in the order of the history of a VC object. Currently, given a script with 4 versions s1 -> s2 -> s3 -> s4, fetchVCObjectHistory returns [s3, s2, s1, s4]. This is because we store history in oldest-to-newest order in file, but the following line adds head_h (i.e. the latest version s4) to the wrong end of the history list: https://github.com/plow-technologies/inferno/blob/73c59f13b4070323946cc370015e059eb043d2ab/inferno-vc/src/Inferno/VersionControl/Operations.hs#L339-L343 (the foldM below then reverses the order, but with s4 still in the wrong place)

Even after fixing the above, fetchVCObjectHistory incorrectly adds a second source script in the case of a script that is cloned twice in succession. Consider the following history:

mcJDUg CloneOf F6LRpH
4I47d_ CloneOf mcJDUg
cUpepL MarkedBreakingWithPred 4I47d_

In this case, fetchVCObjectHistory cUpepL returns [F6LRpH, mcJDUg, 4I47d_, cUpepL] (after the above fix) but it should return [mcJDUg, 4I47d_, cUpepL]. Interestingly, the above bug hides this bug in production right now because because history = [cUpepL, mcJDUg, 4I47d_] (incorrect), so metas = [4I47d_, mcJDUg, cUpepL] and so when it finds that 4I47d_, is a clone it adds original = mcJDUg and returns nubBy [mcJDUg, 4I47d_, mcJDUg, cUpepL] = [4I47d_, mcJDUg, cUpepL]. But once I fix history = [cUpepL, 4I47d_, mcJDUg], then metas = [mcJDUg, 4I47d_, cUpepL] and original = F6LRpH and the code returns [F6LRpH, mcJDUg, 4I47d_, cUpepL]. This is incorrect because we only want to add on the original version of the first cloning operation when going back in history.

While fixing this second bug, I also noticed that nubBy (a linear operation) is unnecessary, as one can simply recurse through history and stop when one sees the first cloning operation. I thus simplified the fold function and the large case-split into a single recursive function getHistory that walks through history newest-to-oldest, fetches the VC objects, and stops when it sees a cloning operation.

NOTE: This PR requires a code change downstream. There is a third issue with the current system: since the frontend requires the history in newest-to-oldest order, but the current code returns the order [s3, s2, s1, s4], OnPing is reversing the order it gets from the VC to put s4 first. This PR fixes the VC to return history in newest-to-oldest order, so OnPing should not reverse the order that it gets.

I also benchmarked a few different ways of doing this operation (see below) and found that the fastest way would be to use a recursive function instead of a fold, and return the history in newest-to-oldest order.

Test cases:

// iPFmvYX2GLvC9LYLxRW0HogFrhw188s7bwMguELJ0_c=
KyZIcL MarkedBreakingWithPred tHV1iC
wwWlxB CloneOf KyZIcL
_9eHVU MarkedBreakingWithPred wwWlxB
Vbke7C MarkedBreakingWithPred _9eHVU
iPFmvY MarkedBreakingWithPred Vbke7C
// Expected: [KyZIcL, wwWlxB, _9eHVU, Vbke7C, iPFmvY]

// 0yfJRn3k_14jJWPVMjFEBTHwgnYyBcgTJYU4ThJqHtM=
Rxq6af CloneOf F0j9ZH
26Brml MarkedBreakingWithPred Rxq6af
t5jCXh MarkedBreakingWithPred 26Brml
0yfJRn MarkedBreakingWithPred t5jCXh
// Expected: [F0j9ZH, Rxq6af, 26Brml, t5jCXh ,0yfJRn]

// cUpepLZIHur8wRKkx5Wc8IrvfLmNl0V_pnz9C2T9l54=
mcJDUg CloneOf F6LRpH
4I47d_ CloneOf mcJDUg
cUpepL MarkedBreakingWithPred 4I47d_
// Expected: [mcJDUg, 4I47d_, cUpepL]

I tested the new code on the above histories (heads/ files) and it returned the correct responses.

Benchmarking different ways of recursing through history

module Main where

import Data.Foldable (foldrM)
import System.Environment (getArgs)

_foldrM :: Int -> IO ()
_foldrM n = do
  let l1 = [1 .. n]
  let l2 = l1 ++ [n + 1]
  l3 <- foldrM process [] l2
  print $ sum $ reverse l3
  where
    process x acc = do
      if x > 5
        then do
          -- print x
          pure $ x : acc
        else pure acc

_rec :: Int -> IO ()
_rec n = do
  let l1 = [1 .. n]
  let l2 = n + 1 : reverse l1
  l3 <- processRec [] l2
  print $ sum $ reverse l3
  where
    processRec acc (x : xs) = do
      if x > 5
        then do
          -- print x
          processRec (x : acc) xs
        else pure acc
    processRec acc [] = pure acc

_rec2 :: Int -> IO ()
_rec2 n = do
  let l1 = [1 .. n]
  let l2 = n + 1 : reverse l1
  l3 <- processRec l2
  print $ sum l3
  where
    processRec (x : xs) = do
      if x > 5
        then do
          -- print x
          res <- processRec xs
          pure $ x : res
        else pure []
    processRec [] = pure []

{-
Run as: <exe> <method> <n>

I benchmarked with:
hyperfine -L n foldrM,rec,rec2 'cabal run inferno {n} 50000000'

And rec2 was 1.2x faster than the other 2 methods.
-}
main :: IO ()
main = do
  getArgs >>= \case
    [m, n'] -> do
      let n = read n'
      case m of
        "foldrM" -> _foldrM n
        "rec" -> _rec n
        "rec2" -> _rec2 n
        _ -> undefined
    _ -> error "expected args: method n"
siddharth-krishna commented 1 year ago

Thanks for the review! I noticed we don't have tests for inferno-vc anywhere so I added some simple tests. Hope it catches small bugs in the future.