Use Diff instead of (or in addition to) the external diff command

achudnov commented 11 years ago

There's a native Haskell implementation of the diff algorithm in package Diff. For portability, it would be great to be able to use that implementation instead of relying on external tools. If you're interested, I can submit a pull request with an implementation.

UnkindPartition commented 11 years ago

Yes, that would be great. I wanted to do that for the very beginning, but at the time Data.Algorithm.DiffOutput didn't exist.

Now it's just about building the right interface. I don't think we can deprecate the external diff version. So that means adding two more functions to Test.Tasty.Golden.

How to name them? I'm inclined to rename the existing functions to goldenVsFileDiffExternal and call the new functions goldenVsFileDiff. That's because using the internal diff should be the default, unless the user has some special requirements. The downside is that this will break existing code. Any better ideas?

achudnov commented 11 years ago

The downside is that this will break existing code.

Fine with me as long as the major version is bumped.

Any better ideas?

Well, I'm not sure if it's "better" idea or not, and it's definitely a matter of taste. But, Test.Tasty.Golden exposes a few functions that are just minor variations of each other, which seems redundant. To me, something like what I wrote below seems like a cleaner interface. This is just a quick sketch, so it could and should be refined. But, I hope, you get the idea.

goldenTest :: TestName
           -> FilePath -- ^ golden file
           -> TestProducer -- ^ how to produce the test output
           -> Diff -- ^ whether and how to perform the diff
           -> TestTree
goldenTest name golden test diff = ...

data TestProducer = ProducerIO (IO ByteString)
                            |  ProducerFile FilePath (IO ())

data Diff = DiffNone | Diff | DiffExternal (FilePath -> FilePath -> [String])

One could go even further and add Default instances for TestProducer and Diff.

UnkindPartition commented 11 years ago

The whole point of the functions in Test.Golden is to provide an interface which is very easy to use in your test suite. It's kind of a simple DSL. Otherwise, you can simply use goldenTest from Test.Golden.Advanced which covers all imaginable cases.

achudnov commented 11 years ago

Yes, but goldenTest from Advanced is a bit too low-level. IMO, an API consisting of many functions that differ only in minor details is more confusing than what I've suggested. But this is your package, so it's up to you. Let me know if you still would like to have a pull request with the implementation using Diff.

UnkindPartition commented 11 years ago

IMO, an API consisting of many functions that differ only in minor details is more confusing than what I've suggested.

I agree. What I'm saying is, you should look at this more like at a DSL than an API. A testing function will probably be called many times, and it's inconvenient to have this ProducerFile wrapping on every line.

You could define aliases locally in every test suite, but I don't want that to be necessary.

Of course, I don't want too many variations, but in this case I'm afraid we have to add a couple more.

If this was a library API, however, then I would totally support your solution.

Let me know if you still would like to have a pull request with the implementation using Diff.

Sure!

achudnov commented 11 years ago

As it stands now, I'll need to add utf8-string as an extra dependency to convert from ByteString to String, because diff pretty-printing in Diff only works with Strings right now. As I have also noted in the retracted pull request, I expect the ByteString -> String conversion to be undesirable. However, that's the best we could do until Diff is patched to allow ByteStrings in ppDiff. Please, let me know if you'd like me to submit a PR anyway, or if you'd like to wait until someone patches Diff to work with ByteStrings.

EDIT: Another solution would, of course, be to use String IO instead of ByteString IO.

UnkindPartition commented 11 years ago

While utf8-string indeed does «conversion» from ByteString to String, it is the wrong type of conversion, semantically. It can only decode utf-8 encoded text, while we need it to handle arbitrary bytestrings.

The right kind of conversion would be to escape non-printable bytes using e.g. showLitChar from GHC.Show. This would be inconvenient for non-ascii text golden files, but I expect it to be a rare use case (and then you can still use lower-level interface to implement the desired semantics).

achudnov commented 11 years ago

The right kind of conversion would be to escape non-printable bytes using e.g. showLitChar from GHC.Show. This would be inconvenient for non-ascii text golden files, but I expect it to be a rare use case (and then you can still use lower-level interface to implement the desired semantics).

So, would you like a pull request with the code that uses that approach? I think that a better solution would be to patch Diff to support ByteStrings, but I don't have time to do this. In fact I have already implemented the same thing for my project, but using String IO instead of the ByteString one.

UnkindPartition commented 11 years ago

So, would you like a pull request with the code that uses that approach?

Yes please.

I think that a better solution would be to patch Diff to support ByteStrings

It's the question of the semantics, not data representation. (Conversion between data structures doesn't cost you anything in this context.) Do you think a better solution would be to build escaping into the Diff library? I don't know, and don't want to argue about that — that's what the Diff maintainers are for :-) Regardless, I feel that in this case escaping is what needs to be done.

sapek commented 9 years ago

Is anyone working on this? Portable diff is a pretty important feature for a golden test framework.

UnkindPartition commented 9 years ago

Not as far as I can tell. Feel free to dig in.

sapek commented 9 years ago

I'm currently using Data.Algorithm.DiffContext and Text.PrettyPrint to display diffs, something along these lines (taken out of context but you get the idea):


diff :: BytesString -> ByteString -> String
diff x y = render $ prettyContextDiff
                            (text golden)
                            (text "test output")
                            (text . BS.unpack)
                            (getContextDiff 3 (BS.lines x) (BS.lines y))

I think a question before was about converting ByteString to String in order to display it. Currently goldenVsString uses show but it is not always the best choice because it will escape " for example. I'm not sure there is one covertion that is always right. What do you think about leaving it to the user just like prettyContextDiff does?


goldenVsStringDiff
    :: TestName 
    -> FilePath 
    -> IO ByteString    
    -> (ByteString -> Doc)
    -> TestTree

UnkindPartition commented 9 years ago

It could be left for the user as long as we provide a sensible function to plug in there -- something along the lines that I described above. show is not a sensible function in this context.

sapek commented 9 years ago

Simply BS.unpack should work for ASCII text files (using ByteString.Char8). For files in specific encoding there are several ByteString -> Text functions that could be used before converting Text to String.

UnkindPartition commented 9 years ago

I was talking about binary files.

sapek commented 9 years ago

The diff is line-wise so it doesn't make sense for binary files anyways.

UnkindPartition commented 9 years ago

Ok, fair enough.

ssadler commented 7 years ago

Just a side note, currently if you use ["colordiff", "-u", ref, new] as your diff command you get a nice colorized diff in the test output :smile:. It's indented which looks a bit weird, but otherwise it looks really good.

michaelpj commented 6 years ago

I'm interested in picking this up. @sapek I notice you have some commits in your fork moving towards this - do you mind if I build off them?

sapek commented 6 years ago

That would be great. I haven't tried my changes with the current code but hopefully they are still somewhat relevant.

UnkindPartition / tasty-golden

Use Diff instead of (or in addition to) the external diff command #2