Closed achudnov closed 4 years ago
Yes, that would be great. I wanted to do that for the very beginning, but at the time Data.Algorithm.DiffOutput
didn't exist.
Now it's just about building the right interface. I don't think we can deprecate the external diff version. So that means adding two more functions to Test.Tasty.Golden
.
How to name them? I'm inclined to rename the existing functions to goldenVsFileDiffExternal
and call the new functions goldenVsFileDiff
. That's because using the internal diff should be the default, unless the user has some special requirements. The downside is that this will break existing code. Any better ideas?
The downside is that this will break existing code.
Fine with me as long as the major version is bumped.
Any better ideas?
Well, I'm not sure if it's "better" idea or not, and it's definitely a matter of taste. But, Test.Tasty.Golden
exposes a few functions that are just minor variations of each other, which seems redundant. To me, something like what I wrote below seems like a cleaner interface. This is just a quick sketch, so it could and should be refined. But, I hope, you get the idea.
goldenTest :: TestName
-> FilePath -- ^ golden file
-> TestProducer -- ^ how to produce the test output
-> Diff -- ^ whether and how to perform the diff
-> TestTree
goldenTest name golden test diff = ...
data TestProducer = ProducerIO (IO ByteString)
| ProducerFile FilePath (IO ())
data Diff = DiffNone | Diff | DiffExternal (FilePath -> FilePath -> [String])
One could go even further and add Default
instances for TestProducer
and Diff
.
The whole point of the functions in Test.Golden
is to provide an interface which is very easy to use in your test suite. It's kind of a simple DSL. Otherwise, you can simply use goldenTest
from Test.Golden.Advanced
which covers all imaginable cases.
Yes, but goldenTest
from Advanced
is a bit too low-level. IMO, an API consisting of many functions that differ only in minor details is more confusing than what I've suggested. But this is your package, so it's up to you. Let me know if you still would like to have a pull request with the implementation using Diff
.
IMO, an API consisting of many functions that differ only in minor details is more confusing than what I've suggested.
I agree. What I'm saying is, you should look at this more like at a DSL than an API. A testing function will probably be called many times, and it's inconvenient to have this ProducerFile
wrapping on every line.
You could define aliases locally in every test suite, but I don't want that to be necessary.
Of course, I don't want too many variations, but in this case I'm afraid we have to add a couple more.
If this was a library API, however, then I would totally support your solution.
Let me know if you still would like to have a pull request with the implementation using
Diff
.
Sure!
As it stands now, I'll need to add utf8-string
as an extra dependency to convert from ByteString
to String
, because diff pretty-printing in Diff
only works with String
s right now. As I have also noted in the retracted pull request, I expect the ByteString -> String
conversion to be undesirable. However, that's the best we could do until Diff
is patched to allow ByteStrings
in ppDiff
. Please, let me know if you'd like me to submit a PR anyway, or if you'd like to wait until someone patches Diff
to work with ByteStrings
.
EDIT: Another solution would, of course, be to use String
IO instead of ByteString
IO.
While utf8-string
indeed does «conversion» from ByteString
to String
, it is the wrong type of conversion, semantically. It can only decode utf-8 encoded text, while we need it to handle arbitrary bytestrings.
The right kind of conversion would be to escape non-printable bytes using e.g. showLitChar
from GHC.Show
. This would be inconvenient for non-ascii text golden files, but I expect it to be a rare use case (and then you can still use lower-level interface to implement the desired semantics).
The right kind of conversion would be to escape non-printable bytes using e.g. showLitChar from GHC.Show. This would be inconvenient for non-ascii text golden files, but I expect it to be a rare use case (and then you can still use lower-level interface to implement the desired semantics).
So, would you like a pull request with the code that uses that approach? I think that a better solution would be to patch Diff to support ByteStrings, but I don't have time to do this. In fact I have already implemented the same thing for my project, but using String
IO instead of the ByteString
one.
So, would you like a pull request with the code that uses that approach?
Yes please.
I think that a better solution would be to patch Diff to support ByteStrings
It's the question of the semantics, not data representation. (Conversion between data structures doesn't cost you anything in this context.) Do you think a better solution would be to build escaping into the Diff library? I don't know, and don't want to argue about that — that's what the Diff maintainers are for :-) Regardless, I feel that in this case escaping is what needs to be done.
Is anyone working on this? Portable diff is a pretty important feature for a golden test framework.
Not as far as I can tell. Feel free to dig in.
I'm currently using Data.Algorithm.DiffContext
and Text.PrettyPrint
to display diffs, something along these lines (taken out of context but you get the idea):
diff :: BytesString -> ByteString -> String
diff x y = render $ prettyContextDiff
(text golden)
(text "test output")
(text . BS.unpack)
(getContextDiff 3 (BS.lines x) (BS.lines y))
I think a question before was about converting ByteString
to String
in order to display it. Currently goldenVsString
uses show
but it is not always the best choice because it will escape "
for example. I'm not sure there is one covertion that is always right. What do you think about leaving it to the user just like prettyContextDiff
does?
goldenVsStringDiff
:: TestName
-> FilePath
-> IO ByteString
-> (ByteString -> Doc)
-> TestTree
It could be left for the user as long as we provide a sensible function to plug in there -- something along the lines that I described above. show is not a sensible function in this context.
Simply BS.unpack
should work for ASCII text files (using ByteString.Char8
). For files in specific encoding there are several ByteString -> Text
functions that could be used before converting Text
to String
.
I was talking about binary files.
The diff is line-wise so it doesn't make sense for binary files anyways.
Ok, fair enough.
Just a side note, currently if you use ["colordiff", "-u", ref, new]
as your diff command you get a nice colorized diff in the test output :smile:. It's indented which looks a bit weird, but otherwise it looks really good.
I'm interested in picking this up. @sapek I notice you have some commits in your fork moving towards this - do you mind if I build off them?
That would be great. I haven't tried my changes with the current code but hopefully they are still somewhat relevant.
There's a native Haskell implementation of the diff algorithm in package Diff. For portability, it would be great to be able to use that implementation instead of relying on external tools. If you're interested, I can submit a pull request with an implementation.