RFC: Ide test integration

santiweight commented 1 year ago

This is a proposal for standardizing the output of various testing frameworks so that it can be consumed by IDEs that support Haskell. It contains a design for two APIs that test frameworks would provide: getTestTree, which returns the tree of tests in a test suite, and runTest, which runs a test group or test case from that tree.

Feedback is very helpful, especially from testing framework maintainers, so that I can better understand what use cases are most important and we don't miss anything important.

david-christiansen commented 1 year ago

One more question: how do we expect the IDE to get from the current filename to the name of the test suite executable to be run? I know how HLS could do this, but the proposal is written to be independent of HLS as far as I can see.

michaelpj commented 1 year ago

I do think this proposal would be more likely to get adoption if it was more abstract. Sketching some ideas:

class TestSpecification t where
    -- What running a single test produces.
    type TestResult t :: *
    isSuccess :: TestResult t -> Bool
    details :: TestResult t -> Text

    -- How a single test is identified.
    type TestKey t :: *
    enumerateTests :: t -> [TestKey t]
    runTest :: TestKey t -> IO (TestResult t)

-- Hierarchical frameworks identify a test by a non-empty path of fragments.
class (TestSpecification t, TestKey t ~ NonEmpty (Fragment t)) => HierarchicalTestSpecification t where
    type Fragment t :: *

type TestReport t = Map (TestKey t) (TestResult t)

-- Ord is needed to use the keys in a Map.
produceTestReport :: (TestSpecification t, Ord (TestKey t)) => t -> IO (TestReport t)
produceTestReport t = fmap fromList $ for (enumerateTests t) $ \key -> do
    report <- runTest key
    pure (key, report)

This is 5 minutes of thinking, but I think this would give people a lot of flexibility.
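A minimal compilable variant of the sketch above, under stated assumptions: every method takes the suite value `t` explicitly so that GHC can resolve the associated types, `String` stands in for `Text`, and `FlatSuite` and `exampleSuite` are hypothetical names for a toy framework with no hierarchy. It is an illustration of the idea, not a proposed final API.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}

import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)
import Data.Traversable (for)

-- Assumption: the suite value is passed to every method to avoid ambiguous types.
class TestSpecification t where
  type TestKey t
  type TestResult t
  enumerateTests :: t -> [TestKey t]
  runTest        :: t -> TestKey t -> IO (TestResult t)
  isSuccess      :: t -> TestResult t -> Bool
  details        :: t -> TestResult t -> String

-- Hypothetical toy framework: a flat list of named checks.
newtype FlatSuite = FlatSuite [(String, IO Bool)]

instance TestSpecification FlatSuite where
  type TestKey FlatSuite = String
  type TestResult FlatSuite = Bool
  enumerateTests (FlatSuite cases) = map fst cases
  -- An unknown key is reported as a failure.
  runTest (FlatSuite cases) key = fromMaybe (pure False) (lookup key cases)
  isSuccess _ ok = ok
  details _ ok = if ok then "passed" else "failed"

-- Run every enumerated test and collect the results by key.
produceTestReport
  :: (TestSpecification t, Ord (TestKey t))
  => t -> IO (Map (TestKey t) (TestResult t))
produceTestReport t = fmap Map.fromList $ for (enumerateTests t) $ \key -> do
  result <- runTest t key
  pure (key, result)

exampleSuite :: FlatSuite
exampleSuite = FlatSuite
  [ ("addition", pure (1 + 1 == (2 :: Int)))
  , ("broken", pure False)
  ]
```

Loading this in GHCi and evaluating `produceTestReport exampleSuite` yields a map from test name to result. A hierarchical framework would use a non-empty path as its key instead of a plain `String`.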

gbaz commented 1 year ago

In our last TWG meeting we talked about this a bit. I think there's generally favorable sentiment. One big sticking point that came up is that while "produce a test tree" is mainly a mode existing test frameworks are good at, annotating it with source locations is not. There's an idea that HasCallStack constraints can be used to extend existing frameworks for this in a lightweight way, which would help resolve that. That said, I'd still be on board with making such locations optional but encouraged.

santiweight commented 1 year ago

In our last TWG meeting we talked about this a bit. I think there's generally favorable sentiment.

Thanks for the response. I am still interested in following up on this, but was ill for the last two weeks.

I think making source locations optional but encouraged is exactly what we should push for. I also don't think that producing these locations is particularly difficult with HasCallStack.

Thankfully, we can still push for this just with the "test tree" on its own, since that is probably the best way to handle Haskell tests. And it seems clear that we can add optional SrcLoc annotations. So I think I will continue down this road.
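As a rough illustration of optional SrcLoc annotations via HasCallStack, here is a minimal sketch. The `TestTree` type, `testCase`, `testGroup`, `callSite`, and `exampleTree` below are hypothetical stand-ins written for this example; they are not tasty's real definitions.

```haskell
import GHC.Stack (CallStack, HasCallStack, SrcLoc (..), callStack, getCallStack)

-- Hypothetical monomorphic test tree; the location is optional by design.
data TestTree
  = TestCase String (Maybe SrcLoc)
  | TestGroup String (Maybe SrcLoc) [TestTree]

-- The most recent call site recorded in a call stack, if there is one.
callSite :: CallStack -> Maybe SrcLoc
callSite stack = case getCallStack stack of
  (_, loc) : _ -> Just loc
  []           -> Nothing

-- The HasCallStack constraint makes GHC record where testCase is called.
testCase :: HasCallStack => String -> TestTree
testCase name = TestCase name (callSite callStack)

testGroup :: HasCallStack => String -> [TestTree] -> TestTree
testGroup name children = TestGroup name (callSite callStack) children

exampleTree :: TestTree
exampleTree = testGroup "arithmetic" [testCase "addition", testCase "subtraction"]
```

Each node of `exampleTree` carries the file, line, and column at which the corresponding constructor function was called, which is the information an IDE needs to place a "run test" marker in the source.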

I do think this proposal would be more likely to get adoption if it was more abstract.

I understand this sentiment, and I will consider how to be abstract.

There is a problem with abstractness, however: at the test-plugin-for-IDEs level, we suddenly need a single joining point where all the abstractness becomes concrete so we can display results etc.

I'm happy to go with something abstract, but, for the time being, every TestResult t must be translatable to some PluginTestResult. Unfortunately, this removes a lot of the usefulness of a flexible TestResult. I'm not sure that maintainers are going to be inclined to use the flexible TestResult for uses other than the plugin, and the plugin requires something less abstract.
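One way to picture that joining point is a small conversion class. This is only a sketch: the fields of `PluginTestResult`, the class `ToPluginTestResult`, and the `GoldenResult` type are all assumptions made up for this example, since the thread does not define them.

```haskell
-- Hypothetical concrete result that the IDE plugin knows how to display.
data PluginTestResult = PluginTestResult
  { resultPassed  :: Bool
  , resultMessage :: String
  } deriving (Eq, Show)

-- Every framework-specific result type must be translatable to the plugin's type.
class ToPluginTestResult r where
  toPluginTestResult :: r -> PluginTestResult

-- Hypothetical richer result from a golden-test framework: expected and actual output.
data GoldenResult
  = GoldenMatch
  | GoldenMismatch String String

instance ToPluginTestResult GoldenResult where
  toPluginTestResult GoldenMatch = PluginTestResult True "output matches"
  toPluginTestResult (GoldenMismatch expected actual) =
    PluginTestResult False ("expected " ++ show expected ++ ", got " ++ show actual)
```

The framework keeps its richer type internally, while the plugin only ever sees `PluginTestResult`.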

What use case would something abstract provide if not integration with a plugin?

david-christiansen commented 1 year ago

The question about abstraction was intended for @michaelpj, right?

michaelpj commented 1 year ago

One big sticking point that came up is that while "produce a test tree" is mainly a mode existing test frameworks are good at, annotating it with source locations is not.

Perhaps worth pointing out in the proposal that there are two ways of slicing this problem:

1. Run the test harness to discover tests, use this information to locate the tests in the source
2. Interrogate the source to discover tests, use this information to work out how to run them

The characteristic of approach 2 is using some kind of source annotation to identify tests. Approach 2 has the advantage that you don't need to complicate the runtime with source information. AFAICT, approach 2 is much more common in other languages: Java, Rust, Python etc., all have test frameworks that work primarily off source annotations. That's not to say we should do it, but perhaps worth being clear why not. (And we could do it: Haskell has annotations!)

There is a problem with abstractness, however: at the test-plugin-for-IDEs level, we suddenly need a single joining point where all the abstractness becomes concrete so we can display results etc.

Okay, so what does abstractness get us?

Richer test harness behaviour

We are passing this information across a serialization boundary, because we are running the discovery in one process, recording the results, and then feeding that back into the harness. If we serialize to JSON, then in one sense the real interface is that a TestTree is a Value! But this is much less informative than saying that a TestTree is some type specific to the test harness which can be converted to and from Value. This is nicer for the implementor to begin with, but also since the JSON is being produced and consumed by the same code they can pass through additional information easily:

MyTestCase
--> discover
JSON
--> run
MyTestCase

So test harnesses can assume they have their own richer TestTree type all the time, which is nice.
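The round trip described here can be sketched as follows. To stay dependency-free, derived `Show` and `Read` instances stand in for JSON encoding and decoding, which is an assumption of this example. `MyTestCase`, its fields, `discover`, and `decodeSelection` are hypothetical names.

```haskell
import Text.Read (readMaybe)

-- Hypothetical harness-specific test case. The IDE only needs the path;
-- timeoutMs is extra information that simply passes through the boundary.
data MyTestCase = MyTestCase
  { testPath  :: [String]
  , timeoutMs :: Int
  } deriving (Eq, Show, Read)

-- Discovery side: serialize the harness's own type (stand-in for JSON encoding).
discover :: [MyTestCase] -> String
discover = show

-- Run side: the same harness decodes what it produced earlier.
decodeSelection :: String -> Maybe [MyTestCase]
decodeSelection = readMaybe
```

Because the same code produces and consumes the serialized form, `decodeSelection (discover cases)` gives back `Just cases`, extra fields included.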

Flexibility

Having an abstract interface that just specifies exactly what we need makes it easier for other people to fit their own system into it. For example, it was only when sketching out an abstract version that I realised that it might be nice to just have a "flat" enumerateTests that would work just fine for test frameworks that don't arrange their tests hierarchically. We can then optionally enrich it with additional things if we have a hierarchical arrangement.

The risk, of course, is that it's over-engineered and nobody ever uses this additional power :shrug:

I'm happy to go with something abstract, but, for the time being, every TestResult t must be translatable to some PluginTestResult.

Right, we need to be able to interpret the test results generically, that's true. This is a rare time when I find myself wishing for subtyping :D

Having written these things, I'm not feeling confident that I'm saying sensible things. I think I'd want to actually try it out before having more opinions :sweat_smile:

santiweight commented 1 year ago

Thanks for the response @michaelpj. I neglected to add the discussion about why Haskell is different, which took place in this Discourse comment. At various points in that thread you can also find discussion of using Haskell's annotation system, which was shot down pretty hard.

Right, we need to be able to interpret the test results generically, that's true. This is a rare time when I find myself wishing for subtyping :D

I think this is why I want to defer your concerns about abstraction to a follow-up to this proposal. The majority of the benefit here comes from simply being able to run Haskell tests in VSC. And we should make sure to design something that can be extended to an abstract design without throwing away implementation work.

All this said: I get the feeling that an implementation is what's needed after this discussion. I will see if I can find the time over this Christmas to get an MVP working :)

gbaz commented 1 year ago

@santiweight be sure to check in with @davean who worked up a proof-of-concept of the HasCallStack approach that you could leverage.

david-christiansen commented 1 year ago

@santiweight How are things going with this proposal? Is there something we can do to get it unstuck?

santiweight commented 1 year ago

Hey thanks for checking in. Just finished a three week holiday today. I don't think there's anything blocking me right now.

@gbaz How might I check in with @davean? I don't know what project you're referring to.

My goal for the future is getting an MVP plugin for VSC. I don't think it will be too hard, and I'll surely take a lot of shortcuts.

Some overall thoughts:

- I am personally happy with the source location annotations using HasCallStack and will be using that for the MVP.
- I will make a very monomorphic data structure and target exclusively tasty test suites.
- I don't expect an MVP to be too challenging; it will mostly be about learning VSC's TypeScript plugins.

From the current discussion, I want to implement and iterate. I get the sense that trying to corral Haskell's ecosystem ahead of time is premature (but I expect we will want to do some corralling down the line...).

gbaz commented 1 year ago

I think david will send an email connecting you. What he has is basically a proof of concept, and if you already have a HasCallStack proof of concept that works equally well, you may not get much out of his -- at the time I suggested you two connect, it was not clear to me how much (if any) progress you had made in that regard. I look forward to seeing what you come up with!

santiweight commented 1 year ago

I've made some good progress. Current support is for running tests with a patched version of tasty that produces a JSON object of available tests. The tests are subsequently run via command line with a filter matching the requested test.
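The exact filter used by the plugin is not shown in the thread. As a hedged sketch, selecting a single test through tasty's `-p` pattern option could look like the following, assuming the pattern language in which `$0` is the full test path with group and test names joined by dots. `exactPattern` and `tastyArgs` are hypothetical helpers, and names containing quotes or backslashes are not escaped here.

```haskell
import Data.List (intercalate)

-- Pattern that matches exactly one test by its full path.
-- Assumption: no component contains a double quote or a backslash.
exactPattern :: [String] -> String
exactPattern path = "$0 == \"" ++ intercalate "." path ++ "\""

-- Arguments to pass to the test suite executable.
tastyArgs :: [String] -> [String]
tastyArgs path = ["-p", exactPattern path]
```

For a test "addition" inside a group "arithmetic", `tastyArgs ["arithmetic", "addition"]` produces the two arguments `-p` and `$0 == "arithmetic.addition"`.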

https://user-images.githubusercontent.com/16826028/213981133-70ef0cc5-37d8-443d-92c1-d19f75f8c676.mov

santiweight commented 1 year ago

There is a small wrinkle that I'm not sure we can overcome without enabling deep subsumption.

In order to attach a test to the correct source location, we need to have a HasCallStack in scope for the declaration containing a test definition.

In order to achieve this, we could define:

data TestTreeInner = TestCase | TestGroup
type TestTree = HasCallStack => TestTreeInner

instead of the current:

data TestTree = TestCase | TestGroup

It's not ideal imo. I'm open to suggestions here.
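One reading of the wrinkle is that the implicit call stack is cut off at the first declaration that lacks the constraint. The sketch below shows that behaviour with hypothetical names (`leaf`, `viaPlainHelper`, `viaTrackedHelper`, `stackDepth`); it does not use the patched tasty.

```haskell
import GHC.Stack (HasCallStack, SrcLoc, callStack, getCallStack)

-- Hypothetical leaf constructor that records the whole implicit call stack.
leaf :: HasCallStack => String -> (String, [(String, SrcLoc)])
leaf name = (name, getCallStack callStack)

-- No constraint here: the stack stops at the call to leaf inside this helper,
-- so the recorded location points into the helper, not at its caller.
viaPlainHelper :: String -> (String, [(String, SrcLoc)])
viaPlainHelper name = leaf name

-- With the constraint, the stack also records where the helper itself was called.
viaTrackedHelper :: HasCallStack => String -> (String, [(String, SrcLoc)])
viaTrackedHelper name = leaf name

-- Number of call sites that were recorded for a node.
stackDepth :: (String, [(String, SrcLoc)]) -> Int
stackDepth = length . snd
```

Called from a declaration without HasCallStack, `viaPlainHelper` records one call site and `viaTrackedHelper` records two. The outermost entry is the one that points at user code, which is why every intermediate declaration would need the constraint.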

@gbaz @davean btw I never received an email afaik.

santiweight commented 1 year ago

I also have locations in the text working...

https://user-images.githubusercontent.com/16826028/213983525-f1563218-d1f5-46a1-bca9-e2ed7001bce5.mov

michaelpj commented 1 year ago

Nothing specific to say except that looks super cool!

santiweight commented 1 year ago

Help needed and welcome!

There are a few tasks left to do that I don't know how to resolve.

1. I need to find a way to get the available test projects in a VSC workspace. Subsequently I need to add better ~~caching~~ file-watching based on the files in a test project. Is this what hie-bios is for?
2. Preferably we can find a way to avoid the need for a HasCallStack constraint on the function that calls testCase or testGroup. Above I suggested a solution that requires DeepSubsumption, which is not satisfactory imo...

Any input on either of these? If not then I'll just go and put out a crappy first try version.

fendor commented 1 year ago

Without having read the full context (sorry, just saw your question re hie-bios), hie-bios's main purpose is to find the compilation options given a filepath or some other way to identify a part of your project. Any caching it does is for performance reasons. HLS caches these components, and should know all the files it has loaded. What caching are you referring to in particular? (We can discuss it a bit more on Libera)

santiweight commented 1 year ago

Thanks @fendor. I misspoke in my comment. I meant "file-watching" - so that I know when the user's test tree becomes stale. I'm not sure the best way to find either what test projects are available in a folder (they might not be at the top level of the folder), nor do I know how to do file-watching for a cabal project.

I'm assuming both of these use cases have been paved?

fendor commented 1 year ago

I'm not sure the best way to find either what test projects are available in a folder

For a cabal project, I think there is no built-in way to do it, unless something like https://github.com/haskell/cabal/pull/7500 lands. You can find the components via a query such as cat dist-newstyle/cache/plan.json | jq '."install-plan"|.[]|select(.style=="local")|select(."component-name" != null) | select(."component-name" | startswith("test:"))|."component-name"', which basically just queries plan.json for any local component whose component-name starts with test:. (or using cabal-plan directly)
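The same selection expressed in Haskell, as a sketch only: `PlanUnit` is a hypothetical, already-decoded view of one entry of the "install-plan" array, holding just the two fields the jq query inspects. Reading and decoding plan.json itself is left out.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical decoded entry of plan.json's "install-plan" array.
data PlanUnit = PlanUnit
  { unitStyle         :: String        -- the "style" field
  , unitComponentName :: Maybe String  -- the "component-name" field, if present
  }

-- Names of local components whose component-name starts with "test:".
localTestComponents :: [PlanUnit] -> [String]
localTestComponents units =
  [ name
  | PlanUnit "local" (Just name) <- units
  , "test:" `isPrefixOf` name
  ]
```

Entries that are not local, or that have no component name, are skipped by the pattern match in the comprehension, mirroring the `select` filters in the query.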

nor do I know how to do file-watching for a cabal project

Personally, also not sure. I have some ideas, but I don't know of a way to do it right now.

santiweight commented 1 year ago

I tried to come back to this because I think it's really cool, but I don't know the way forward with respect to file watching.

Does anyone have any clue who we could reach out to about this?

It would be a shame to have this die, because it does work, and we're not terribly far away from an MVP.

david-christiansen commented 1 year ago

Well, I seem to have missed this final question - sorry. For file watching, I'd post the question on Discourse - I don't know a more specific forum to check.

I'm going to temporarily close this one - please feel free to reopen it if you want to keep the discussion going and make progress!

haskellfoundation / tech-proposals

RFC: Ide test integration #46