magefree / mage

Magic Another Game Engine
http://xmage.today
MIT License

Test Coverage tracking and enforced/encouraged testing #6875

Closed: fearphage closed this issue 3 years ago

fearphage commented 4 years ago

There have been a string of regressions lately (1, 2, 3). I wanted to add code coverage tracking to the project, but I'm not familiar with JUnit.

Test coverage reporting would allow [automatic tracking](https://github.com/sinonjs/nise/pull/164#issuecomment-628921429) and enforcement (if desired).

Requiring tests for code isn't a bad practice to adopt. Good test coverage ensures that you don't keep making the same mistakes. Instead you're free to make new mistakes.

These vendors offer free test coverage tracking to open source projects:

They both offer great documentation for using them with GitHub, Travis CI, etc.

JayDi85 commented 4 years ago

This is a big pain for the project. Current code coverage is very low (~1-5% for all cards and 10-15% for game engine code).

fearphage commented 4 years ago

This is a big pain for the project.

I'm not sure what you meant to say here. It's a big pain point? It's a pain to write tests? Could you elaborate please?

JayDi85 commented 4 years ago

People don't like to write tests (example: all new set cards are implemented without any tests). Even the official MTG rules aren't covered by tests, despite the fact that XMage has an excellent test framework for unit tests (card testing).

Here is how I see the starting steps to fix that problem (low code coverage):

  1. Implement unit tests for the game examples from the official rules (one test per example);
  2. Implement unit tests for game examples from third-party articles about abilities/cards (example);
  3. Implement unit tests for abilities/effects (by real cards or by custom cards -- the test framework supports any setup). Start with the newest sets;
  4. Implement unit tests for single cards. Start with the newest sets (a sketch of such a test follows this list).
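
For reference, a minimal sketch of what such a single-card test can look like with the project's CardTestPlayerBase framework (the card, package, and exact assertions below are only an illustration, not something prescribed in this thread):

```java
package org.mage.test.cards.single;

import mage.constants.PhaseStep;
import mage.constants.Zone;
import org.junit.Test;
import org.mage.test.serverside.base.CardTestPlayerBase;

/**
 * Illustrative test: Lightning Bolt deals 3 damage to any target.
 */
public class LightningBoltTest extends CardTestPlayerBase {

    @Test
    public void testBoltDealsThreeDamageToPlayer() {
        // give player A the spell and a land to pay for it
        addCard(Zone.HAND, playerA, "Lightning Bolt");
        addCard(Zone.BATTLEFIELD, playerA, "Mountain");

        // cast it at player B during the first precombat main phase
        castSpell(1, PhaseStep.PRECOMBAT_MAIN, playerA, "Lightning Bolt", playerB);

        setStopAt(1, PhaseStep.END_TURN);
        execute();

        // player B starts at 20 life and should take 3 damage
        assertLife(playerB, 17);
        assertGraveyardCount(playerA, "Lightning Bolt", 1);
    }
}
```
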
fearphage commented 4 years ago

People don't like to write tests

My initial thought/reaction was:

So what. ¯\_(ツ)_/¯

My response as a developer:

Are you included in that list of people? If you're not, then you and the other admins of the project can enforce it going forward. I push back on PRs all day, both personally (open source) and privately (at work), when code coverage is lowered or some functionality goes untested. If you want fewer regressions, that's how you get fewer regressions.

My response as user/consumer of Xmage:

I like using Xmage. I appreciate the work you and others do. I don't like it when it's unusable for 2+ weeks due to weird regressions... even though it was fixed (at least partially) within 24 hours. Hence my query about making releases easier/faster (#6833) so the fixes could go out sooner/faster. (I would love it if there were a daily/nightly build channel where I could get the latest and greatest so I could continue using the app.)

None of this is intended as a shot at anyone and I mean no ill will. It's just that writing code without tests is error prone. Full stop. There's no way around that. I know that. I see it every single day. I'm a huge proponent of test-driven development. I use it myself, and when I'm leading a project, I recommend it to others. It makes development a little slower (writing more code is slower than writing less), but it increases the velocity of the project over time, since you can change things more easily and frequently with confidence that nothing is busted.

I was personally investigating writing tests (read: copying tests and testing more scenarios) and it's not easy to get that going as a non-Java developer. I haven't developed in Java in 20+ years. I don't want to install IntelliJ and all the things that go along with that. I'd love to have a Docker image that just lets me run tests. I was trying to put that together myself, but it's been rough going so far. I would love some help/guidance on that if available. I can open an issue to discuss that separately though.

JayDi85 commented 4 years ago

XMage uses some auto-generated code, so run the Maven build first; something like this should work (skipping tests on the first pass):
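
```
# build everything once and install the artifacts locally, skipping tests
mvn clean install -DskipTests
```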

After that you can build and run unit tests or other things as normal. To run a single test, something along these lines should do it (assuming the standard Surefire setup and the Mage.Tests module; the test class name is just an example):
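
```
# run one test class from the Mage.Tests module
mvn test -Dtest=LightningBoltTest -pl Mage.Tests
```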

ingmargoudt commented 4 years ago

I would like to be involved again with writing tests. Been a while since my last contribution so I might have some questions about additions to the test framework.

JayDi85 commented 4 years ago

@ingmargoudt you can find the latest changes to the test framework in the commit history (search) or ask in the GitHub or Gitter channel.

ingmargoudt commented 4 years ago

Which set would be best to start with? Jumpstart or Ikoria? @JayDi85

JayDi85 commented 4 years ago

M21 or C20. But you can choose any set or cards.

ingmargoudt commented 4 years ago

M21 sounds like a good start. Will look at that, thanks for the pointers!

fearphage commented 4 years ago

@JayDi85 Thanks for the help. I moved the build environment in docker conversation to a PR: #6876

jeffwadsworth commented 4 years ago

Enforced and encouraged are two very different words. The first: draconian. The second: diplomatic. The "shrugs" sign was a nice touch and fit the general tone of this issue. Tests are great and should be greatly encouraged. But unless you are paying the mob, don't expect them to be compliant.

fearphage commented 4 years ago

Enforced and encouraged are two very different words. The first: draconian. The second: diplomatic.

Your responses are a bit dramatic... to put it lightly. Do you truly believe being forced to write quality code is severe and harsh punishment? Are you a developer by trade? Working somewhere where they don't enforce testing is a rarity in my professional experience. The open-source environment reflects that as well. But if having software work reliably and predictably is draconian, then call me Draco I suppose.

The "shrugs" sign was a nice touch and fit the general tone of this issue.

Because why does it matter if people prefer to write error-prone code? Who should care that people don't want to write easy-to-maintain code?

But unless you are paying the mob, don't expect them to be compliant.

How do you explain people writing tests for hundreds of thousands of open source projects for decades?

JayDi85 commented 4 years ago

@fearphage A project with 1,000 lines of code and one with 1,000,000 lines of code are very different to maintain. There are no full-time developers (until someone wants to pay for them). There are only a few developers who make changes regularly (in their free time).

So you can't require fully implemented unit tests for each PR (or else it will be closed). Any change can be useful to the project. Even more -- XMage is a game project, not a developer tool or library, so the skill requirements are very low (example: people without Java skills can make fixes or improvements). Therefore the criteria for the code should not be so radical.

fearphage commented 4 years ago

People write code for the first time all the time on GitHub. That's not unique to this project. If something is wrong with the contribution or someone thinks it can be improved, that's why GitHub has a whole code review workflow incorporated into pull requests. It's a great teaching tool.

I'm not sure what you're getting at. If you can write code for a card/format, you have the capabilities to write tests for that code. If you need assistance or guidance, that's what communication and/or documentation are for. A failed build (from a decrease in code coverage for instance) is enough to prompt most people to fix the problem in my experience.

Therefore the criteria for the code should not be so radical.

Requiring tests for code is a radical idea? That's a new one for me. One man's radical idea is another man's table stakes.

There are a bevy of services, web hooks, and other integrations set up for tests to run on GitHub, CircleCI, TravisCI, etc. Code coverage tracking and reporting are easily integrated into many of those tools and services. How to integrate them with GitHub is already well documented. The concept of testing code is neither new nor surprising.

ingmargoudt commented 4 years ago

I agree with @fearphage. The risk of regression in XMage is very large, due to the amount of code and the complexity of the implementation of the Magic rules / framework. Regressions should be found during development rather than in production, where they can harm the project even though it's free to use. That nobody cares to write tests, or that the codebase is too large, does not mean it should not be done. No need to aim for 100% coverage, but any test will contribute. Correct tests will only make the code base stronger.

I would really suggest a JaCoCo / Codecov combination that runs after each Travis run. The numbers coming out of that might be confronting, but it has to start somewhere.
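
For illustration, the JaCoCo half of that is usually just one plugin in the parent pom (the version and wiring below are an assumption, not the project's actual configuration); Codecov can then pick up the generated XML report from the CI run:

```xml
<!-- goes in the parent pom's <build><plugins> section -->
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.5</version>
  <executions>
    <!-- attach the JaCoCo agent so the unit test run is instrumented -->
    <execution>
      <id>prepare-agent</id>
      <goals>
        <goal>prepare-agent</goal>
      </goals>
    </execution>
    <!-- write the XML/HTML coverage report after the tests finish -->
    <execution>
      <id>report</id>
      <phase>verify</phase>
      <goals>
        <goal>report</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```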

JayDi85 commented 4 years ago

Requiring tests for code is a radical idea? That's a new one for me. One man's radical idea is another man's table stakes. There are a bevy of services, web hooks, and other integrations

Even if you activate a dozen tools, there still won't be a single line of new code. No developers -- no code. It's easy to understand. The current XMage problem is a lack of developers, not tools.

fearphage commented 4 years ago

Even if you activate a dozen tools, there still won't be a single line of new code. No developers -- no code. It's easy to understand. The current XMage problem is a lack of developers, not tools.

But there are 16 PRs open right this second from roughly 12 unique individuals. I haven't opened each PR, but I suspect there is more than a single line of new code in them.

This repo/code base has numerous problems. One of them is regressions. That's the problem this particular issue is attempting to address. This ticket is not meant to fix all the problems present in this repo/code base.

JayDi85 commented 4 years ago

But there are 16 PRs open right this second from roughly 12 unique individuals.

You're looking in the wrong place. Those are old PRs with various problems (e.g. they must be researched or improved before merging). Real developer activity can be found on the pulse page (screenshot: shot_200728_221025).

Regressions. That's the problem this particular issue is attempting to address.

"Reject PRs or code without tests" is not a solution -- it's a new headache for maintainers and authors. Even your example from https://github.com/sinonjs/nise/pull/164 failed -- the problem hasn't been fixed in 3 months (because it requires one missing test for full code coverage).

Let me explain again: I like the code coverage stats. But I'm against your scheme of work (tests or nothing) -- it will not work in this big legacy project (see the explanations in the comments above: 1 and 2).

fearphage commented 4 years ago

"Reject PRs or code without tests" is not a solution -- it's a new headache for maintainers and authors.

So the focus/goal of this project is developer convenience/happiness at the expense of the project being useful/usable by the end users for extended periods of time. Would you agree with that statement?

There's nothing wrong with that if this is a project everyone wants to code on for funsies and they aren't concerned when and how frequently it breaks. I have projects like that myself. I'm just verifying that that is the case. It would be nice to have the right expectations going forward.

Even your example from sinonjs/nise#164 failed -- the problem hasn't been fixed in 3 months (because it requires one missing test for full code coverage).

Yeah, it was a new feature. Neither the contributor nor the other devs on the project thought it was important enough to finish, so no new feature. Oh well. That's not the end of the world. However, during that 5-month period, the library was released without regressions. That's the difference.

Let me explain again: I like the code coverage stats. But I'm against your scheme of work (tests or nothing) -- it will not work in this big legacy project (see the explanations in the comments above: 1 and 2).

If (one of?) the lead developers is anti-testing, it will not work. You're absolutely correct. It takes a concerted effort to right the ship. If you're not on board, I understand entirely. Some people like to roll the dice with each release. Different strokes for different folks and all that jazz. Again, I'm just trying to set my expectations going forward.

JayDi85 commented 4 years ago

So the focus/goal of this project is developer convenience/happiness at the expense of the project being useful/usable by the end users for extended periods of time. Would you agree with that statement?

Yes, that's right. Moreover, it's about the development process and dev happiness too (devs must have fun and satisfaction -- it's not work, it's a hobby).

they aren't concerned when and how frequently it breaks

It's a lack of information about the true story. If you look at the discussions of breaking issues like #6684 or #6698, you'll find that things had been broken for years. A new mechanic just revealed the problem (for TDD guys it's like a new batch of failing unit tests).

However, during that 5-month period, the library was released without regressions.

You are comparing incomparable projects and situations. The lib got only 2 bug fixes over the whole 5-month period, whereas XMage got 200 times more changes.

If (one of?) the lead developers is anti-testing, it will not work. You're absolutely correct. It takes a concerted effort to right the ship. If you're not on board, I understand entirely.

I'm one of the developers who likes tests and writes them actively in this project. So I know what I'm talking about.

LevelX2 commented 4 years ago

Testing is important and a good thing, and I think nobody here has anything against higher test coverage. Many complex changes would not be possible without the existing tests, because the dependencies are not manageable for anyone otherwise.

My points are:

  1. Displaying test coverage through a tool is interesting, and if possible we should include such a tool as long as it does not disturb or hinder development. And basically I agree that the more tests, the better.
  2. I do not think it makes sense to force the creation of tests via a tool or organizational rules. In general, I think it is completely impossible to cover all possible interaction cases with tests in this project, since there are an almost infinite number of possible combinations between the cards and their usage. Very often, when creating tests based on bug reports, you see that in most cases you would not have anticipated such a combination, even if you had created 100 tests for this one new card in advance.
  3. Yes, the project must be fun for the developers. As JayDi85 says, he already cares a lot about testing, and I too have already created a lot of test cases. But I don't want to be forced to spend 90% of my time on testing. Everyone who wants to expand the testing effort is, from my point of view, invited to do so. As you can see here, suggestions are discussed and, if possible, implemented.

fearphage commented 4 years ago

It's a lack of information about the true story.

You can't have two #1 priorities. You just acknowledged that the focus of the project is developer happiness, even at the cost of the application being unusable for several weeks here and there. Developer fun trumps stability and reliability. Please correct me if I'm misunderstanding your response.

You are comparing incomparable projects and situations. The lib got only 2 bug fixes over the whole 5-month period, whereas XMage got 200 times more changes.

If they're incomparable, why'd you compare them in the first place? I don't understand your logic.

It doesn't matter if there is 1 commit or 1 million commits. Good test coverage doesn't care. This is not the largest open source project. It's not the one with the most contributors. Nothing makes this project more special than all the other OSS projects that require tests (Linux, Android, Sinon, jQuery, React, etc.).

In general, I think it is completely impossible to cover all possible interaction cases with tests in this project, since there are an almost infinite number of possible combinations between the cards and their usage.

Who said you have to? I'm talking about regressions. Tests stop things from breaking the same way repeatedly (read: regressing). Whenever something breaks, it should be a new bug -- not a revisit of an old bug. That's the goal of tests and test coverage. There are guaranteed to be bugs, but they shouldn't be the same ones you/someone fixed last week, month, year, etc.

I think someone said the test coverage is in the single digits (percentage-wise). Not allowing it to decrease in PRs would (by way of math) increase or maintain the current level of test coverage. So going forward, things that work would continue to work (read: not regress). Then whenever you fix a bug, add a card, etc., you add tests for that code. That's the only way to be sure the code works as expected.
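
That "never let it decrease" rule is something the coverage services can express directly. For example, a hypothetical codecov.yml (assuming Codecov were the tool chosen; this file is an illustration, not part of this repo) could be as small as:

```yaml
coverage:
  status:
    project:
      default:
        # compare the PR against its base branch and fail the status check
        # if total coverage drops at all
        target: auto
        threshold: 0%
```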

But I don't want to be forced to spend 90% of my time on testing.

This seems like a false dilemma. Writing tests does not add 9x the work of writing code.

JayDi85 commented 3 years ago

Test coverage support added in d481172eb4ce193e84156a88339dfdddb2865af9. It supports the JaCoCo engine and reports, and it also supports Sonar Cloud. See the example in the commit's comments and the pom file.

Test coverage is a very expensive task (1.5 million lines of code doing the dirty work), so it runs manually on a local machine and uploads reports to the cloud. See the Sonar repository for it: https://sonarcloud.io/dashboard?id=JayDi85_mage
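
For anyone who wants to reproduce the upload, a SonarCloud run from a local machine generally looks something like the following (the organization and token placeholders are illustrative; the real property values live in the pom file from the commit above):

```
# run the full build with coverage, then push the analysis to SonarCloud
mvn verify sonar:sonar \
  -Dsonar.host.url=https://sonarcloud.io \
  -Dsonar.projectKey=JayDi85_mage \
  -Dsonar.organization=<your-organization> \
  -Dsonar.login=<your-sonarcloud-token>
```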

fearphage commented 3 years ago

Test coverage support added

Great news! :partying_face: :tada: You love to see it.