Why does test runner have unnecessary gems installed? (Parser)

joshgoebel commented 1 year ago

We received the following error when we ran your code:
Line 3: Parser is not a class
/usr/local/bundle/gems/parser-3.0.2.0/lib/parser.rb:19: previous definition of Parser was here (TypeError)

Expected behavior:

Tests should pass, same as they do locally - Parser is not a reserved class in Core Ruby.

kotp commented 1 year ago

Parser is used.

I do not understand what you mean by "reserved class".

joshgoebel commented 1 year ago

Parser is used.

How and by what?

Locally:

❯ ruby -v
ruby 3.0.4p208

❯ irb
irb(main):001:0> Parser
(irb):1:in `<main>': uninitialized constant Parser (NameError)

I'm running the tests with ruby wordy_test.rb... and naming a class Parser works fine locally - but fails in the test runner environment.

reserved class

I mean built ins like Array, Hash, etc...

kotp commented 1 year ago

These file use Parser and/or its subclass(s):

lib/extract_test_metadata.rb
lib/extract_metadata.rb

joshgoebel commented 1 year ago

Ok, gotcha. I wonder if we should agree on a "standard" process/environment/behavior for test runners? IE, that the environment should be the same as a simple local install? JS and Wren test runners accomplish this by just being Bash scripts that simple run the same commands one would locally...

Ruby is "breaking" this [pattern?] by loading a bunch of Ruby gems into the testing environment - that are not necessary and should not be there IMHO.

If the code runs locally, it should run successfully in the test runner. Is this a reasonable assumption?

Note: I don't think this is a super high priority, but I do think it's worth fixing.

kotp commented 1 year ago

Worth talking about, not yet positive it is broken.

The Gemfile has the intention of ensuring the same environment (to that point) is the same. Even the Ruby version can be specified there.

Since this repository is named "ruby test runner" then why do we think that this is not the testing environment? It surely must be.

Or are we talking about two different testing environment? The runner is from the perpective of Exercism, while the tests that the student should be using is less complex, without a gemfile, with only the requirement of Ruby and Minitest.

The maintainers have a simpler requirement as well, with only Ruby, Minitest, Rubocop, Rake, and SimpleCov.

joshgoebel commented 1 year ago

It surely must be.

Well, It's not the local test environment. Far from it - it seems quite different. There preferably would be a single test environment. Consistency is key. I do think it's reasonable to pin a Ruby version of course...

My premise again:

If the code runs locally (on same Ruby version), it should run successfully in the test runner.

It's not a good experience for a student when this goes sideways... actually I saw this happen with JS once, so that's perhaps something I might track down again now that I've raised this issue.

Or restated:

The test runner should not [appear to] BREAK good code that passes all tests green locally.

I think it's fine if we want to write the test runner itself in Ruby and use 100 gems (for convenience), but when it goes to run the actual test suite it should be doing so WITHOUT all those gems loaded or even in the LOAD_PATH... perhaps running in a subshell, etc...

I do think this would be a good general policy to decide on too - or at least a "default guideline"... again, the key concern being reproducible behavior between a "default install" of a language locally and our test running environs.

kotp commented 1 year ago

OK, so there would need to be some things defined. Such as "default install". What does that mean with Ruby?

I agree that if you decide to define a class that is being used on the testing environment for the platform locally, then it will pass locally, while it will not pass there. But the student can decide that they want to explore the latest and greatest, or any gem they want to, or any library they want to. We should not have to guard against this. However, noting that on the platform this is the environment that we run, and your locally running code that you have written may conflict against this, would be good information to have.

I guess what I am saying is that there is a local test environment, the student controls, and then we have the maintaining test environment, and the platform testing environment. The latter two should be duplicated. We can not control the student, but we can inform them. The irritation comes from the exercism inconsistencies where the maintainers may have passing example code, but they have introduced a conflict with the test runner, breaking it.

I do not think that the premise is fair for "If the code runs locally, even the same Ruby version, it should run successfully in the test runner" necessarily, but it is fair to let the students know.

I have ran against this where Minitest is modified and the spec version is not carried through, and so have had to work around that. (It used to work, a change was made where it broke on the platform, yet it would work fine locally.)

There are other situations as well where the program could run fine locally, but some exploration was done that makes it conflict with what is ran on the platform will be in conflict of the supporting tools to run those tests, or for reporting. Or even not be available to the test runner, such as a student wanting to include a third party library to explore, and we will not load it for various reasons, one of which may simply be that it is not available to retrieve it.

The message that gets reported, when this happens, is something like "The tests have failed to run, this could be a problem on our end, or maybe it is the submitted code." (to paraphrase), and while this is irritating for those cases where students have not tried something novel, it is accurate for the times when they inadvertently stomp on something that we rely on. And it may be our fault for not making that clear for the student. Or it may be OK, if the student accepts that.

On the platform side, though, we can probably do better to check that the things that we rely on are protected. Knowing that Parser is there, and being used, then we could, as a malicious student, modify that behavior to try to do some damage to the system that is running.

So there are security aspects to consider.

So had you been informed that Parser is being used to support a subsystem of the test runner on the platform, would you have made a different decision? Is the solution documentation so the student and maintainers know what not to stomp on?

joshgoebel commented 1 year ago

need to be some things defined. ... "default install". What does that mean with Ruby?

Good point. I'm personally used to rbenv install personally, which seems to give you the bare minimum tooling that you mentioned before (ruby, irb, rake, gems, etc). I'm aware different distros package Ruby all sorts of ways.

We can not control the student,

I don't think I was suggesting that... just that our test runner should use the same minimal ruby blah_test.rb that we recommend (I think?) users use locally. If popular distros don't provide all the necessary pieces in one package we should probably document that and help users get the right stuff installed - although I think if this was a big issue we'd already know about it?

and then we have the maintaining test environment, and the platform testing environment.

I'm not sure what all your covering with "platform testing environment"... right now it seems to load "test runner tooling" into the test runner scope - right beside user code. Which leads to...

So there are security aspects to consider.

This is one singular great reason to run the tests in as minimal environment as possible. (since built-in [compiled] gems could hook into system API because they might have compiled C components)... the user can provide tons of their own Ruby code, but they can't provide C code that we would compile (that I'm aware of)... (of course we do this on other test runners)

But no reason the Ruby test runner can't be as secure as possible... (vs other test runners with a full compile pipeline)

Presumable (it's Ruby) students could write code that causes their tests to fully pass - and then those changes don't just break the runner then inject invalid/incorrect data back into the Exercism system itself...

I'm not saying we have to use Bash everywhere... just that the "test running platform/tooling" should run the tests themselves in an even more minimal sandbox - not inside the same runtime as the test runner itself.

exercism / ruby-test-runner

Why does test runner have unnecessary gems installed? (Parser) #59