exercism / problem-specifications

Shared metadata for exercism exercises.
MIT License
326 stars 541 forks source link

Issue with Allergierss #1636

Open BadKarmaZen opened 4 years ago

BadKarmaZen commented 4 years ago

A student posted the following solution to the Allergies exercise (or something along this lines)

public class Allergies
{
    private readonly int _allergy;

    public Allergies(int mask)
    {
        _allergy = mask;
    }

    public bool IsAllergicTo(Allergen allergen)
    {
        return 1 == ((_allergy >> (int)allergen) & 1);
    }

    public Allergen[] List()
    {
        return Enum.GetValues(typeof(Allergen))
            .Cast<Allergen>()
            .Where(IsAllergicTo)
            .ToArray();
    }
}

I believe this solution to be incorrect, but it passes all test. The problem i see is that the IsAllergicTo function works only in the case that the enum is left untouched.

public enum Allergen
{
    Eggs,
    Peanuts,
    Shellfish,
    Strawberries,
    Tomatoes,
    Chocolate,
    Pollen,
    Cats 
}

I believe that this is caused by not using the enum values in the test but hardcoded values. The following Test failed.

[Fact]
  public void Using_enum_values()
  {
    var sut = new Allergies((int)Allergen.Eggs +  (int)Allergen.Shellfish);
    var expected = new[] { Allergen.Eggs, Allergen.Shellfish };
    Assert.Equal(expected, sut.List());
  }
yawpitch commented 4 years ago

I'm not sure there's anything that can be done in this repo for this, as the problem of how to represent those values is entirely up to each language track. There is no bitfield / enum type in JSON, so we're limited to representing the values as either integers or strings; it's assumed the tracks will provide a dummy file with an appropriate enumerated type for the student to work from.

ErikSchierboom commented 4 years ago

the problem of how to represent those values is entirely up to each language track

Well yes, but the tests do enforce their values: https://github.com/exercism/problem-specifications/blob/master/exercises/allergies/canonical-data.json

This means that it might make sense to have one test case for the above-mentioned combination, as I think will be several languages that implement it the same way C# did.

yawpitch commented 4 years ago

Admittedly I can barely read C#, but isn't that failing test case already covered by the "more than eggs but not peanuts" test, in combination with the two tests that establish that the value of eggs is 1 and shellfish is 4? I'm happy to entertain a change here, I'm just not seeing what that proposed new test case (or test cases) would be.

ErikSchierboom commented 4 years ago

I'm not sure either :D

BadKarmaZen commented 4 years ago

The problem is that the test do not really enforce the values provided in the instruction set. At least the tests in C#. The test do not enforce the value of shellfish to equal 4, it just reques if alleric to shelffish the IsAllergicTo function needs to return true. This is a subtle difference. The question is whether this is the goal or not. All the test use magic values see the following test

    public void More_than_eggs_but_not_peanuts()
    {
        var sut = new Allergies(5);
        var expected = new[] { Allergen.Eggs, Allergen.Shellfish };
        Assert.Equal(expected, sut.List());
    }

is using 5 as input and expects a list of 2 Enum values { Allergen.Eggs, Allergen.Shellfish } this is function mapping from Integers to collection of Allergen values. The test do not check the way back; Look at the following test;

   public void Allergic_to_peanuts_are_you_sure()
  {
    var first_test = new Allergies(2);
    Assert.True(first_test.IsAllergicTo(Allergen.Peanuts));     //  Test A: passes
    Assert.True(first_test.IsAllergicTo((Allergen)2));          //  Test B: doesn't pass

    var second_test = new Allergies((int)Allergen.Peanuts);
    Assert.True(second_test.IsAllergicTo(Allergen.Peanuts));    //  doesn't pass    
  }

The first_test.A passed, magic numbers to enum, but first_test.B doesn't pass Also the second_test doesn't pass, because there is no correlation from enum back to number. The last test really shows the subtleness of the problem

  public void More_than_eggs_but_not_peanuts_double_verification()
  {
    var magic_number = 5;
    var sut = new Allergies(magic_number);
    var expected = new[] { Allergen.Eggs, Allergen.Shellfish };
    var allergies = sut.List();
    Assert.Equal(expected, allergies);

    var mask = 0;
    for (int index = 0; index < allergies.Length; index++)
    {
      var allergy = (int)allergies[index];
      mask = mask + allergy;
      Assert.True(sut.IsAllergicTo((Allergen)allergy));
    }

    Assert.Equal(magic_number, mask);
  }

Only the final test Assert.Equal(magic_number, mask); fails meaning that the 5 is really created by the folling magic 5 = 2^Allergen.Eggs + 2^Allergen.Shellfish and not 5 = Allergen.Eggs + Allergen.Shellfish. I hope that this explains what i mean.

yawpitch commented 4 years ago

Somewhat ... it doesn't really clarify what we can do about it. Each of the allergen names is associated within the canonical data with a specific value, and each value is associated with a specific name ... there being no real way to store an enum any other way in JSON we've been leaving it up to the tracks to decide how to actually implement it ... so if C# would benefit from these values being explicitly locked in an Enum that needs to be provided in a an appropriate file in the track's implementation that the student either receives with instructions not to change, or responding to this will just kind of have to be a mentor issue.

Or we need to figure out a way of encoding enums effectively into JSON.

ErikSchierboom commented 4 years ago

I think in this case we'll probably just have to update the C# tests then.

BadKarmaZen commented 4 years ago

Am I correct in stating that we prefer the option 34 = 2(peanuts = 2) + 32(chocolat = 32) over 34 = 2^1(peanuts = 1) + 2^5(chocolate)? If this is true, can't we create a test that checks this? For instance the canonical case

        {
          "description": "not allergic to anything",
          "property": "allergicTo",
          "input": {
            "item": "peanuts",
            "score": "peanuts"
          },
          "expected": false
        },

results in

[Fact]
  public void Using_enum_values()
  {
    var sut = new Allergies((int)Allergen.Peanuts);
    Assert.True(sut.IsAllergicTo(Allergen.Peanuts));
  }

This way we create a loopback from the enum value which reduces the chance to choose the more complect psoibility. Again I'm very new to this, i have only be a mentor for a week, so I don't know what the posibilities are.

yawpitch commented 4 years ago

Am I correct in stating that we prefer the option 34 = 2(peanuts = 2) + 32(chocolat = 32) over 34 = 2^1(peanuts = 1) + 2^5(chocolate)?

This is where we have to get careful with terminology and operators ... the item peanuts is assigned the value 2 and chocolate is assigned the value 32. Assuming that you're using x ^ y to mean "x raised to the power of y" (and not "x bitwise XOR with y", as it means in Python) then I'm not sure what you're asking, because the 34 = 2 + 32 is the same thing as 34 = 2 ^ 1 + 2 ^ 5. Where you've got (peanuts = 1) I'm lost, because peanuts is never assigned to 1 in the tests, so I'm not entirely clear on what you're asking. Each item is assigned a name that is constrained by the other tests to be an integer value that is a power of 2, and the score is an integer that will (inevitably) be the sum of sum arbitrary number of unique powers of 2 ... the "preferred" solution is one that uses bitwise operators to check that item BITWISE AND score is 1.

If this is true, can't we create a test that checks this? For instance the canonical case

        {
          "description": "not allergic to anything",
          "property": "allergicTo",
          "input": {
            "item": "peanuts",
            "score": "peanuts"
          },
          "expected": false
        },

Sorry, confused again ... it doesn't make sense for "not allergic to anything" to be False if item is "peanuts" and score is the numeric value of peanuts (2) ... you're wanting to set them both to the name of the item (ie a string)? That would seem to only make translating the canonical data to a track implementation harder, but again I'm unclear on what you're trying to communicate here.

[Fact]
  public void Using_enum_values()
  {
    var sut = new Allergies((int)Allergen.Peanuts);
    Assert.True(sut.IsAllergicTo(Allergen.Peanuts));
  }

I'm not entirely clear on how this is any different than the existing canonical test that checks item of "peanuts" and score of 2.

BadKarmaZen commented 4 years ago

In the solution I provided in the beginning of this issue, the values are as follow;

Using these value, together with the provide solution, all tests pass. The question is whether this is correct or not.

In your answer you state that in none of the tests Peanuts is assigned the value 1, but in fact assigned to the value of 2. But being assigned doesn't mean being equal. To be honest the test don't really care what the value of the item peanuts is, There is a solution possible where peanuts is equal to the value 1 and chocolate has the value 5. Another solution is possible where peanuts has the value of 2 and chocolate has the value of 32.

Since both solution passes all the test, either all both solutions are correct, or the tests need to be improved to guide the student to the version we prefer. I believe by adding the proposed test we can guide the student towards the solution that peanuts is equal to 2.

yawpitch commented 4 years ago

Ok, thanks for the extra detail, that helps me better understand your issue. However I still don't think this repo is the place to address it.

Essentially this repo's purpose is to provide common data that describes the tests known to be necessary to say that a solution is minimally correct ... if your user can write a solution for the tests you've created from that data that gets the correct answer (in this case ["peanuts", "chocolate"] when given the argument 32) then, fundamentally, their solution is minimally correct.

Now, you're right, the tests in C# apparently don't care what the value assigned to "peanuts" actually is ... they haven't taken that knowledge and described it to the student in a way that incentivizes a powers of 2 solution as "more" correct than an index-based solution ... but that's fine, if the track doesn't hold a strong preference about the desired implementation. If they do then the simplest alternative is to provide a dummy file that has the appropriate

The usual preference is to prefer a bitwise operator / bitfield solution, but that's because this is one of the few exercises that affords that opportunity to teach that concept in a concise manner ... it's not a requirement though unless the track chooses to state that it's a requirement of their students.

So, to answer this point:

Since both solution passes all the test, either all both solutions are correct, or the tests need to be improved to guide the student to the version we prefer.

Yes ... both are equally correct, and if the C# track wants to move the burden of stating the track's preference from the mentor to the tests then, unless I'm missing something, the track has what it need to build those tests.

However I'd still tend to leave this in the mentor / mentee space, because both approaches are roughly equivalent, logically ... I mean you're either using bitwise operations to look up the value at a particular bit in bitfield or you're using indexing and arithmetic to look up the value at a particular index in an array. A student who solves this with an array lookup is correct, they're just not very efficient. Personally I think that's the point at which a mentor's advice will come in most helpful.

That said if you want to make a PR with your proposed new test(s) in it, that's fine, as I don't think it would break any other track's test generation; #1560 will block it from being merged for the moment, as it's not a bug fix, but we'd get it incorporated eventually.

ErikSchierboom commented 2 years ago

Being a C# maintainer, I think what we could do is to write an analyzer to check if the student uses bit values for the enums. I don't think this is a cross-track issue, which is why I think this issue can be closed. If anyone disagrees, feel free to re-open.