dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.97k stars 4.03k forks source link

Proposal: Add polymorphic enumerated constants that can contain methods and support the strategy pattern #4436

Closed TyreeJackson closed 8 years ago

TyreeJackson commented 9 years ago

Problem

Switch statements based on enums can be sources of major issues and bugs when abused, which I've seen occur all to often. One system I was brought in to work on for one company had over 22000 switch statements with over 70,000 individual cases across them, mostly based on enums. Just one random enum I recall selecting from that system was used in over 35 different switch statements. When I checked to see if all of the enum's values had cases in all of the switch statements, I found that many were missing cases where there should have been some present. Everytime someone would introduce a new value to an enum and switch statement, things would break.

I recommended replacing all of the switch statements with a set of replacement types based on a generic base class (similar to one I recently called SubclassableEnum) that I use to create enum like functionality, that in addition to mapping different constant representations of the value, also encapsulates the strategy pattern. This solution has worked wonders in several places. However, writing the code to use such a class varies from somewhat clunky to very clunky (when implementing abstract SubclassableEnums derivatives) and is less than ideal. To give better parity on simplicity towards that of classical enums, language support would need to be added.

Proposal:

Add a new set of specialized System.EnumeratedStrategy<T1...TN> compile-time templates to support a new type of enum that represents multiple sets of constant values and that also provides an option to implement one or more interfaces.

For example:

public interface IMyEnum
{
    bool ContinueIfValid(object someArgument);
    void DoSomethingInteresting();
}

public enum MyEnum : EnumeratedStrategy<int, string>, IMyEnum
{
    Value1 = new MyEnum(1, "primary")
    {
        bool ContinueIfValid(object someArgument) { return false; }
        void DoSomethingInteresting() {}
    },
    Value2 = new MyEnum(2, "secondary")
    {
        bool ContinueIfValid(object someArgument) { return false; }
        void DoSomethingInteresting() {}
    },
    Value3 = new MyEnum(3, "tertiary")
    {
        bool ContinueIfValid(object someArgument) { return true; }
        void DoSomethingInteresting() { Value2.DoSomethingInteresting(); }
    } 
}

public class UseMyEnum
{
    public void WorkIt(MyEnum myEnum)
    {
        var gottenValue = new Object();

        if (myEnum.ContinueIfValid(gottenValue))
        {
            this.I_Take_An_Int(myEnum);
            this.I_Take_A_String(myEnum);
            this.I_Take_An_IMyEnum(myEnum);
        }

        myEnum.DoSomethingIntersting();
    }

    private void I_Take_An_Int(int someValue)
    {
    }

    private void I_Take_A_String(string someString)
    {
    }

    private void I_Take_An_IMyEnum(IMyEnum myEnum)
    {
    }

}

In the above example, IMyEnum defines an interface of methods that each enum is required to provide an implementation for.

The MyEnum is a specialized type of EnumStrategy<int, string> as indicated by the enum keyword in the type signature that informs the compiler that we are defining an enum that should specialize an EnumeratedStrategy template instead of a generic class. Each of the enumerated constants of the MyEnum enum represents two constant values of types int and string. MyEnum therefore may be passed in place of either an int or string. Additionally, MyEnum implements IMyEnum, so each enumerated constant must also provide an implementation to satisfy that interface and MyEnum may be passed in place of IMyEnum as well.

Since the MyEnum would be compiled down to represent constant values, the values of the type itself should behave like a normal enum as far as it's use as a constant is concerned. What this allows for then, in addition to being able to use the type anywhere a normal enum can be used, is chaining specialized types of EnumeratedStrategy<>.

For example:

public interface ITransportObject {}

public interface IMyAdapterEnum
{
    ITransportObject CreateTransportObject();
    Dictionary<String, Object> ReadValues(ITransportObject transportObject);
}

public enum MyAdapterEnum : EnumeratedStrategy<MyEnum>, IMyAdapterEnum
{
    Value1 = new MyAdapterEnum(MyEnum.Value1)
    {
        ITransportObject CreateTransportObject() { return null; }
        Dictionary<String, Object> ReadValues(ITransportObject transportObject) { return null; }
    },
    Value2 = new MyAdapterEnum(MyEnum.Value2)
    {
        ITransportObject CreateTransportObject() { return null; }
        Dictionary<String, Object> ReadValues(ITransportObject transportObject) { return null; }
    },
    Value3 = new MyAdapterEnum(MyEnum.Value3)
    {
        ITransportObject CreateTransportObject() { return null; }
        Dictionary<String, Object> ReadValues(ITransportObject transportObject) { return null; }
    } 
}

With this kind of support, it would be easier to implement strategy pattern based solutions that must map back to lists of constant keys or values and allow for requiring methods to be defined on enums in place of using switch statements. By doing this, you never have to worry about whether or not a new enum value being added to an enum will result in cases not being handled or missed.

See SubclassableEnum for a clunky, less capable version of what is requested in this proposal.

default0 commented 9 years ago

HaXe dealt with this by forcing switches over enums to be exhaustive. This forces people to add missing cases, which is, imho a much cleaner way to deal with bugs that crop up due to adding enum values and not checking for them in places. That being said, doing this for C# is obviously a breaking change, and cannot be done. However, something like a checked switch may serve the need (ie prevent bugs that crop up due to adding enum values) as well. An existing codebase having issues with that may simply convert all of their switch statements to checked switch statements (this can be automated by an analyzer easily - check if the switch expression resolves to an enum, if it does, convert to a checked switch). The semantics would be the same as that of switch, but it would be disallowed to add a default label, the expression of the switch would have to evaluate to an enum and all the values of the enum would need to be present as a case label, otherwise a compiler error occurs. Example:

enum A { Value1, Value2 }
void M(A a)
{
    checked switch(a)
    {
        case A.Value1:
            break;
        case A.Value2:
            break;
    }
}

If someone would modify the enum A like so:

enum A { Value1, Value2, Value3 }

The above switch would produce a compile time error, complaining that not all explicit enum values are present as a case label, thus preventing you from accidentally missing one of 300 switch statements you used that enum in.

Whether or not this is better/worse than the pattern you propose, I do not know, what I do know is that it is an alternative way to tackle the same source of bugs you are talking about, and is potentially much more lightweight than baking an entire pattern into the language.

Best Regards

TyreeJackson commented 9 years ago

@default0 I appreciate the alternative option, and I would certainly support and welcome a checked switch for those who feel that they must use an enum/switch combination.

However, this proposal also has the added advantages of supporting non-integral types (not just byte, sbyte, short, ushort, int, uint, long, or ulong) and also multiple concurrent values of different types. I feel that this still has merit. This is especially useful when an enum has two different but equal underlying meanings, for example an integer value in the database, but a string value, potentially with spaces, in the .net code, that is matched against values in some other source.

In any case, with or without the a language solution here, I'm still of the belief that switches of the likes I've seen make the code much harder to understand, and I strongly advocate against using them with large numbers of cases. Code with these kinds of switches are often not clean. Having an alternative strategy promoted by the language that also gives a mapping facility would be a nice to have at the very least.

HaloFour commented 9 years ago

This feels similar to Java's take on enums in version 5.0. There enum is basically syntactic candy for a convention of a class that only offers a specific number of named instances through static public fields. An enum in Java is forced to inherit from an Enum base class and as such cannot inherit from any other class, but it can provide its own methods and implement interfaces.

public interface Animal {
    String speak();
}

public enum Pet implements Animal {
    DOG("Otis") {
        public String speak() {
            return "woof!";
        }

        public Boolean likes(String food) {
            return "kibbles".equals(food);
        }
    },
    CAT("Milo") {
        public String speak() {
            return "meow";
        }

        public Boolean likes(String food) {
            return "fish".equals(food);
        }
    };

    private final String name;

    // The only instances allowed to be created are the instances listed above
    Pet(String name) {
        this.name = name;
    }

    // The enum can define and implement methods
    public String getPetName() {
        return this.name;
    }

    // The enum can define abstract methods to be implemented by each instance
    public abstract boolean likes(String food);
}

Pet pet = Pet.DOG;
Animal animal = pet;
System.out.println(animal.speak());
System.out.printf("The %s's name is %s.", pet.name(), pet.getPetName());
if (pet.likes("kibble")) {
    System.out.println("The pet likes kibble!");
}
TyreeJackson commented 9 years ago

@HaloFour Yes, the Java approach looks great. In fact, I modeled this proposal on a blend between it and the SubClassableEnum that I've been using in .Net for years in one form or another since around 2006. I believe that the Java version alone would work. The only thing that the generics add is type parameter information to provide some base class functionality to implicitly convert between the subclassed enum and the type arguments. Also, if the static dictionary properties [All, and AllValues], which I did not include in the proposal, from SubClassableEnum.cs were to be included then the type information would need to be declared to the base type somehow. That said, since this would all occur at compile time, the signature need not look like generic signatures.

Borrowing from your example, perhaps something like this would work:

public interface Animal {
    String speak();
}

public enum Pet : [int, string] implements Animal {  // <- has int and string as underlying types and implemented Animal interface
    DOG(1, "Otis") {
        public String speak() {
            return "woof!";
        }

        public Boolean likes(String food) {
            return "kibbles".equals(food);
        }
    },
    CAT(2, "Milo") {
        public String speak() {
            return "meow";
        }

        public Boolean likes(String food) {
            return "fish".equals(food);
        }
    };

    private final String name;

    // The only instances allowed to be created are the instances listed above
    Pet(int index, String name) : base(index, name) {
        this.name = name;
    }

    // The enum can define and implement methods
    public String getPetName() {
        return this.name;
    }

    // The enum can define abstract methods to be implemented by each instance
    public abstract boolean likes(String food);
}

private void doDinnerTime()
{
    Pet pet = Pet.DOG;
    Animal animal = pet;
    System.out.println(animal.speak());
    System.out.printf("The %s's name is %s.", pet.name(), pet.getPetName());
    System.out.printf("The %s's name is %s.", pet.name(), pet); // <- Pet implicitly casts to string
    if (pet.likes("kibble")) {
        System.out.println("The pet likes kibble!");
    }
    this.recordIndex(pet); // <- Pet implicitly casts to int
}

private void recordIndex(int petIndex) {}
HaloFour commented 9 years ago

@TyreeJackson

Wouldn't need special syntax for that, if Pet was just a normal class it could have static methods, including implicit conversion operator overloads.

Of course where we would need special syntax is in defining the class. enum is out as that means something else entirely, but perhaps something like a combination of the enum and class keywords together:

public enum class Pet : IAnimal {
    Dog(1, "Otis") {
        public string Speak() { return "woof!"; }
        public bool Likes(string food) { return string.Equals(food, "kibble"); }
    },
    Cat(2, "Milo") {
        public string Speak() { return "meow!"; }
        public bool Likes(string food) { return string.Equals(food, "fish"); }
    };

    Pet(int id, string name) {
        this.Id = id;
        this.Name = name;
    }

    public int Id { get; }
    public string Name { get; }

    public abstract bool Likes(string food);

    public static implicit operator int(Pet pet) {
        return pet.Id;
    }

    public static implicit operator string(Pet pet) {
        return pet.Name;
    }
}
TyreeJackson commented 9 years ago

@HaloFour While I agree that the implicit operators would work for a lot of cases, I'm concerned about cases where the enum like type would need to behave like a constant. For example, in attributes. The underlying type value would have to be resolved at compile time in order to work there. It is one of the known issues with my SubclassableEnum class that it cannot be used in any attribute or other place where a constant is required. That class uses implicit operators the exact same way as you have suggested. The type of expression contained within the implicit operator would have to be restricted to returning a constant in order for that to work I suspect and the compiler would essentially still have to do something other than emitting an implicit operator method.

gafter commented 8 years ago

https://www.google.com/patents/US7263687

sharwell commented 8 years ago

Prior art request: US7263687 (Object-oriented enumerated type facility)