dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.76k stars 3.99k forks source link

[Bug] Interpolated strings in attributes #9407

Closed p-e-timoshenko closed 8 years ago

p-e-timoshenko commented 8 years ago

Version Used: CS 6

Steps to Reproduce:

    class SomeClass
    {
        public const String FirstNameDef = "Pavel";

        [Required(ErrorMessage = "Field 'FirstName' is required.")]

        [Required(ErrorMessage = $"Field {nameof(FirstName)} is required.")] 
        // Error: An attribute argument must be a constant expression, 
        // typeof expression or array creation expression of an attribute parameter type

        [DefaultValue(FirstNameDef)]

        [DefaultValue($"{FirstNameDef}")]  
        // Error: An attribute argument must be a constant expression, 
        // typeof expression or array creation expression of an attribute parameter type

        public String FirstName {
            get { return _firstName; }
            set { _firstName = value; }
        }

        public String _firstName = $"{FirstNameDef}";
    }
AdamSpeight2008 commented 8 years ago

I don't think this is a bug, the string interpolation is seen as a function call. Wither it consists of constants only arguments, has no barring on how is treated by the compiler.

HaloFour commented 8 years ago

See #9212 and #194

In short, since interpolation involves the current culture which could be a custom ICustomFormatter even if your only interpolating constants the result is not predictable and must be evaluated at run time.

alrz commented 8 years ago

Followig those proposals, I think a culture independent form of string interpolation might be useful in these situations when a simple concat is enough.

markhurd commented 8 years ago

And that was the first method relating to string interpolation I added to my LinqPad extensions:

    // Interpolation invariant helper
    public static string Invariant(this IFormattable str)
    {
        return str.ToString(null, System.Globalization.CultureInfo.InvariantCulture);
    }

But it's still clearly not a constant expression.

p-e-timoshenko commented 8 years ago

I can't agree that this case isn't an unexpected behavior. The string interpolation syntax looks like strings. You expect similar behavior when you write "Field 'FirstName' is required." or $"Field 'FirstName' is required.".

I understand that the interpolated string $"..." converts to String.Format:

    static void Test_nameof(String name)
    {
        Console.WriteLine($"Test_nameof: {name}");
    }

    static void Test_CallerMemberName([CallerMemberName] String name = "")
    {
        Console.WriteLine($"{nameof(Test_CallerMemberName)}: {name}");
    }

The code converted by compiler to following code:

    static void Test_nameof(String name)
    {
        Console.WriteLine(string.Format("Test_nameof: {0}", name));
    }

    static void Test_CallerMemberName([CallerMemberName] String name = "")
    {
        Console.WriteLine(string.Format("{0}: {1}", "Test_CallerMemberName", name));
    }

I test it by ILSpy.

It may be added the following rules to string interpolation: 1) If there are no interpolation braces, it will be an ordinary string ($"abc" <=> "abc"). 2) If the constant expression contains nameof and other strings in interpolation braces, it's evaluated at compile time ($"Property {nameof(PropertyName)}" ==[during compilation]==> "Property PropertyName") 3) It should be present the short form of nameof operator in interpolation strings that the one will be evaluated by compiler during compilation.

$"'\@{MyNamespace.SomType}', '\@{SomType arg0}', '\@{arr[index]}'." 
//'MyNamespace.SomeType', 'SomType arg0', 'arr[index]'.
$"'\@{MyNamespace.{SomeType}}', '\@{{SomType} arg0}', '\@{{public }int {Prop}}'." 
//'SomeType', 'SomeType', 'public Prop'. The braces used for the selective conversion

There is no opportunity to transform part of a code to string by principle “as is” to track its further changes: “System.Math.PI”, “Type Property” or “Type Namespace.Class.Method(Type arg0, Type arg1, ...)”. Functional style of the operator is perhaps doubtful. I'd want to ask a question: what type of argument? But the question disappears, because already there are similar methods like “typeof”. In addition, after compilation the method-like operator is replaced by string. Perhaps, it would be more natural to expand functionality of the string interpolation for strict and literally transform parts of code to string with validating it at compile time.

 // It's similar preprocessor directive (#) for interpolation string. So It acts at compile time
$" The property name is '\#nameof{PropertyName}'."
// After compilation: " The property name is 'PropertyName'."
$" The constant expression evaluated at compile time is '\#eval{2+2}'." 
// After compilation: " The constant expression evaluated at compile time is '4'." 

$" The constant expression evaluated at compile time is \#lang{ru-RU}'\#eval{(2+2)/8}' \#lang{invariant}'\#eval{(2+2)/8}'\#lang{default}."
// After compilation: " The constant expression evaluated at compile time is '0,5' '0.5'."

$" The format provider may be changed by \#format{0.00, MyCustomProvider}{myFloatVariable + 0.5}\#format{}"
// After compilation: String.Format(" The format provider may be changed by {0}", (myFloatVariable + 0.5).ToString("0.00", MyCustomProvider))

By the way, if it would be added the #{} to the ordinal strings, then there is no need to write a dollar sign at the beginning of interpolation string:

"\#lang{ru-RU} '\#codeof{MyClass.myStaticFloatVariable + 0.5}' = \#{myStaticFloatVariable + 0.5}, \#eval{myConstantOf5 + 0.5}"
// After compilation: String.Format(new CultureInfo( "ru-RU" ), " 'MyClass.myStaticFloatVariable + 0.5' = {0}, 5,5", (myStaticFloatVariable + 0.5))
// or 
// String.Format(" 'MyClass.myStaticFloatVariable + 0.5' = {0}, 5,5", (myStaticFloatVariable + 0.5).ToString(new CultureInfo( "ru-RU" )))
// The compiler can optimize the CultureInfo instance places during compilation

In this case, the suggested behavior of the interpolated strings becomes more clear.

AdamSpeight2008 commented 8 years ago

@p-e-timoshenko
My analyser String Format Diagnostic is capable of informing the developer about this. Currently I'm in the initial stages of developing an updated date version. That uses a new implementation of the parser, that better handles malformed strings. The updated version will also include diagnostics covering interpolation strings.

jaredpar commented 8 years ago

Thanks for taking the time to report this issue. Unfortunately this behavior is by design. Interpolated strings involve function calls under the hood and hence are not constant values. This means they can't be included as attribute values or assigned to const local, field members.

alrz commented 8 years ago

@jaredpar Wouldn't it make sense to have a subset of string interpolation syntax to preserve constness of the string? For example $$"Method Name: {nameof(TestMethod)}" produces a constant string. Would come in handy for performance issues discussed at #9212.

gafter commented 8 years ago

@alrz that is not a constant string either. The string resulting from $$"Method Name: {nameof(TestMethod)}" can depend on the current culture, though it would be horrible for a culture to make it so.

HaloFour commented 8 years ago

@gafter

though it would be horrible for a culture to make it so.

That's insensitive to my Pig Latin heritage. :frowning:

Seriously, though, I do think that @alrz 's argument is that a double $$ would be some kind of strict interpolation mode, perhaps one where only const expressions are permitted. Although that doesn't seem like much of an improvement over just +.

alrz commented 8 years ago

@gafter but const string a = "TestMethod: " + nameof(TestMethod); is a constant ...

alrz commented 8 years ago

@HaloFour

Although that doesn't seem like much of an improvement over just +.

That is why we have string interpolation, which one do you prefer

"TestMethod: " + nameof(TestMethod) + " Whatever: " + whatever;

$"TestMethod: {nameof(TestMethod)} Whatever: {whatever}";

$$ could simply translate to + for interpolations and produce (possibly) constant strings.

In fact, I rarely use all the features of string interpolation like : and , I just want to merely "interpolate" most of the time. That's unfortunate to lose constness and introduce overhead for the sake of interpolation.

AdamSpeight2008 commented 8 years ago

If the target is const string then the interpolation string could enforce const-ness of the arguments. Then the result is a constant string rather than a string.

gafter commented 8 years ago

@AdamSpeight2008 The problem is that there is an implicit argument to string interpolation, which is the current culture. We cannot enforce that the current culture is a constant, because it isn't a constant.

tannergooding commented 8 years ago

@gafter, I believe he is implying that the where string x = $"TestMethod: {nameof(MyTestMethod)}" is converted to string x = string.Format(CultureInfo.CurrentCulture, "TestMethod: {0}", nameof(MyTestMethod)) (which is resolved at run time). We should have a new language feature so that string interpolation is pre-computed at compile time, so that string x = $$"TestMethod: {nameof(MyTestMethod)}"is converted to string x = "TestMethod: MyTestMethod" (which would , presumably, require the compiler to run string.Format based on the current system culture).

Edit -or, rather string x = $$"TestMethod: {nameof(MyTestMethod)}" is converted to string x = "TestMethod:" + nameof(MyTestMethod);, which would require the compiler to compute nothing.

tmat commented 8 years ago

The compiler can't assume anything about the implementation of string.Format, especially not that the platform it's running on is the same as the program it targets and thus the implementations of string.Format are the same.

alrz commented 8 years ago

I don't know what is the problem with translating interpolation to a bunch of string concatenation operators at the interpolation sites, in that case you may not be able to use culture dependant features like : etc. That's why I proposed another form (double-dollar) which preserves constness of the string.

HaloFour commented 8 years ago

@alrz Interpolation is always culture dependency because it always uses the current culture to format, even if you're only interpolating strings. Technically the culture could intercept and rewrite those strings to whatever value it wants, regardless of format specifier or other options. You'd need a new form of interpolation which is explicitly designed to concatenate or something of that nature. Or you'd need a built-in interpolation prefix, perhaps.

alrz commented 8 years ago

Still all the options above seem promising,

The thing is that if the user want a constant, he/she woudn't care what the culture is, and probably woudn't want to use culture dependant features. That's why you may be forced to use string concat in these cases for the sake of constness of the resultant string.

@tmat The compiler wouldn't assume anything about the implementation of string.Format at runtime, because as @tannergooding suggests, it will use the current system culture at the compile time and there are chances that you just want to merely use interpolation syntax to concat constant strings.

AdamSpeight2008 commented 8 years ago

Why can't the assumed culture at compile-time be CultureInfo.InvariantCulture?

gafter commented 8 years ago

@AdamSpeight2008 The defined semantics of an interpolated string is that they use the culture at runtime, not at compile-time.

p-e-timoshenko commented 8 years ago

@tannergooding "" - in the beginning there were ordinary string, $"", $$"", $$$"", ... )

@AdamSpeight2008 "The idea of this preprocessor directive is to enforce the culture that is to be used be default." Once more, It would be grateful to add such useful C++ preprocessor directive as following:

define MACRO_NAME ...

By the way, the C ++ language flew up for the last time and has surpassed C# (http://www.tiobe.com/tiobe_index)! I consider that it's desirable to avoid the growing of amount of preprocessor directives. The directives like bull in a china shop.

The ordinal strings is defined by quotes - "". The backslash means special character named escape sequence (https://msdn.microsoft.com/library/h21280bw(v=vs.110).aspx). Also there is s string @"" that supressed escape sequence. Since C#6 there is string interpolation - $"". It is similar the syntax sugar of System.String.Format method. It's a good idea. But the using of dollar sign isn't very well.

String.Format($"{0}", 1); // Output: 0
String.Format($"{{0}}", 1); // Output: 1
String.Format($"{{{{0}}}}", 1); // Output: {0}

Let's see what will be if the dollar sign is absent.

"..." // ordinary string
// After compilation: "..." 

"...\n\t..." // ordinary string with escape characters
// After compilation: "...\n\t..."

"...\#..." // Error: Unrecoginzed escape sequence
// In C ++ the hash symbol associated with preprocessor directives.
// Therefore, it can be added to strings as string preprocessor directives.
// \#simple-string-preprocessor-directive# or \#string-preprocessor-directive-with_argument{}
// RegEx: \\#([A-Za-z][A-Za-z0-9-_]*(?=#|{|[^A-Za-z0-9-_]))#?(\{([^{}]|(?R))*\})?

" The property name is '\#nameof{PropertyName}'."
// After compilation: " The property name is 'PropertyName'."
// Older versions of .NET will generate an error: Unrecoginzed escape sequence

" The property value is '\#eval{1f/2f + 1f/3f}'."
// After compilation: " The property value is '0.8333333'."

"\#culture{ru-RU} The property value is '\#eval{1f/2f + 1f/3f}'."
// After compilation: " The property value is '0,8333333'."

"The property value is '\#culture{ru-RU}\#eval{1f/2f + 1f/3f}'."
// After compilation: " The property value is '0,8333333'."
// the directive "culture" affects to the next directive only if it is written in succession \#dir1\#dir2
// The directive "eval" tries to convert to string at compile time.
// What couldn't be converted, it makes the argument of String.Format

"The property value is '\#culture{ru-RU}\#eval{1f/2f + 1f/3f  + someVar}'."
// After compilation: String.Format(" The property value is '{0}'.", (1f/2f + 1f/3f  + someVar).ToString(new CultureInfo("ru-RU")))

// You can forcibly to make value as a parameter of String.Format:
"The property value is '\#culture{ru-RU}\#insert{1f/2f + 1f/3f + someConst}'."
// After compilation: String.Format(" The property value is '{0}'.", (1f/2f + 1f/3f + someConst).ToString(new CultureInfo("ru-RU")))

"The algebraic expression '\#codeof{0.5f + 1f/3f}' equals '\#culture{ru-RU}\#eval{1f/2f + 1f/3f}'."
// After compilation: " The algebraic expression '0.5f + 1f/3f' equals '0,8333333'."
HaloFour commented 8 years ago

@p-e-timoshenko Sorry, the ship has sailed regarding the syntax and semantics for string interpolation. Those debates all played out on CodePlex. Notably the original syntax did not use a prefix and used a form of escape sequence within normal strings, and the consensus was that this was not a good idea.

p-e-timoshenko commented 8 years ago

@HaloFour https://roslyn.codeplex.com/discussions/570614 https://roslyn.codeplex.com/discussions/570292 https://roslyn.codeplex.com/discussions/577567 https://roslyn.codeplex.com/discussions/540869 (Q. Hole syntax?) https://roslyn.codeplex.com/discussions/546467

..."The only characters that require escaping are ", {, and } which are escaped by doubling Useful because \ is a common use case, and it makes the interpolated holes easier to read."...

Of course, the escaping braces ("{variable}") is difficult to read. For example "Hello {person.name}, you were born on {person.dob:D}. {{person.age} years ago!}". How difficult is it to read my examples?

HaloFour commented 8 years ago

Quite, but it doesn't matter because string interpolation already shipped, the syntax won't be subject to breaking change, and you'd have to demonstrate a really big reason why another form of interpolation is worth implementing.

alrz commented 8 years ago

@HaloFour

I think @jods4's argument in #9212 could be "a really big reason" to introduce that kind of interpolation in the language and it doesn't interfere with anything that already been shipped.

I feel sad that C# introduces clean, attractive new features that devs quickly adopt (and are sometimes suggested by refactoring tools such as Resharper) to later find out that in perf. critical code you should avoid them

Preserving constness is another "big reason" for doing this.

To not introduce another form I'd be ok with context-sensitive translation e.g. allowing interpolation for const targets and in attributes with translating it to string concatenation and not allow format specifiers in these contexts, but a new form could be useful in other places that you don't want to add overhead.

HaloFour commented 8 years ago

Possibly, but it's not like any of these limitations are a surprise. The issues of culture and performance were all already debated and C# 6 shipped with the result of those conversations. String interpolation was not designed to fit the scenarios described here, and that was intentional.

alrz commented 8 years ago

but it's not like any of these limitations are a surprise

I think this issue being filed as a "bug" is consequence of a surprise.

The issues of culture and performance were all already debated and C# 6 shipped with the result of those conversations

The culture dependance of string interpolation is totally understandable but inability to use it in other contexts when it makes sense is not. It's like await in catch, however it doesn't restrict anything in these contexts, but you must use constant strings in attributes already.

String interpolation was not designed to fit the scenarios described here

I'm saying that you should be able to change the "old style" to new idomatics transparently, for example

[Required(ErrorMessage = "Field 'FirstName' is required.")]
// is equivalent to
[Required(ErrorMessage = "Field '"  + nameof(FirstName) + "' is required.")] 

But suddenly

[Required(ErrorMessage = $"Field '{nameof(FirstName)}' is required.")] // ERROR!!!
HaloFour commented 8 years ago

I think this issue being filed as a "bug" is consequence of a surprise.

I meant by the design team, who have already closed this issue since this behavior is by design.

I'm not disagreeing (or agreeing) with these arguments, I'm only pointing out that they were already made while this feature was still being designed. The syntax and behavior changed several times due to community feedback. What shipped is the result from those collaborations same last year. I'd be pretty shocked if the team were willing to revisit those same arguments or to consider another syntax for interpolation. But weirder things have probably happened.