ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Vertical tab escape sequence considered not valid. #21564

Open jfalcon opened 1 month ago

jfalcon commented 1 month ago

Zig Version

0.13.0

Steps to Reproduce and Observed Behavior

  1. In an empty folder, run zig init.
  2. Edit the src/main.zig file to include the following: const whitespace = [_]u8{ ' ', '\t', '\v', '\n', '\r' };
  3. Run zig build on the project.

There's an error returned for '\v' saying: error: invalid escape character: 'v'. It's also not listed as valid in the spec.

Expected Behavior

The \v escape sequence should be considered valid, as it is in languages such as C and Python. It represents a vertical tab, which, like a carriage return, isn't used much today. But some applications still use it in place of a newline character, so it's worth including as a valid escape sequence alongside the horizontal tab.

nektro commented 1 month ago

given its infrequency, what benefit would this provide over using 0xb or 11 ?
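For reference, the workaround being suggested might look like the sketch below. It assumes the array from the report and replaces only the rejected escape with a hex escape; the `vertical_tab` name is illustrative and not part of the original issue.

```zig
const std = @import("std");

// Same array as in the report, with '\v' spelled as a hex escape,
// which Zig's character literal syntax accepts.
const whitespace = [_]u8{ ' ', '\t', '\x0b', '\n', '\r' };

pub fn main() void {
    // 0x0b and 11 are the same byte; a named constant keeps the intent readable.
    const vertical_tab: u8 = 0x0b;
    std.debug.print("{}\n", .{whitespace[2] == vertical_tab});
}
```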

Paul-Dempsey commented 1 month ago

re "some applications that still use this in place of a newline character" And other applications use another character - there are many options. Not really a compelling argument IMO. I suppose it's a violation of the principle of least surprise, but I don't mind the reminder that Zig isn't language-I-am-used-to.

jfalcon commented 1 month ago

> given its infrequency, what benefit would this provide over using 0xb or 11 ?

  1. Semantics consistent with the rest of the whitespace characters that use a mnemonic.
  2. Given C-compatibility is a concern, compatibility with C (and every other language).
  3. It's still a widely known whitespace escape character, so intentionally not supporting it seems more like an oversight than being streamlined.

jfalcon commented 1 month ago

> re "some applications that still use this in place of a newline character" And other applications use another character - there are many options. Not really a compelling argument IMO. I suppose it's a violation of the principle of least surprise, but I don't mind the reminder that Zig isn't language-I-am-used-to.

It is a compelling argument, given that just about every other language supports this escape sequence mnemonic because it is in fact still used. I gave an example of how, so pointing out that other apps use a different delimiter is not a compelling rebuttal when some apps do use this one.

That's like saying some people eat strawberry ice cream so we should get rid of vanilla, despite some people still eating vanilla.

So, please elaborate on why Zig should intentionally ignore something every other language supports?

mocompute commented 1 month ago

Honestly this feels like an innocuous oversight rather than an intentional omission. There is a simple workaround but it adds friction to people importing strings from other languages, eg https://en.cppreference.com/w/cpp/language/escape

jfalcon commented 1 month ago

> Honestly this feels like an innocuous oversight rather than an intentional omission. There is a simple workaround but it adds friction to people importing strings from other languages, eg https://en.cppreference.com/w/cpp/language/escape

Exactly what I was thinking. Totally an easy boo-boo to make.

wooster0 commented 1 month ago

ok what about \a, \b, \e, \f and all the other ones? Why add this one (and not the others too)? \e and \a are very useful for programs interfacing with the terminal. Which escape sequence is useful to you depends on what you're doing so I think either we add those too or leave the escape sequences simple and memorable as they are right now. In fact I wouldn't mind if \r gets removed

jfalcon commented 1 month ago

> ok what about \a, \b, \e, \f and all the other ones? Why add this one? \e and \a are very useful for programs interfacing with the terminal. Which escape sequence is useful to you depends on what you're doing so I think either we add those too or leave the escape sequences simple and memorable as they are right now. In fact I wouldn't mind if \r gets removed

I gave 3 reasons why it should be added. Please read the replies and address that, thank you.

As far as \r being removed, that's not the proper direction for a language. Just because OSes like Windows don't rely on it as much these days doesn't mean it's no longer used. Anyone who's ever worked with the HTTP protocol will know.

Also, we could talk about human nature, and how adoption rate correlates strongly with the amount of change adoption requires, but we run the risk of entering overthinking territory when the intangibles surface without the points made being addressed.

As stated, C interoperability is a design goal of Zig. About the only valid concern against it I could see would be if there were speed implications. I know Andrew has mentioned a willingness to make cuts in order to keep Zig fast, and I'm certainly not talented enough in this area to accurately determine whether this would have an impact. But I don't see how it would, given that it's so trivial to implement.

As far as which escape sequences to add, I think the ones C/C++ use make a good foundation, as mocompute alluded to. If this discussion turns into whether or not \e should be added, that's awesome and IMO a great evolutionary step. But for certain we can at least start with talking about the ones C supports.

andersen commented 1 month ago

> As far as which escape sequences to add, I think the ones C/C++ use make a good foundation.

That would mean adding \f, \b, \v, and (probably) \a (a C89 addition) before considering \e.

\f likely represents the most commonly used of these control characters and is also supported in, e.g., Java and JSON alongside \b. Is there an argument for adding \v without \f and \b or should the minimal version of this proposal be considered to include all those three?
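To make the candidate set concrete, here is a small sketch of the bytes involved, written with the hex escapes Zig already accepts. The `ctrl` namespace and its member names are hypothetical, chosen only for illustration; the values are the ASCII codes these escapes map to in C on ASCII-based platforms.

```zig
const std = @import("std");

// Hypothetical names, not part of Zig's syntax. Values are the ASCII codes
// these escapes map to in C on ASCII-based platforms.
const ctrl = struct {
    pub const bel: u8 = 0x07; // C '\a' (alert)
    pub const bs: u8 = 0x08; // C '\b' (backspace)
    pub const vt: u8 = 0x0b; // C '\v' (vertical tab)
    pub const ff: u8 = 0x0c; // C '\f' (form feed)
    pub const esc: u8 = 0x1b; // '\e' (a GCC/Clang extension, not standard C)
};

pub fn main() void {
    // Hex escapes also work inside string literals.
    const page_break = "page one\x0cpage two";
    std.debug.print("{}\n", .{page_break[8] == ctrl.ff});
}
```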

jfalcon commented 1 month ago

> Is there an argument for adding \v without \f and \b or should the minimal version of this proposal be considered to include all those three?

No argument for \v without the others. \v was just the first one I noticed. Great points about \f and \b being included as a minimum.