dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: Console: .ReadBool, .ReadInt, .ReadLong and so on #64621

Open AlexGames73 opened 2 years ago

AlexGames73 commented 2 years ago

Background and motivation

Many C# enthusiasts like me use the Console class in programming competitions and run into 'TimeLimit' or 'OutOfMemory' verdicts. The cause is very slow console input: the console reads the whole line into RAM (as in Python), and this is the main problem with C# in programming competitions. I want to extend the Console class with 'read' methods (like 'cin' in C++), which would significantly speed up reading from the console and increase the popularity of the language in competitions.

API Proposal

using System.Linq;
using System.Text;

namespace System
{
    public static class Console
    {
        public static string ReadToken(params char[] skipChars)
        {
            var hashSet = skipChars.ToHashSet();

            // Skip leading delimiters. In.Read() returns -1 at end of input,
            // so keep the value as an int until that has been checked.
            var c = In.Read();
            while (c != -1 && hashSet.Contains((char) c))
            {
                c = In.Read();
            }

            // Collect characters until the next delimiter or end of input.
            var sb = new StringBuilder();
            while (c != -1 && !hashSet.Contains((char) c))
            {
                sb.Append((char) c);
                c = In.Read();
            }

            return sb.ToString();
        }

        public static string ReadToken() => ReadToken(' ', '\n', '\r');
        public static bool ReadBool() => bool.Parse(ReadToken());
        public static decimal ReadDecimal() => decimal.Parse(ReadToken());
        public static double ReadDouble() => double.Parse(ReadToken());
        public static float ReadFloat() => float.Parse(ReadToken());
        public static int ReadInt() => int.Parse(ReadToken());
        public static long ReadLong() => long.Parse(ReadToken());
    }
}

API Usage

var n = Console.ReadLong();
var arr = new int[n];
for (var i = 0; i < n; i++)
{
    arr[i] = Console.ReadInt();
}

Alternative Designs

No response

Risks

No response

dotnet-issue-labeler[bot] commented 2 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

AlexGames73 commented 2 years ago

Tagging the area-System.Console team: @jeffhandley @adamsitnik @jozkee

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-console See info in area-owners.md if you want to be subscribed.

Issue Details
Author: AlexGames73
Assignees: -
Labels: `api-suggestion`, `area-System.Console`, `untriaged`
Milestone: -

Frassle commented 2 years ago

I suppose there could be an argument for dotnet including something like Java's Scanner class. But it shouldn't be specific to Console.

huoyaoyuan commented 2 years ago

There can be a class parsing such values from any TextReader. It will serve like fscanf in C.
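A minimal sketch of what such a class could look like. `TokenReader` is a hypothetical name, not an existing BCL type; treating any white-space character as the delimiter and throwing at end of input are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Usage: any TextReader works (Console.In, StringReader, StreamReader, ...).
var reader = new TokenReader(new StringReader("3 true\n42"));
int count = reader.ReadInt32();
bool flag = reader.ReadBoolean();
long big = reader.ReadInt64();
Console.WriteLine($"{count} {flag} {big}");

// Hypothetical helper, not part of the BCL.
public sealed class TokenReader
{
    private readonly TextReader _input;
    private readonly StringBuilder _token = new StringBuilder();

    public TokenReader(TextReader input) => _input = input;

    // Returns the next whitespace-delimited token, or null at end of input.
    public string? ReadToken()
    {
        int c = _input.Read();
        while (c != -1 && char.IsWhiteSpace((char)c))
            c = _input.Read();
        if (c == -1)
            return null;

        _token.Clear();
        while (c != -1 && !char.IsWhiteSpace((char)c))
        {
            _token.Append((char)c);
            c = _input.Read();
        }
        return _token.ToString();
    }

    public int ReadInt32() => int.Parse(RequireToken());
    public long ReadInt64() => long.Parse(RequireToken());
    public bool ReadBoolean() => bool.Parse(RequireToken());

    private string RequireToken() =>
        ReadToken() ?? throw new EndOfStreamException("No more tokens.");
}
```

Because the class only depends on `TextReader`, the same code can read from the console, a file, or a string in a unit test.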

madelson commented 2 years ago

Is this useful outside of coding competitions? I wonder if this could just be a NuGet package.

tannergooding commented 2 years ago

Also worth noting that if something like this were taken, then ReadBool, ReadInt, ReadLong, ReadFloat, etc. are all incorrect names: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/general-naming-conventions#avoiding-language-specific-names

As per the docs, the names are Boolean, Int32, Int64, Single, etc., all matching the name of the corresponding type in System.

AraHaan commented 2 years ago

As for reading the values, I would rather have them use the type's TryParse methods, so that if the input can't be decoded properly they can throw directly from the Console class itself.

I also agree with tanner: the names would have to be ReadBoolean, ReadInt32/ReadUInt32, ReadInt64/ReadUInt64, ReadSingle, etc.

AlexGames73 commented 2 years ago

I agree with you all that naming is important, but unfortunately the Console class prohibits UInt32 and UInt64 as method return types.

Also, I think TryParse methods (maybe a generic method) are not about reading data; they are about parsing something passed as a method parameter.

I was thinking more about creating methods ReadBoolean, ReadInt32, ReadInt64, ReadSingle, etc. (ReadUInt32 and ReadUInt64 only if the prohibition is lifted), or about creating a separate API like Scanner in Java.

AraHaan commented 2 years ago

I am pretty sure that UInt32 and UInt64 are OK anywhere they are needed (if there is no other way to represent something). For me, perhaps I want to directly read in some unsigned number someone entered into the console which cannot be cast to a signed one as it might overflow (which is sometimes not good at all).

Besides, if we are going to make this many changes to Console for additional Read methods, we might just as well go all the way and cover all the built-in C# types.

epeshk commented 2 years ago

Programming competition judge systems disallow any external references. A competition solution is usually just a single file of code, so it's useless to have this API in a NuGet package.

Of course, this shouldn't be limited to Console.

I think it's nice to have these APIs even if the only practical use case is competitions, because:

  1. It's not only about performance. .ReadInt32() is simpler to write and read than int.Parse(Console.ReadLine().Split()[0]). These APIs will also be useful when the input format is not strict about whitespace and line breaks.
  2. It will simplify participating in programming contests for .NET developers.
  3. It will help to promote .NET to competitive programmers.
  4. It will force developers of judge systems to upgrade from .NET Framework and Mono to modern versions of .NET and its languages (because participants will request support for these APIs).
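To illustrate the first point with APIs that exist today: splitting on a single space breaks as soon as the input contains repeated spaces, tabs, or line breaks, whereas `Split` with an empty separator array and `RemoveEmptyEntries` treats any run of white space as one delimiter. A small sketch; the sample input is made up.

```csharp
using System;
using System.Linq;

string input = "1  2\t3\n4";  // made-up input with mixed white space

// Split(' ') produces an empty entry for the doubled space and leaves
// "2\t3\n4" as a single piece; int.Parse would throw on both of those.
string[] naive = input.Split(' ');
Console.WriteLine(naive.Length);

// An empty (or null) separator array means "split on white space";
// RemoveEmptyEntries drops the empty strings between adjacent delimiters.
int[] values = input
    .Split(Array.Empty<char>(), StringSplitOptions.RemoveEmptyEntries)
    .Select(int.Parse)
    .ToArray();
Console.WriteLine(string.Join(",", values));
```

This still requires the whole input (or at least a whole line) to be in memory as a string, which is the cost the proposal wants to avoid.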

tannergooding commented 2 years ago

> Programming competition judge systems disallow any external references

It is not the job or role of the BCL to appease limitations of programming competitions. Many languages (such as Rust) have very small standard libraries and it is the expectation that users pull in external dependencies even for things that some other languages consider "core". Even for the case of something like the math APIs for C/C++, some implementations (Unix) have it be an explicit separate reference (libm).

That doesn't mean it shouldn't or couldn't be included; it's just not something that we typically consider as a driving reason to do it.

epeshk commented 2 years ago

> Many languages (such as Rust) have very small standard libraries and it is the expectation that users pull in external dependencies even for things that some other languages consider "core".

On the other hand, C++/Java/Go/Kotlin have APIs like this.

The BCL is already not so small; for example, it includes JSON parsing APIs. Plain-text parsing is as useful for competitions as JSON parsing is for web apps and microservices.

adamsitnik commented 2 years ago

System.Console already exposes 89 public methods and properties. I don't believe that we should add more, especially since we might introduce a new Terminal-oriented abstraction for it (https://github.com/dotnet/runtime/issues/52374).

Another issue is delimiters. The provided sample implementation handles a space and a new line. But how about tab? How about other delimiters?

What would be the expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"?

epeshk commented 2 years ago

> How about other delimiters?

For typical programming contest input, any whitespace char (char.IsWhiteSpace) is a good delimiter IMHO. I don't know about other use cases.

> expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"

FormatException if it is treated as Int32/Boolean/etc., and "123true456false" for ReadToken (naming from the original proposal).
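The behaviour described above can be sketched as follows. `ReadToken` here is a hypothetical local helper written for this example (it is not the proposal's implementation); it splits on any `char.IsWhiteSpace` character and leaves parsing to `int.Parse`.

```csharp
using System;
using System.IO;
using System.Text;

var input = new StringReader("12\t34\n123true456false");

Console.WriteLine(int.Parse(ReadToken(input)));  // token ended by a tab
Console.WriteLine(int.Parse(ReadToken(input)));  // token ended by a newline

// No white space inside, so the whole run comes back as one token.
string mixed = ReadToken(input);
try
{
    int.Parse(mixed);
}
catch (FormatException)
{
    Console.WriteLine($"FormatException for '{mixed}'");
}

// Hypothetical helper: next token delimited by any white-space character.
static string ReadToken(TextReader reader)
{
    int c;
    // Skip leading white space (stops at the first token char or at -1).
    while ((c = reader.Read()) != -1 && char.IsWhiteSpace((char)c)) { }

    var sb = new StringBuilder();
    while (c != -1 && !char.IsWhiteSpace((char)c))
    {
        sb.Append((char)c);
        c = reader.Read();
    }
    return sb.ToString();
}
```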

epeshk commented 2 years ago

For me, personally, it's so sad to see students give up C# for C++ and Java in competitions only because these APIs are missing.

No, it's not possible to fix judge systems, because there are too many of them.

The only known workaround is using prewritten boilerplate code, but it's not always allowed at on-site contests (where all code must be written during the contest).

AlexGames73 commented 2 years ago

> What would be the expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"?

This case is invalid, since even you cannot say which tokens are present here (123/true/456/false [Int32/Boolean/Int32/Boolean] or 123t/rue/456/false [String/String/Int64/Boolean]). Of course, we could write a method based on regular expressions, but it would be slower than usual...

AlexGames73 commented 2 years ago

By the way, this API will help companies test candidates for .NET developer positions, because automated testing systems are less wasteful and more efficient than "interview tasks".

epeshk commented 2 years ago

> this API will help companies to test hired employees to .NET Developer position

They could ask to write a method with a prewritten signature.

davidfowl commented 2 years ago

Oh we want the scanner class from Java https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html. This shouldn't be tied to the console APIs.

PS: This is very useful in programming contests when reading input 😄 . FWIW it's still painful in C#.

C++

int a, b;
cin >> a >> b;

Java

Scanner scanner = new Scanner(System.in);
int a, b;
a = scanner.nextInt();
b = scanner.nextInt();

C#

string line = Console.ReadLine();
int[] vals = line.Split(' ').Select(int.Parse).ToArray();
int a = vals[0];
int b = vals[1];

tannergooding commented 2 years ago

The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

davidfowl commented 2 years ago

> The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

It's a really hard-to-use API that works over a buffer. We need something that works over a TextReader.
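For context, a small sketch of what "works over a buffer" means in practice with `System.Buffers.Text.Utf8Parser`. The input string is made up; the point is that the caller has to hold the bytes in memory, track how many were consumed, and step over delimiters by hand.

```csharp
using System;
using System.Buffers.Text;
using System.Text;

// The input has to be available as UTF-8 bytes in a buffer first.
ReadOnlySpan<byte> buffer = Encoding.UTF8.GetBytes("12 34");

// Parse the first integer; 'consumed' reports how far the parser advanced.
if (!Utf8Parser.TryParse(buffer, out int first, out int consumed))
    throw new FormatException("Expected an integer.");
buffer = buffer.Slice(consumed);

// The caller advances past the delimiter before parsing the next value.
while (!buffer.IsEmpty && buffer[0] == (byte)' ')
    buffer = buffer.Slice(1);

if (!Utf8Parser.TryParse(buffer, out int second, out consumed))
    throw new FormatException("Expected an integer.");

Console.WriteLine(first + second);
```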

AlexGames73 commented 2 years ago

> The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

Either way, you have to read the whole line to use the Utf8Parser (or how to avoid that is not obvious).

tannergooding commented 2 years ago

> It's a really hard to use API that works over a buffer. We need something that works over a TextReader.

Console would be the same ;)

That's a question of extending the Utf8Parser and theoretical Utf16Parser to better support other streams or buffers rather than updating Console specifically.

deeprobin commented 2 years ago

I am not sure if such an API is really useful. But if this is under consideration, we should also add TryRead* methods similar to int.TryParse.
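A rough sketch of what a TryRead* method could look like. `TokenInput.TryReadInt32` is a hypothetical name and shape chosen to mirror `int.TryParse`; whether a failed read should consume the token is an open design question, and this sketch simply consumes it.

```csharp
using System;
using System.IO;
using System.Text;

var input = new StringReader("7 oops 9");

Console.WriteLine(TokenInput.TryReadInt32(input, out int a)); // valid integer
Console.WriteLine(TokenInput.TryReadInt32(input, out int b)); // "oops" is rejected
Console.WriteLine(TokenInput.TryReadInt32(input, out int c)); // reading continues

// Hypothetical helper, mirroring the int.TryParse pattern.
public static class TokenInput
{
    // Reads the next whitespace-delimited token and tries to parse it.
    // Returns false, without throwing, on bad input or at end of input.
    // Note: the token is consumed even when parsing fails.
    public static bool TryReadInt32(TextReader reader, out int value)
    {
        int ch;
        while ((ch = reader.Read()) != -1 && char.IsWhiteSpace((char)ch)) { }

        var token = new StringBuilder();
        while (ch != -1 && !char.IsWhiteSpace((char)ch))
        {
            token.Append((char)ch);
            ch = reader.Read();
        }
        return int.TryParse(token.ToString(), out value);
    }
}
```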

davidfowl commented 2 years ago

This API is extremely useful and should be based on StreamReader

epeshk commented 1 year ago

This API could also be implemented over the ISpanParsable<TSelf> interface.

As an example, I've draft-implemented both versions, based on ISpanParsable<TSelf> and Utf8Parser, in https://github.com/epeshk/epeshk.text

Maybe this will inspire someone to write a better API proposal. Or at least it will be useful as copy-paste code.
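To show the general idea (this is not the code from the linked repository), here is a minimal sketch of a generic reader built on `ISpanParsable<TSelf>`. `SpanScanner` is a hypothetical name; the use of the invariant culture and white-space delimiters are assumptions. It needs .NET 7 / C# 11 or later for static abstract interface members.

```csharp
using System;
using System.Globalization;
using System.IO;

var scanner = new SpanScanner(new StringReader("42 2.5\n9000000000"));
int n = scanner.Read<int>();
double x = scanner.Read<double>();
long big = scanner.Read<long>();
Console.WriteLine($"{n} {x.ToString(CultureInfo.InvariantCulture)} {big}");

// Hypothetical helper: one generic method instead of one method per type.
public sealed class SpanScanner
{
    private readonly TextReader _input;
    private char[] _buffer = new char[64];   // reused between calls

    public SpanScanner(TextReader input) => _input = input;

    public T Read<T>() where T : ISpanParsable<T>
    {
        int c;
        // Skip leading white space.
        while ((c = _input.Read()) != -1 && char.IsWhiteSpace((char)c)) { }

        int length = 0;
        while (c != -1 && !char.IsWhiteSpace((char)c))
        {
            if (length == _buffer.Length)
                Array.Resize(ref _buffer, _buffer.Length * 2);
            _buffer[length++] = (char)c;
            c = _input.Read();
        }

        // Parse straight from the char buffer, without creating a string.
        return T.Parse(_buffer.AsSpan(0, length), CultureInfo.InvariantCulture);
    }
}
```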

AraHaan commented 1 year ago

Hmm I think a generic terminal class that implements what console can do currently (and possibly add more to it) would be great.

It could also simplify the console class to just this as well:

// console becomes an alias to terminal to prevent existing code from breaking.
public sealed class Console : Terminal // or whatever modifiers console uses currently.
{
}

epeshk commented 1 year ago

Note that the Java Scanner is based on regular expressions and does not perform well on simple cases (like just reading integers delimited by whitespace). Java's StreamTokenizer is more efficient, but has an ugly API.

For my use case, something as simple as this is sufficient: not top speed, but faster than int.Parse(Console.ReadLine()) and zero-allocation. And this code is simple enough to just rewrite from scratch when necessary, e.g. at a programming contest without internet access.

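using System;
using System.IO;

public class TextScanner
{
  // Buffered reader over stdin, bypassing Console.ReadLine.
  StreamReader input = new StreamReader(Console.OpenStandardInput(), bufferSize: 16384);
  // Reused for every token, so reading a value allocates nothing.
  // Assumes no token is longer than 4096 characters.
  char[] buffer = new char[4096];

  public int ReadInt()
  {
    var length = PrepareToken();
    return int.Parse(buffer.AsSpan(0, length));
  }

  // Copies the next whitespace-delimited token into buffer and returns its
  // length (0 if the input ends before a token starts).
  private int PrepareToken()
  {
    int length = 0;
    bool readStart = false;
    while (true)
    {
      int ch = input.Read();
      if (ch == -1)
        break;

      if (char.IsWhiteSpace((char)ch))
      {
        // White space after the token ends it; before the token it is skipped.
        if (readStart) break;
        continue;
      }

      readStart = true;
      buffer[length++] = (char)ch;
    }

    return length;
  }
}

monoman commented 1 year ago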

One thing that seems to be overlooked here is culture-dependent parsing (for integers it is less of an issue). If only the 'common' subset is supported ('+', '-' and digits for integers, '.' as decimal separator and 'e' to separate exponents for floating point, English-only but case-insensitive 'true|false' for boolean), I would advise having optimized extension methods for StreamReader/TextReader and Pipes.
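A short illustration of why the culture matters for floating-point input. The `european` format below is built by hand for this example (so it does not depend on which cultures are installed); it stands in for cultures that use ',' as the decimal separator and '.' as the group separator.

```csharp
using System;
using System.Globalization;

// Invariant culture: '.' is the decimal separator.
double a = double.Parse("1.5", CultureInfo.InvariantCulture);

// Hand-built format: ',' is the decimal separator, '.' separates digit groups.
var european = new NumberFormatInfo
{
    NumberDecimalSeparator = ",",
    NumberGroupSeparator = "."
};
double b = double.Parse("1,5", european);

// Same text as the first call, different format: the '.' is now read as a
// group separator, so the result is not one and a half.
double c = double.Parse("1.5", european);

Console.WriteLine(a.ToString(CultureInfo.InvariantCulture));
Console.WriteLine(b.ToString(CultureInfo.InvariantCulture));
Console.WriteLine(c.ToString(CultureInfo.InvariantCulture));
```

Parsing with an explicit provider such as `CultureInfo.InvariantCulture` is one way to get the fixed 'common' subset regardless of the machine's current culture.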

monoman commented 1 year ago

Did some implementation bits to toy around at: https://github.com/interlockledger/interlockledger-commons/blob/main/InterlockLedger.Commons/Extensions/System.IO/TextReaderExtensions.cs

Use as

using System.IO;

int firstvalue = Console.In.ReadInt32();
int secondvalue = Console.In.ReadInt16();

or import statically https://github.com/interlockledger/interlockledger-commons/blob/main/InterlockLedger.Commons/Extensions/System/ConsoleExtras.cs

using static System.ConsoleExtras;

int firstvalue = ReadInt32();
int secondvalue = ReadInt16();

Forgot to implement ReadBoolean(), maybe tomorrow

Uses System.Numerics.INumber so only works in C# 11/.NET 7.0