Open AlexGames73 opened 2 years ago
| Author: | AlexGames73 |
|---|---|
| Assignees: | - |
| Labels: | `api-suggestion`, `area-System.Console`, `untriaged` |
| Milestone: | - |
I suppose there could be an argument for dotnet including something like Java's Scanner class. But it shouldn't be specific to Console.
There can be a class parsing such values from any `TextReader`. It will serve like `fscanf` in C.
Is this useful outside of coding competitions? I wonder if this could just be a NuGet package.
Also worth noting that if something like this were taken, then `ReadBool`, `ReadInt`, `ReadLong`, `ReadFloat`, etc. are all incorrect names: https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/general-naming-conventions#avoiding-language-specific-names
As per the docs, the names are `Boolean`, `Int32`, `Int64`, `Single`, etc., all matching the name of the corresponding type in `System`.
As for reading the values, I would rather have them use the type's `TryParse` methods, so that if the input can't be decoded properly they can throw directly from the `Console` class itself.
I also agree with tanner, the names would have to be `ReadBoolean`, `ReadInt32`/`ReadUInt32`, `ReadInt64`/`ReadUInt64`, `ReadSingle`, etc.
I agree with you all that naming is an important thing, but unfortunately, class `Console` prohibits the use of UInt32 and UInt64 as the return type of methods.
Also, I think that a `TryParse` method (maybe a generic method) is not about reading data; it is about parsing something from a method parameter.
I was thinking more about creating methods `ReadBoolean`, `ReadInt32`, `ReadInt64`, `ReadSingle`, etc. (`ReadUInt32` and `ReadUInt64` only if the prohibition is lifted), or about creating a separate API like `Scanner` in Java.
I am pretty sure that UInt32 and UInt64 are OK anywhere they are needed (if there is no other way to represent something). For me, perhaps I want to directly read in some unsigned number someone inputted into the console which cannot be cast to a signed one, as it might overflow (which is sometimes not good at all).
Besides, if we are going to make this many changes to Console for additional Read methods, we might just as well go all the way with the types built into C#.
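To illustrate the overflow concern, here is a minimal sketch. The `UnsignedInput` class and its method names are hypothetical and not part of any proposal in this thread; it only shows a value that parses as `UInt32` but not as `Int32`, and what an unchecked cast would do to it.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration only.
public static class UnsignedInput
{
    // True when the digits fit in UInt32 but not in Int32
    // (NumberStyles.None accepts digits only, so "-1" is rejected by both).
    public static bool FitsOnlyUnsigned(string text) =>
        uint.TryParse(text, NumberStyles.None, CultureInfo.InvariantCulture, out _) &&
        !int.TryParse(text, NumberStyles.None, CultureInfo.InvariantCulture, out _);

    // Reinterprets the bits as a signed value; large inputs silently become negative.
    public static int WrapToInt32(uint value) => unchecked((int)value);
}
```

For example, `UnsignedInput.FitsOnlyUnsigned("4000000000")` is true, and `UnsignedInput.WrapToInt32(4000000000u)` yields a negative number, which is the kind of silent corruption a `ReadUInt32` method would avoid.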
Programming competition judge systems disallow any external references. A competition solution is usually just a single file of code. So it's useless to have this API in a NuGet package.
Of course, this shouldn't be limited to `Console`.
I think it's nice to have these APIs even if the only practical use case is competitions, because:
- `.ReadInt32()` is simpler to write and read than `int.Parse(Console.ReadLine().Split()[0])`.
- These APIs will also be useful when the input format is not strict about whitespace and line breaks.

> Programming competition judge systems disallows any external references
It is not the job or role of the BCL to appease limitations of programming competitions. Many languages (such as Rust) have very small standard libraries and it is the expectation that users pull in external dependencies even for things that some other languages consider "core". Even for the case of something like the math APIs for C/C++, some implementations (Unix) have it be an explicit separate reference (`libm`).
That doesn't mean it shouldn't or couldn't be included; it's just not something that we typically consider as a driving reason to do it.
> Many languages (such as Rust) have very small standard libraries and it is the expectation that users pull in external dependencies even for things that some other languages consider "core".
On the other hand, C++/Java/Go/Kotlin have APIs like this.
The BCL is already not so small; for example, it includes JSON parsing APIs. Plain-text parsing is as useful for competitions as JSON parsing is for web apps and microservices.
`System.Console` already exposes 89 public methods and properties. I don't believe that we should add more, especially since we might introduce a new `Terminal`-oriented abstraction for it (https://github.com/dotnet/runtime/issues/52374).
Another issue is delimiters. The provided sample implementation handles a space and a new line. But how about tab? How about other delimiters?
What would be the expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"?
> How about other delimiters?
For typical programming contest input, any whitespace char (`char.IsWhiteSpace`) is a good delimiter IMHO. I don't know about other use cases.
> expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"
`FormatException` if it is treated as Int32/Boolean/etc., and "123true456false" for `ReadToken` (naming from the original proposal).
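A minimal sketch of that behavior, assuming whitespace-only delimiters as suggested above. `TokenReader` is a hypothetical helper written for this comment, not the proposal's actual implementation; only the `ReadToken` name comes from the proposal.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

// Hypothetical helper: splits any TextReader into whitespace-delimited tokens.
public static class TokenReader
{
    // Skips leading whitespace (space, tab, newline, ...) and returns the next
    // run of non-whitespace characters, or null when the input is exhausted.
    // Note: the single delimiter that ends the token is consumed as well.
    public static string? ReadToken(TextReader reader)
    {
        int ch = reader.Read();
        while (ch != -1 && char.IsWhiteSpace((char)ch))
            ch = reader.Read();
        if (ch == -1)
            return null;

        var token = new StringBuilder();
        while (ch != -1 && !char.IsWhiteSpace((char)ch))
        {
            token.Append((char)ch);
            ch = reader.Read();
        }
        return token.ToString();
    }
}
```

With the input `"12\t34\n123true456false"`, successive calls return `12`, `34` and `123true456false`; passing that last token to `int.Parse` throws `FormatException`, which matches the behavior described above.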
For me, personally, it's so sad to see how students give up C# for C++ and Java in competitions only due to the lack of these APIs.
No, it's not possible to fix judge systems, because there are too many of them.
The only known workaround is using prewritten boilerplate code, but it's not always allowed at on-site contests (where all code must be written during the contest).
> What would be the expected output of trying to parse a line without whitespaces that contain data like this: "123true456false"?
This input is malformed, since even you would not be able to say which tokens are present here (123/true/456/false [Int32/Boolean/Int32/Boolean] or 123t/rue/456/false [String/String/Int64/Boolean]). Of course, we could make a method based on regular expressions, but it would be slower than usual...
By the way, this API would help companies test candidates for .NET developer positions, because automatic testing systems are less wasteful and more efficient than "interview tasks".
> this API will help companies to test hired employees to .NET Developer position
They could ask to write a method with a prewritten signature.
Oh we want the scanner class from Java https://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html. This shouldn't be tied to the console APIs.
PS: This is very useful in programming contests when reading input 😄 . FWIW it's still painful in C#.
C++
```cpp
int a, b;
cin >> a >> b;
```
Java
```java
Scanner scanner = new Scanner(System.in);
int a, b;
a = scanner.nextInt();
b = scanner.nextInt();
```
C#
```csharp
// Requires: using System.Linq;
string line = Console.ReadLine();
int[] vals = line.Split(' ').Select(int.Parse).ToArray();
int a = vals[0];
int b = vals[1];
```
The `Utf8Parser` already works like a "scanner". This probably just needs a `Utf16Parser` as well (CC. @GrabYourPitchforks)
> The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)
It's a really hard to use API that works over a buffer. We need something that works over a TextReader.
> The Utf8Parser already works like a "scanner". This probably just needs a Utf16Parser as well (CC. @GrabYourPitchforks)

Anyway, you have to read the whole line to use the `Utf8Parser` (or it is not so obvious).
> It's a really hard to use API that works over a buffer. We need something that works over a TextReader.
Console would be the same ;)
That's a question of extending the Utf8Parser and theoretical Utf16Parser to better support other streams or buffers rather than updating Console specifically.
I am not sure if such an API is really useful.
But if this is in consideration we should also add `TryRead*` methods similar to `int.TryParse`.
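A rough sketch of what such a pair could look like, written as extension methods over `TextReader`. The class name, the method names and the choice of invariant-culture parsing are all assumptions for illustration, not an agreed API shape.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical extension methods; names and semantics are illustrative only.
public static class TextReaderTryReadExtensions
{
    // Reads the next whitespace-delimited token and tries to parse it.
    // The token is consumed even when parsing fails.
    public static bool TryReadInt32(this TextReader reader, out int value)
    {
        string token = NextToken(reader);
        return int.TryParse(token, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
    }

    // Throwing counterpart built on top of the Try method.
    public static int ReadInt32(this TextReader reader) =>
        reader.TryReadInt32(out int value)
            ? value
            : throw new FormatException("The next token is not a valid Int32.");

    // Returns the next token, or an empty string at end of input.
    private static string NextToken(TextReader reader)
    {
        int ch = reader.Read();
        while (ch != -1 && char.IsWhiteSpace((char)ch))
            ch = reader.Read();

        var token = new StringBuilder();
        while (ch != -1 && !char.IsWhiteSpace((char)ch))
        {
            token.Append((char)ch);
            ch = reader.Read();
        }
        return token.ToString();
    }
}
```

One design question this sketch leaves open is whether a failed `TryReadInt32` should consume the offending token; here it does, so the caller can keep reading past bad input.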
This API is extremely useful and should be based on StreamReader
This API could also be implemented over the `ISpanParsable<TSelf>` interface.
As an example, I've draft-implemented both versions, based on `ISpanParsable<TSelf>` and `Utf8Parser`, in https://github.com/epeshk/epeshk.text
Maybe this will inspire someone for a better API proposal. Or at least it will be useful as copy-paste code.
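For readers who have not used `ISpanParsable<TSelf>`, here is a minimal sketch of the generic approach. It is not the code from the linked repository; `GenericScanner` and `FillToken` are hypothetical names, and the sketch assumes .NET 7 or later (static abstract interface members) with invariant-culture parsing.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical scanner: one generic Read<T> instead of ReadInt32/ReadInt64/...
public sealed class GenericScanner
{
    private readonly TextReader _reader;
    private char[] _buffer = new char[64];

    public GenericScanner(TextReader reader) => _reader = reader;

    // Parses the next whitespace-delimited token as T via T.Parse.
    public T Read<T>() where T : ISpanParsable<T>
    {
        int length = FillToken();
        if (length == 0)
            throw new EndOfStreamException("No more tokens.");
        ReadOnlySpan<char> token = _buffer.AsSpan(0, length);
        return T.Parse(token, CultureInfo.InvariantCulture);
    }

    // Copies the next token into the buffer (growing it if needed)
    // and returns the token length, or 0 at end of input.
    private int FillToken()
    {
        int ch = _reader.Read();
        while (ch != -1 && char.IsWhiteSpace((char)ch))
            ch = _reader.Read();

        int length = 0;
        while (ch != -1 && !char.IsWhiteSpace((char)ch))
        {
            if (length == _buffer.Length)
                Array.Resize(ref _buffer, _buffer.Length * 2);
            _buffer[length++] = (char)ch;
            ch = _reader.Read();
        }
        return length;
    }
}
```

Usage would look like `var scanner = new GenericScanner(Console.In); int n = scanner.Read<int>(); double x = scanner.Read<double>();`. The trade-off is that the type must be spelled at every call site, but no per-type method is needed.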
Hmm I think a generic terminal class that implements what console can do currently (and possibly add more to it) would be great.
It could also simplify the console class to just this as well:
```csharp
// Console becomes an alias to Terminal to prevent existing code from breaking.
public sealed class Console : Terminal // or whatever modifiers Console uses currently.
{
}
```
Note that the Java `Scanner` is based on regular expressions and does not perform well in simple cases (like just reading integers delimited by whitespace). Java's `StreamTokenizer` is more efficient, but has an ugly API.
For my use case, something as simple as this is sufficient: not top speed, but faster than `int.Parse(Console.ReadLine())` and zero allocation. And this code is simple enough to just rewrite from scratch when necessary, e.g. at a programming contest without internet access.
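```csharp
using System;
using System.IO;

public class TextScanner
{
    // Buffered reader over standard input.
    StreamReader input = new StreamReader(Console.OpenStandardInput(), bufferSize: 16384);
    // Reusable token buffer; tokens longer than 4096 chars are not handled.
    char[] buffer = new char[4096];

    public int ReadInt()
    {
        var length = PrepareToken();
        return int.Parse(buffer.AsSpan(0, length));
    }

    // Skips leading whitespace, copies the next token into the buffer,
    // and returns its length (0 at end of input).
    private int PrepareToken()
    {
        int length = 0;
        bool readStart = false;
        while (true)
        {
            int ch = input.Read();
            if (ch == -1)
                break;
            if (char.IsWhiteSpace((char)ch))
            {
                if (readStart) break;
                continue;
            }
            readStart = true;
            buffer[length++] = (char)ch;
        }
        return length;
    }
}
```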
One thing that seems to be overlooked here is culture-dependent parsing (for integers it is less of an issue). If only the 'common' subset needs to be supported ('+', '-' and digits for integers; '.' as the decimal separator and 'e' to separate the exponent for floating point; English-only but case-insensitive 'true|false' for Boolean), I would advise having optimized extension methods for StreamReader/TextReader and Pipes.
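As a sketch of that 'common' subset using only existing BCL parsers (the `InvariantTokenParser` class is a hypothetical name, and this is not the code from the repository linked below): the number styles are restricted explicitly and the invariant culture is passed so the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

// Hypothetical helper restricting parsing to the culture-independent subset.
public static class InvariantTokenParser
{
    // Optional '+' or '-' followed by digits only.
    public static int ParseInt32(ReadOnlySpan<char> token) =>
        int.Parse(token, NumberStyles.AllowLeadingSign, CultureInfo.InvariantCulture);

    // Sign, digits, '.' as decimal separator, and an 'e'/'E' exponent.
    public static double ParseDouble(ReadOnlySpan<char> token) =>
        double.Parse(
            token,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint | NumberStyles.AllowExponent,
            CultureInfo.InvariantCulture);

    // bool.Parse is already culture-independent and case-insensitive.
    public static bool ParseBoolean(ReadOnlySpan<char> token) => bool.Parse(token);
}
```

With these styles, a token such as `1,5` or `1 000` is rejected with `FormatException` regardless of the current culture, instead of being interpreted differently on differently configured machines.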
Did some implementation bits to toy around at: https://github.com/interlockledger/interlockledger-commons/blob/main/InterlockLedger.Commons/Extensions/System.IO/TextReaderExtensions.cs
Use as
```csharp
using System.IO;
int firstvalue = Console.In.ReadInt32();
int secondvalue = Console.In.ReadInt16();
```
or import statically https://github.com/interlockledger/interlockledger-commons/blob/main/InterlockLedger.Commons/Extensions/System/ConsoleExtras.cs
```csharp
using static System.ConsoleExtras;
int firstvalue = ReadInt32();
int secondvalue = ReadInt16();
```
Forgot to implement `ReadBoolean()`, maybe tomorrow.
Uses `System.Numerics.INumber`.
Background and motivation
Many C# enthusiasts like me use the Console class in programming competitions and run into 'TimeLimit' or 'OutOfMemory' verdicts. These are caused by very slow reading from the console, because Console reads the whole line into RAM (as in Python); this is the main problem with C# in programming competitions. I want to extend the Console class with 'read' methods (like 'cin' in C++), which would significantly speed up reading from the console and increase the popularity of the language in competitions.
API Proposal
API Usage
Alternative Designs
No response
Risks
No response