Add file size constants (KB, MB, ...)

GSPP commented 2 years ago

It is rather common to have to express a byte count in terms of one of the common units. To make the code more readable we could add some simple helpers to the framework.

public static class BytesPer
{
    public const long KiB = 1024;
    public const long MiB = KiB * 1024;
    public const long GiB = MiB * 1024;
    public const long TiB = GiB * 1024;
    public const long PiB = TiB * 1024;
    public const long EiB = PiB * 1024;
}

long fileSize = 5 * BytesPer.KiB;

public static class Bytes
{
    public static long KiB(long value);
    public static long KiB(double value);
    //...
}

long fileSize = Bytes.KiB(5.7);

I find 10 * BytesPer.MiB or Bytes.MiB(10) a lot better than any of these:

10485760
10 * 1048576
10 * 1024 * 1024
10 * (1 << 20)

In my opinion, these ways of writing numbers are a crutch that we have to use in the absence of something clearer.

Implementation considerations:

The Bytes methods validate the argument, rejecting negative values, special float values (NaN, infinities), and results that would overflow long.MaxValue.
Float values are rounded to long using AwayFromZero (definitely don't want the default banker's rounding for file sizes).
The long overloads are not redundant with the double overloads because they guarantee an exact result.
I chose double rather than float as the floating-point type because the slight performance loss does not seem to matter here compared to the added precision.
There could be an analyzer detecting likely beneficial usages. Such usages would be characterized by feeding into a byte-taking API and using a constant that would look nice when expressed with these helpers (e.g. fileStream.SetLength(1024)).
It was suggested that the correct spelling would be KiB. That's probably correct, so I updated this proposal.
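
The considerations above could be sketched roughly as follows. This is only an illustration of the proposed behavior, not an existing .NET API: the `Bytes` class comes from the proposal, while the method bodies, the private `BytesPerKiB` constant, and the exception types are assumptions.

```csharp
using System;

// Hypothetical sketch of the proposed helpers; not an existing .NET API.
public static class Bytes
{
    private const long BytesPerKiB = 1024;

    // Integer overload: exact result, throws on negative input or overflow.
    public static long KiB(long value)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "Value must be non-negative.");

        // checked() raises OverflowException if the product exceeds long.MaxValue.
        return checked(value * BytesPerKiB);
    }

    // Floating-point overload: rejects NaN, infinities and negatives,
    // then rounds midpoints away from zero instead of to the nearest even number.
    public static long KiB(double value)
    {
        if (double.IsNaN(value) || double.IsInfinity(value) || value < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "Value must be a finite, non-negative number.");

        double bytes = Math.Round(value * BytesPerKiB, MidpointRounding.AwayFromZero);

        // (double)long.MaxValue rounds up to 2^63, which does not fit in a long, hence >=.
        if (bytes >= (double)long.MaxValue)
            throw new OverflowException("Result exceeds long.MaxValue.");

        return (long)bytes;
    }
}
```

With this sketch, `Bytes.KiB(5)` yields 5120 and `Bytes.KiB(5.7)` yields 5837 (5836.8 rounded). An input whose byte count lands exactly on a midpoint, such as `2.5 / 1024`, yields 3, whereas banker's rounding would give 2.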

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-numerics. See info in area-owners.md if you want to be subscribed.

Author:	GSPP
Assignees:	-
Labels:	`area-System.Numerics`, `untriaged`
Milestone:	-

Symbai commented 2 years ago

There are already very short snippets for this, on Stack Overflow for example, and there are NuGet packages as well. The use case seems relatively small, and if it's required, you can easily integrate it yourself right now. The only time I ever needed something like this was when displaying file sizes to end users. In code I actually find numbers more readable than words, e.g. fileStream.SetLength(10 * 1024) vs fileStream.SetLength(Bytes.KB(10)). If an analyzer suggested that I replace a number with words and brackets, it would be the very first thing I turned off.

teo-tsirpanis commented 2 years ago

And even the definition of a KB is controversial.

hopperpl commented 2 years ago

I would consider this wrong.

kB is a factor of 1000; KiB is a factor of 1024. Also, the k for kilo is lowercase, while the K for kibi (kilo binary) is uppercase.

That is just the way the SI prefix is defined. And it is important. 100 Mbit/s is 100 * 1000 * 1000, and not 100 * 1024 * 1024.

It is precisely defined as such in ISO/IEC 80000-13.
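
To make the distinction concrete, here is a small sketch that puts the decimal and binary factors side by side. The class names `DecimalBytesPer` and `BinaryBytesPer` are hypothetical and chosen only for this illustration; they are not part of the proposal or of .NET.

```csharp
// Hypothetical names, for illustration only.
public static class DecimalBytesPer
{
    public const long kB = 1000;        // kilo, lowercase k
    public const long MB = kB * 1000;   // 1_000_000
    public const long GB = MB * 1000;   // 1_000_000_000
}

public static class BinaryBytesPer
{
    public const long KiB = 1024;       // kibi, uppercase K
    public const long MiB = KiB * 1024; // 1_048_576
    public const long GiB = MiB * 1024; // 1_073_741_824
}
```

Under these definitions `100 * DecimalBytesPer.MB` is 100,000,000 while `100 * BinaryBytesPer.MiB` is 104,857,600, so mixing up the two prefixes introduces an error of roughly 5% at this scale and more for larger units.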

On another note, I would prefer the language to add support for suffixes in general, using a suffix operator that would allow writing 10 KiB or 100 kg or 1 m or 1 s.

This could in the future allow physics in the language by applying other operators, e.g. 1 m / 1 s becomes an object of type velocity, because the suffix m creates an object of type distance, s creates an object of type time, and there is a division operator between the two that creates an object of type velocity.

But that's a topic for another time.
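
The suffix syntax itself does not exist in C#, but the typed-quantity half of the idea can be approximated today with operator overloading. The following is a minimal sketch; the `Distance`, `Time`, and `Velocity` types are hypothetical and exist only for this example.

```csharp
// Hypothetical quantity types; C# has no suffix operator such as `1 m`.
public readonly struct Time
{
    public double Seconds { get; }
    public Time(double seconds) => Seconds = seconds;
}

public readonly struct Velocity
{
    public double MetersPerSecond { get; }
    public Velocity(double metersPerSecond) => MetersPerSecond = metersPerSecond;
}

public readonly struct Distance
{
    public double Meters { get; }
    public Distance(double meters) => Meters = meters;

    // Dividing a distance by a time produces a velocity, checked at compile time.
    public static Velocity operator /(Distance d, Time t)
        => new Velocity(d.Meters / t.Seconds);
}
```

With these types, `new Distance(10) / new Time(2)` has the static type `Velocity` and a `MetersPerSecond` of 5, while an expression such as `new Time(2) / new Distance(10)` fails to compile because no such operator is defined.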

GSPP commented 2 years ago

Then let's call it KiB. That's probably the correct way of doing it.

In computer programming, I have never seen the constant 1000 used. The only case I know of is using it for hard disk capacities (where it's used as a marketing scheme). I'm sure other use cases exist, but programming is naturally aligned on powers of two for all kinds of reasons. For example, anything based on filling entire disk sectors or memory pages.

The proposal could be reduced to just the BytesPer constants. It's a small addition to the framework with little risk. Developers love such little utilities.

GSPP commented 2 years ago

Here's an interesting sample of byte size constants in the BCL:

https://github.com/dotnet/runtime/search?q=1024&type= (876 results)
https://github.com/dotnet/runtime/search?q=1048576 (35 results)
https://github.com/dotnet/runtime/search?q=1000 (0 results)

paule96 commented 2 years ago

I have also often stumbled over this case, so I would really like to see something like this. I think the PowerShell team would also benefit from it, because they have implemented a conversion between the different size formats.

dotnet / runtime

Add file size constants (KB, MB, ...) #63256