laravel / ideas

Issues board used for Laravel internals discussions.
938 stars 28 forks source link

Number Format F(1000) = 1k, F(1000000) = 1M ... #1293

Open TheoRelativity opened 6 years ago

TheoRelativity commented 6 years ago

Hello everyone! I didn't found on Laravel a function that converts integers to a shorter form like this one f(10000) = 10k. Online I only found functions that are very time-consuming. I am currently developing a social network and I had dealt with this problem, so, I developed a pretty good code to transform an integer to a more human-readable version. It is very fast and can handle very large numbers without having to deal with "if else conditions". More is larger the integer, faster is this function against to a more traditional function, like this, for instance.

Here the gist of my function intWithStyle

CODE UPDATED

<?php
/****
  @param int $n integer
  return string   if $n >  1000
  return type($n) if $n <= 1000
***/        

function intWithStyle($n)
{
  if ($n < 1000) return $n;
  $suffix = ['','k','M','G','T','P','E','Z','Y'];
  $power = floor(log($n, 1000));
  return round($n/(1000**$power),1,PHP_ROUND_HALF_EVEN).$suffix[$power];
};
Patryk27 commented 6 years ago

That's a pretty neat function - I don't think it should belong to the framework itself though; rather some 3rd-party number-related / string-related / formatting-related library.

Btw, shortenNumeral or something in that fashion might be a better-suited name :-)

TheoRelativity commented 6 years ago

@Patryk27 Thanks I will change its name soon! Did someone know any Laravel library where my function makes more sense? Thanks in advance :)

sisve commented 6 years ago

Since you've pasted the link to your code in other places, and stating things like "it is a better and faster version of the same function", I will intentionally look into the problems with this function...

One could argue that values 1000, 1001 and 999999 is already shorter than 1.000k, 1.0009999999999999k and 999.99900000000002k. Are you sure you're handling your floating point numbers correctly? Perhaps you should increase the precision. A good place to start is at The PHP floating point precision is wrong by default

$zeros = strlen($n)/3;
$power = is_float($zeros) ? (int) $zeros : $zeros - 1;

Here strlen(...) is a log10, and the division is to get the base-1000. The second line is just flooring the value. This can all be replaced with explicit call to log and floor.

$power = floor(log($n, 1000));

Looking at the call to round(); this takes a floating point number and returns a floating point number. You do not want a floating point number as a result, because, as shown earlier, even if you round them they still have precision and other fun things that doesn't work as you expect.

The proper function to use is number_format, which returns a string.

return number_format($n/(1000**$power),3).$suffix[$power];

You are saying stuff like "it is a better and faster version of the same function" in the linked gist. You make lots of claims, but when looking at the functions you call, it's obvious that your function does not work identically to the ones you claim to replace. Your code does not work with different floating point precisions, your code does not remove the zeros at the end when the other code does that, and so on.

TheoRelativity commented 6 years ago

@sisve Thanks for your reply :) I have to say I am not here to say I am better but I posted my code here to try to be helpful.

About your doubts, I have to reply: my function doesn't output something like 1.000k. If you want to try it you can do it using the function I made to test intWithStyle. Watch the entire code into the gist page :)

About this code $power = floor(log($n, 1000)); I already thought to use it but with this function, I am improving the speed too. This piece of code $zeros = strlen($n)/3; $power = is_float($zeros) ? (int) $zeros : $zeros - 1; I wrote is a bit faster because handles big numbers as little ones.

I already made a test against 2 codes and here the results:

Cycles: 999999
MIN: 100000
MAX: 999999999

A elaborated in 1.5912001132965 seconds
B elaborated in 1.6535999774933 seconds

The function A win for 0.062399864196777 seconds. 3.7735767444% faster.

You can try it using the speedTest function I made to test intWithStyle, always on the gist page of the function.

speedTest(999999,100000,999999999,$a,$b,10);

Where $b is

$b = function ($n)
{

if ($n < 1000) return $n;

$suffix = ['','k','M','G','T','P','E','Z','Y'];

$power = floor(log($n, 1000));

return round($n/(1000**$power),3).$suffix[$power];

};

About the
return number_format($n/(1000**$power),3).$suffix[$power];

is a bit slower than round and, about the decimal to show, this is a value that the user can change at any time. I prefer to show 3 decimals to have something like F(1450000) = 1.45M

You can test your values using this test(1,1450000,1450000,$a);

Miguel-Serejo commented 6 years ago

Your test code is flawed.

You want your test data to be constant and spread over all possible values. With your current test code, it's possible to have a test run that only generates numbers under 1000, which completely invalidates any results you may have gotten so far.

You also want to repeat the same test for the same data several times and average the results. I published a gist with a revised test, along with alternate implementations: https://gist.github.com/36864/3210563364367f29cf87bec5de49647d

On my machine, using floor(log()) is an obvious improvement over the strlen() approach, but number_format() is significantly slower than round().

Also note that your function simply doesn't work for numbers over 10**19:

>>> intWithStyle(10**18)
=> "1E"
>>> intWithStyle(10**19)
=> "10000000000000M"
>>> intWithStyle(10**20)
=> "1.0E+14M"
TheoRelativity commented 6 years ago

Thanks @36864 for your analysis and your gist :) I will insert a link to your test into my gist, replacing my flawed test with yours. About the incorrect results that function returns, I will try to fix it as soon as possible.

sisve commented 6 years ago
  1. Regarding strlen vs log; one is incorrect. Guess which one handles non-integers and values like 1000.00000001...
  2. Regarding round vs number_format; once again one is incorrect. Speed doesn't matter if it's not producing correct results.

Even if you think you've found some hacks that works in special cases, you still havn't documented those cases and the limitations of the function.

https://3v4l.org/CC0b2

TheoRelativity commented 6 years ago

Thanks @sisve! Your help has been very helpful. I didn't noticed that the function returns wrong values for various n even if you tried to say it to me before XD! I will try to fix the code as soon as I can. Thank you, again! ;)

TheoRelativity commented 6 years ago

@sisve @36864 Thanks to you both I improved the code. I think that now works properly.

Here the new code

function intWithStyle($n)
{
  if ($n < 1000) return $n;
  $suffix = ['','k','M','G','T','P','E','Z','Y'];
  $power = floor(log($n, 1000));
  return round($n/(1000**$power),1,PHP_ROUND_HALF_EVEN).$suffix[$power];
};

I made some tests and seems that everything is correct.

sisve commented 6 years ago
jmarcher commented 6 years ago

This seems like a pretty good exercise for TDD. I recommend you to try out doing this same functionality thru TDD and you will come with a much simpler solution and for sure without the problems pointed here.