shopsys / shopsys

Main repository for maintaining Shopsys Platform packages. Open for ISSUES and PULL REQUESTS.
https://www.shopsys.com
Other
331 stars 91 forks source link

Choosing a type for price representation and calculation #745

Closed PetrHeinz closed 5 years ago

PetrHeinz commented 5 years ago

What is happening

At the moment, we are using a mix of string, int and float types when dealing with monetary values. This leads to problems with type-hinting (see #683 and #288), comparing two values (eg. 0.1 + 0.2 === 0.3 produces false in PHP) and rounding errors, which may get out of hand when a lot of different calculations are compounded (eg. applying a lot of coefficients to a price).

This may lead to prices being off by a cent or to hard-to-debug issues.

I have found 6 different approaches that we could take and I've described them below with their pros and cons. I'd like to hear your opinion on what should get implemented.

1) We will just use floats. 2) We will cast numbers to strings when passing the values to another method. 3) We will implement a custom value-object. 4) We will use a library for Decimal type. 5) We will use a PHP extension for Decimal type. 6) We will use the Money pattern (using a library).

The options are described below in separate comments - please use reactions :+1: and :-1: to express your opinion and add a comment to share your thoughts. Thanks!

We would like to come to a final decision on Thursday 24th, so we have a week to discuss this.

If you find another possible approach you'd like to propose, feel free to add one in a separate comment, preferably in the same format as one of the six. :slightly_smiling_face:

Expected result

Decision is made about how we will represent and calculate prices (at least in the first release after we leave beta). After the changes are implemented, type-hints are consistent (will close #683, #288). Users will have a documented way to upgrade to this state.

Moreover, all no non-strict comparisons like these will be replaced by their strict counterparts:

https://github.com/shopsys/shopsys/blob/2abf7e410a322f97a909f5c4fc03728bd49e64d5/packages/framework/src/Model/Cart/Watcher/CartWatcher.php#L52 https://github.com/shopsys/shopsys/blob/2abf7e410a322f97a909f5c4fc03728bd49e64d5/packages/framework/src/Model/Order/Watcher/TransportAndPaymentWatcher.php#L117 https://github.com/shopsys/shopsys/blob/2abf7e410a322f97a909f5c4fc03728bd49e64d5/packages/framework/src/Twig/PriceExtension.php#L155 https://github.com/shopsys/shopsys/blob/2abf7e410a322f97a909f5c4fc03728bd49e64d5/packages/framework/src/Twig/PriceExtension.php#L170

PetrHeinz commented 5 years ago

1. We will just use floats.

We could use the float type for non-integer numbers. Floats can be type-hinted without any problems. PHP would automatically cast any passed integer to a float (even with strict_types=1).

Using floats might lead to inaccuracies, especially while comparing two values (see Floating point precision in PHP docs). We could come up with some helper functions to address these issues, but it will be always just approximations and hard-to-debug problems may arise.

According to best practices, usage of floats is not a good match for e-commerce applications, where precise calculations of prices matter a lot.

Pros: simple, native PHP solution Cons: hard to debug, rounding errors, against best practices

PetrHeinz commented 5 years ago

2. We will cast numbers to strings when passing the values to another method.

PHP can represent decimal numbers accurately using strings. It also contains a mechanism that rounds floats when cast to string.

Some of the issues found in using floats may be solved by casting all numbers to string when leaving current scope of the method (effectively banning the usage of float as a type hint). This approach would be still amazingly simple and if some method would require more precise computations, it could use eg. BCMath extension.

Even though PHP uses a single format when directly converting a float to string, many alternative strings can be found that have the same numerical value (eg. '10', '10.0' and '1e1'), which may lead to unexpected problems.

Pros: simple, native PHP solution, easier to debug than 1) Cons: problems may arise when using different formats, still uses floats for calculations which may lead to rounding errors, against best practices

PetrHeinz commented 5 years ago

3. We will implement a custom value-object.

We might implement our own implementation of a value object encapsulating a numerical values and offering some behavior (eg. multiplication, addition, rounding).

Pros: flexibility Cons: reinventing the wheel, decreased performance, complicated to implement

PetrHeinz commented 5 years ago

4. We will use a library for Decimal type.

There are already some libraries for arbitrary-precision decimal arithmetic so we don't have to reinvent the wheel. The most used one is probably litipk/php-bignumbers. It offers a class Decimal which can represent a decimal number precise to any number of decimal digits (the precision must be specified during instantiation).

The class can be of course be type-hinted and it is not hard to implement a Doctrine type so Entities might have all their DECIMAL fields mapped to a Decimal object. We would have to rewrite big chunks of price calculation to use class methods instead of operators (eg. $a->add($b) instead of $a + $b).

During prototyping I've found out that using this library comes with a price - calculations are about 100× slower. Framework is not performing a ton of price calculation, so it might not be such an issue.

We could implement our own wrapper for reducing available methods (there is no need for square roots in e-commerce) or renaming them (eg. multiply instead of mul). This would provide some flexibility of 3) for a price of further decreasing peformance, although the additional overhead is not that big.

Pros: arbitrary decimal precision Cons: decreased performance

PetrHeinz commented 5 years ago

5. We will use a PHP extension for Decimal type.

This is basically the same solution as in 4), but using a PHP extension instead of a library. Using the extension would provide significant performance boost in addition to allowing the usage of arithmetic operators and casting (eg $a + $b or (float)$decimal). Because of this it would require less changes in code, on the other hand it would require changes to our Docker images to include the extension and would further complicate native installation.

Type-hinting and creating the Decimal instances (along with the need to specify a precision) would be basically the same as in 4).

EDIT: Unfortunately, the extension is quite new, being released for the first time in 2018-10-28 and it's not widely adopted (see statistics).

Pros: same as 4) + easier usage, better performance, less changes in code Cons: requires installation of a PECL extension and an additional library (libmpdec 2.4+)

PetrHeinz commented 5 years ago

6. We will use the Money pattern (using a library).

Arguably the cleanest way to approach money is the Money pattern, which internally uses integer arithmetic over the smallest possible amount of money (eg. a cent). It also checks whether the prices involved in the calculation are of the same currency, provides methods for international formatting, implementation of exchange rates, solution to the allocation problem etc.

The library moneyphp/money is quite mature and well received among the PHP community.

The most pressing issue with Money integration is that our domain object Price is currency-independent, which would require a lot of BC-breaking changes across the application.

Pros: precise price calculations, solution tailored to e-commerce, additional methods for working with money (eg. allocate) Cons: complicated to integrate, huge BC-break

PetrHeinz commented 5 years ago

Personally, I'd like to implement the Money pattern (6), but it seems like a massive change, that would be hard to implement and hard to upgrade to. Being at the end of the beta phase, I would rather avoid such a BC-break.

Using floats (1) or strings (2) seem like the easiest options, but they do have big problems with accuracy and comparisons. Using them in e-commerce projects for handling monetary values might lead to unexpected issues. Even though I can say I'd prefer using strings, I would rather choose some other option.

From my experience I'd like to avoid reinventing the wheel by custom implementation (3), as it would lead to much effort being spent on already solved issues. I'd rather use some kind of a standard solution.

This train of thought leaves us with 2 options - either use a Decimal class using the litipk/php-bignumbers library (4) or the php-decimal extension (5). Because of the easier usage, less changes to codebase and the fact that we can add and configure a PHP extension pretty easily in Docker images, I'd prefer the option 5.

LukasHeinz commented 5 years ago

We will just use floats. We are doing SSFW for big e-commerce projects and preciseish in pricing is not good enough. We will cast numbers to strings when passing the values to another method. Same as above - can cause problem in crucial part of SSFW. We will implement a custom value-object. Please do not reinvent the wheel We will use a library for Decimal type. Kind of easy solution using quite reliable library (105 stars, 200k installs) We will use a PHP extension for Decimal type. The extention is here for 3 months and it was downloaded 310 times. I do not belive this much. We will use the Money pattern (using a library). Most difficult but seems that will bring best solution. It is operational one year and it has 2M downloads. Can handle bitcoin and other crazy edge cases.

For me only viable option if we want to resolve situation about prices is usage of Decimal library or Money pattern library. And i would prefer Money pattern

pk16011990 commented 5 years ago

This issue is only about price, but I have more general problems with floating point. For example stock quantities in decimal numbers (it is not implemented in SSFW stable, but It is not rare in eshops).

And option no. 6 Money pattern is not useable for another decimal numbers (for prevent to floating point problem). Or is there solution for work with general decimal numbers (not just money) in Money pattern? I did not study all about Money pattern...

PetrHeinz commented 5 years ago

@pk16011990: It's true that the Money pattern does not solve the issue of floating point precision in general, thanks for pointing that out. The prices are the most important use-case, but we'll need to use either floats, casting to strings or the Decimal class for the other uses of decimal numbers.

I would prefer using either casting to strings (I see it as a better fit than floats) or Decimal using a library (as it's more reliable than the extension, as @LukasHeinz mentioned).

@henzigo: I've noticed that you've down-voted every other option than the Money pattern - what would you want to do about other decimal numbers?

vitek-rostislav commented 5 years ago

Hi @PetrHeinz, thanks for the excellent preparation! :ok_hand: :heart:

My train of thought is very similar to yours. I liked the Money type until @pk16011990 mentioned it does not solve the other common use-cases/problems with floats.

The best solution for me is using Decimal, I would prefer the extension way as I really do not like that massive performance decrease caused by the library :slightly_smiling_face:

grossmannmartin commented 5 years ago

Although I see the problem with decimal numbers in general, please keep the scope of this issue.

Money pattern Personally, I believe the Money pattern is the best choice, but I agree it would be hard to implement. Hard to upgrade probably too, but if we consider this type of change, we have to do it now. In any later stage of a product, it will be nearly impossible.

PHP extension A looked at php extension and seems a bit odd to me. Extension raise warning on data loss (parsing with lower precision). I don't like warnings - its harder to work with them in the application (see http://php-decimal.io/#basic-usage).

And comparing seems a bit odd. Two Decimal objects are equal if their numeric values are equal, as well as their precision.

But I don't think the age of the package or the number of downloads is a limiting factor. It looks just like a simple wrapper around libmpdec library – And this is the basis for Python’s decimal module since Python v3.3).

henzigo commented 5 years ago

@PetrHeinz, @pk16011990 Probably decimal type will be added to money pattern in next version. See https://github.com/moneyphp/money/pull/516

Personally money pattern is the best choice for me and we would be able to use decimal extension for prevent other problems with floats.

TomasLudvik commented 5 years ago

In my opinion, there are only two viable options:

5) PHP extension Decimal (preferred)

6) Money type

simara-svatopluk commented 5 years ago

This issue is about prices. I understand that you want to fix more issues, but when we loose focus, then we'll fix nothing. So I will arrogantly ignore anything that is not related to price issue.

  1. Floats - No - floats doesn't fulfill money use cases

  2. Strings - NO - this is even crazier that floats

  3. Custom value object - No - reinventing wheel

  4. Decimal Type - Yes - we are using prices as decimals in SSFW, we don't care about currency when working with prices. So this approach matches our use-cases best

  5. Extension - No - hard to change, we don't use this approach in SSFW, we use libraries

  6. Money Type - Possible - It is the best choice for working with prices. But in SSFW we have different use-cases - we don't care about currency, but his library forces us to work with currencies. So the match is not perfect

  7. I prefer to create our own Money object that will use internally solution 4 or 6. It would only delegate the responsibility. Our custom Money object will fit our use-cases best and will provide us flexibility to change the used library if our needs changes in the future.

PetrHeinz commented 5 years ago

To summarize the discussion, due to overwhelming negative feedback on using primitives (1, 2) or implementing a custom solution (3) the only relevant options are:

4) We will use litipk/php-bignumbers library for Decimal type. There are concerns about the performance impact of implementing the library. Otherwise, such a class matches our use of decimal numbers in pricing.

5) We will use php-decimal extension for Decimal type. There are concerns about the reliability and maturity of the extension. Also, some issues with the ease of use and flexibility were brought up (https://github.com/shopsys/shopsys/issues/745#issuecomment-454831111, https://github.com/shopsys/shopsys/issues/745#issuecomment-455455605).

6) We will use the Money pattern (using moneyphp/money library). Even though it does not help with precision arithmetic in general (https://github.com/shopsys/shopsys/issues/745#issuecomment-454742363), it is seen as the best choice to be used as a type for prices. It might be hard to implement with the current state of pricing in Shopsys Framework.

We might introduce a wrapper object to delegate the responsibility. This would provide us with the possibility to change the implementation entirely without impacting the users (for a price of performance decrease and maintaining such a class, of course).

simara-svatopluk commented 5 years ago

@PetrHeinz I implemented wrapper object recently in my DDD demonstration project - https://github.com/simara-svatopluk/cart/blob/master/src/Domain/Price.php , https://github.com/simara-svatopluk/cart/blob/master/tests/Domain/PriceTest.php

And it's great, because libraries don't force you to have their interface. You have the interface that suits you the best. Eg. Money library forces you to use cents everywhere (you cannot have 15.5 crowns, you have to have 1500 hellers everywhere).

So if I can summarize my experiences, the easiest and most flexible will be custom object + 4-decimal (not money, because it forces you to currencies)

PetrHeinz commented 5 years ago

After some prototyping in the php-decimal sandbox it seems that the extension might really not be mature enough - there is some unintuitive behavior (eg. (bool)(new Decimal(0)) === true) and the documentation does not match reality (eg. new Decimal(1, 2) == new Decimal(1, 3), the trim method does not exist).

simara-svatopluk commented 5 years ago

@PetrHeinz I care a lot about this decision... In my DDD cart example, the Money patter didn't suite well, so I refactored it to use Decimal. The result is nice https://github.com/simara-svatopluk/cart/blob/master/src/Domain/Price.php

PetrHeinz commented 5 years ago

So after a long internal debate, particularly with @simara-svatopluk we have come to a conclusion based on all opinions, suggestions and ideas in this issue. Thanks a lot for every contribution :+1:

Even though the Money type seems like a superior concept and adopting it would help avoiding some issues currently found in pricing, it would be much harder to implement. We would like to get to implementing Money type eventually, but in iterative steps.

First, we'd like to address only the issue of types (and type-hints) of monetary values.

The Decimal type is the closes match for the use cases found in Shopsys Framework. Prices are always just decimal numbers, not being always tied to a currency. Implementing the Decimal type seems like a natural step. We will rather use the library than the extension because of reasons explained above (https://github.com/shopsys/shopsys/issues/745#issuecomment-454831111, https://github.com/shopsys/shopsys/issues/745#issuecomment-455455605, https://github.com/shopsys/shopsys/issues/745#issuecomment-457514485).

To simplify future development (especially in the direction of implementing Money type), we will implement a custom wrapper class Money (alternatively MonetaryDecimal, DecimalMoney, MoneyAmount or something along those lines). This wrapper will be then used to represent all monetary values in Shopsys Framework just as they are represented now by int/string/float. A custom Doctrine type will exist, basically copying \Doctrine\DBAL\Types\DecimalType, but converting to/from the instances of our wrapper class. A form data transformer will exist to help with form fields (basically just the \Symfony\Component\Form\Extension\Core\Type\MoneyType).

The wrapper class will also be used to reduce available methods (there is no need for square roots in e-commerce) or rename them (eg. multiply instead of mul). All arithmetic operations and comparisons will be done using a method call. There will be no __toString method so we avoid unintentional casting.

Thanks to having the functionality wrapped and isolated, we can confront the performance issues if they appear to be a problem. The Decimal library can be also used to tackle other float-related problems as @pk16011990 mentioned in https://github.com/shopsys/shopsys/issues/745#issuecomment-454742363.

The current logic and structure will remain the same as much as possible. Eg. the Rounding class will still round prices (represented by the new wrapper class) without VAT and VAT amounts to 2 decimal numbers and the price with VAT to either integers, halves or hundredths.

The only code that will be removed is that which was implemented to bypass the floating point errors, eg.: https://github.com/shopsys/shopsys/blob/5174c8fa54dc03f89ac4380110c1c5e35b5223c2/packages/framework/src/Twig/PriceExtension.php#L243-L245

Keep in mind than this is just a first step in improving the pricing. And, as always, we'll appreciate your feedback. :slightly_smiling_face:

TomasLudvik commented 5 years ago

@PetrHeinz I am bit confused and concerned about this decision. MoneyType got most votes, but we cannot implement it in time. That is ok, but second one was the Decimal extension. Its cons are nothing to worry about and pros are faster implementation, better performance and easier usage than the library.

I did not prototype it so I might not see some problems there, but I see 100x decreased performance and I think about big e-commerce solutions with thousands of products and many background jobs connected to information systems that do a lot of computing, where this is something to be bother with.

Can you do some final comparison between Decimal type using / not using library and show us real pros and cons from usage / performance / extendability view?

PetrHeinz commented 5 years ago

@TomasLudvik The maturity, documentation and adoption of the mentioned extension felt like a big disadvantage. The advantages of the extension were a bit diminished by the planned wrapper class we want to implement for easier path to Money type adoption - performance would be a bit decreased by the object usage and ease-of-use would be removed altogether, as the wrapped would not support the arithmetic operators etc.

While I don't have any final comparison between the Decimal extension and library, I did a quick performance test of the Decimal library against the usage of primitives with these results (from which the claim about the hundredfold performance decrease originates):

primitives Litipk\BigNumbers\Decimal custom wrapper around Litipk\BigNumbers\Decimal
plain floats and integers using Decimal with unspecified scale* using Decimal with scale of 32 internally
calculations 0.00133 s 0.17511 s 0.15651 s
object instantiation 0 s 0.02697 s 0.03540 s
total 0.00133 s 0.20208 s 0.19191 s

The tested calculations were of this sort:

and there were 10 thousand of them to take the times cited above on my local machine.

You can re-run the test yourself or try other approaches to assess the performance in DecimalTestCommand in my prototyping branch ph-decimal-prototype.

* - The unspecified scale caused rounding errors because of Decimal::fromInteger() results in scale being 0 and the auto-scaling of division is only "reasonably" precise. This can be avoided in the real implementation, especially with the wrapper class.

Even though the performance is 100× lower, for e-commerce related use cases it should be fast enough, in my opinion. The other factors out-weighted the decision in favor of the library.