haskell / bytestring

An efficient compact, immutable byte string type (both strict and lazy) suitable for binary or 8-bit character data.
http://hackage.haskell.org/package/bytestring
Other
290 stars 140 forks source link

Overhaul Data.ByteString.Builder.RealFloat #636

Open BebeSparkelSparkel opened 9 months ago

BebeSparkelSparkel commented 9 months ago

I have created the following PRs but they seem to be intermingled and perhaps a larger pull would be better

632

633

634

I propose the following

  1. combine the logic of formatFloat and formatDouble into formatFloating :: FloatFormat -> a -> Builder since they have the same logic and use classes to get their specific functions based on the floating type. Continuing exporting the formatFloat and formatDouble interface functions for compatibility.
  2. Replace the constructor of FloatFormat to the constructors of FormatMode and remove the type FormatMode because the precision (Maybe Int) is not used and can be included as a parameter of the constructors
  3. FScientific should have a Char for specifying a lower or upper E
  4. FStandard should have a Maybe Int for the precision
  5. FGeneric should have a Char for specifying a lower or upper E, and two Ints for the inclusive exponent range for printing the standard notation
  6. Allow custom strings for special values +-Infinity +-0 NaN with data SpecialStrings = {..}
  7. remove specialStr and use toCharsNonNumbersAndZero instead because they duplicate logic and it is faster to evaluate
  8. I am unsure if this is optimized away but the case statement that selects the format should not be executed for every floating number to be printed. If not, the \f -> ... should be defined after the case statement and not before.

None of these changes should cause changes to the existing interface.

clyring commented 9 months ago

(Sorry for taking so long to respond.)

BebeSparkelSparkel commented 8 months ago

Builder is an intriguing idea instead of char. I'll have to think about that since it does allow for more than one Word8 to be inserted.

I think 6 and 7 are somewhat related and it seems worth it to consolidate for the faster determination of special values for the standard case.

BebeSparkelSparkel commented 8 months ago

@Bodigrim @clyring I am having a performance problem with trying to consolidate the special printers. When I apply the changes in commit the performance for printing NaN, Infinity, and 0 decreases by 20-70%. Do you have any insight as to why this is?

Baseline profile created from branch realfloat-bench with cabal bench --benchmark-options '--csv baseline.csv -p /RealFloat/'

Compared to baseline on branch consolidate-special-real-number-printers with cabal bench --benchmark-options '--baseline baseline.csv -p "/FSci/ && /Special/ && $NF ~ /Float/"'

clyring commented 8 months ago

I see you're doing a lot of work in various PRs in this area. Please let me know when each one is ready for review, and which ones to start with/prioritize.

BebeSparkelSparkel commented 8 months ago

@clyring Please focus on

  1. 638 -- Adds tests that will fail without #637

  2. 637 -- the big one

The others can come later.