pharo-project / pharo

Pharo is a dynamic reflective pure object-oriented language supporting live programming inspired by Smalltalk.
http://pharo.org
Other
1.2k stars 355 forks source link

Computations uninterruptable #10570

Open astares opened 2 years ago

astares commented 2 years ago

Print or inspect in Pharo 9:

(2 raisedTo: 2 * 3 * 13 * 23 * 43 * 131)

and it never returns and it is also not interruptable.

See https://twitter.com/sumim/status/1467009927147552771 for discussion

guillep commented 2 years ago

We have made some observations:

[(2 raisedTo: 2 * 3 * 13 * 23 * 43 * 131)] timeToRun

is under 3 miliseconds.

We have seen the problem is seemingly in the printing (pop up of the playground print-it, and inspector tables). You see the problem when debugging it, stepping at some point starts to be slow, but executing is fast.

guillep commented 2 years ago

Interruption is not possible, because interruption just throws an exception, and when an exception happens it logs the stack (with all its values) into the log file.

Aaaand, well, that calls the same slow printing :) See below the stack of the error handling going back to printing...

0x7ffeefbec3c0 M LargePositiveInteger(Integer)>* 0x115486178: a(n) LargePositiveInteger
    0x7ffeefbec3f8 M LargePositiveInteger(LargeInteger)>* 0x115486178: a(n) LargePositiveInteger
    0x7ffeefbec440 M SmallInteger(Number)>raisedToInteger: 0x51=10
    0x7ffeefbec4a0 M LargePositiveInteger>printOn:base: 0x11d065008: a(n) LargePositiveInteger
    0x7ffeefbec4e0 M LargePositiveInteger(Number)>printOn: 0x11d065008: a(n) LargePositiveInteger
    0x7ffeefbec518 M [] in LargePositiveInteger>printStringLimitedTo: 0x11d065008: a(n) LargePositiveInteger
    0x7ffeefbec558 M String class(SequenceableCollection class)>streamContents:limitedTo: 0x115ef27d0: a(n) String class
    0x7ffeefbec5a0 M LargePositiveInteger(Object)>printStringLimitedTo:using: 0x11d065008: a(n) LargePositiveInteger
    0x7ffeefbec5e0 M LargePositiveInteger(Object)>printStringLimitedTo: 0x11d065008: a(n) LargePositiveInteger
    0x7ffeefbec618 M [] in Context>printDetails: 0x114ef6bb8: a(n) Context
    0x7ffeefbec648 M FullBlockClosure(BlockClosure)>on:do: 0x1153efa08: a(n) FullBlockClosure
    0x7ffeefbec6a0 M Context>printDetails: 0x114ef6bb8: a(n) Context
    0x7ffeefbec6f8 I Context>errorReportOn: 0x114ee8628: a(n) Context
    0x7ffeefbd2268 M [] in SmalltalkImage>logError:inContext: 0x115f000d0: a(n) SmalltalkImage
    0x7ffeefbd22b0 M [] in SmalltalkImage>logDuring: 0x115f000d0: a(n) SmalltalkImage
    0x7ffeefbd22f0 M FullBlockClosure(BlockClosure)>ensure: 0x1152a04d8: a(n) FullBlockClosure
    0x7ffeefbd2340 I SmalltalkImage>logDuring: 0x115f000d0: a(n) SmalltalkImage
    0x7ffeefbd2388 I SmalltalkImage>logError:inContext: 0x115f000d0: a(n) SmalltalkImage
    0x7ffeefbd23d8 I MorphicUIManager(UIManager)>requestDebuggerOpeningForProcess:named:inContext: 0x114be4e80: a(n) MorphicUIManager
    0x7ffeefbd2438 I Process>debugWithTitle: 0x11a8f90b0: a(n) Process
    0x7ffeefbd2478 M [] in UserInterruptHandler>handleUserInterrupt 0x114ef6418: a(n) UserInterruptHandler
    0x7ffeefbd24a8 M FullBlockClosure(BlockClosure)>on:do: 0x114ef64e0: a(n) FullBlockClosure
    0x7ffeefbd24e8 M FullBlockClosure(BlockClosure)>on:fork: 0x114ef64e0: a(n) FullBlockClosure
    0x7ffeefbd2538 I UserInterruptHandler>handleUserInterrupt 0x114ef6418: a(n) UserInterruptHandler
    0x7ffeefbd2578 I OSSDL2Driver>evaluateUserInterrupt: 0x114be5620: a(n) OSSDL2Driver
    0x7ffeefbd25c0 M [] in OSSDL2Driver>processEvent: 0x114be5620: a(n) OSSDL2Driver
    0x7ffeefbd25f0 M FullBlockClosure(BlockClosure)>on:do: 0x114ef61e8: a(n) FullBlockClosure
    0x7ffeefbd2638 M OSSDL2Driver>processEvent: 0x114be5620: a(n) OSSDL2Driver
    0x7ffeefbd2680 M OSSDL2Driver>eventLoopProcessWithoutPlugin 0x114be5620: a(n) OSSDL2Driver
    0x7ffeefbd26c0 I [] in OSSDL2Driver>setupEventLoop 0x114be5620: a(n) OSSDL2Driver
    0x7ffeefbd2700 I [] in FullBlockClosure>newProcess 0x114c1e1b0: a(n) FullBlockClosure
guillep commented 2 years ago

Another way to look at it: reduce a bit the exponent (to make it easily reproducible yet still slow) and profile it.

AndreasSystemProfiler spyOn: [ (2 raisedTo: 1000000) printString ]

Lot of time is just spent in multiplications inside the printOn:base: method

98.5 (1,122)  LargePositiveInteger  printOn:base:
[[                          46.900000000000006 (534)  LargePositiveInteger [LargeInteger]  *
[[                            |46.900000000000006 (534)  Integer  digitDiv:neg:
[[                          31.6 (360)  LargePositiveInteger  printOn:base:
[[                            |13.9 (158)  LargePositiveInteger [LargeInteger]  *
[[                            |  |13.9 (158)  Integer  digitDiv:neg:
[[                            |12.100000000000001 (138)  LargePositiveInteger  printOn:base:
[[                            |  |5.5 (63)  LargePositiveInteger  printOn:base:
[[                            |  |  |2.7 (31)  LargePositiveInteger [LargeInteger]  *
[[                            |  |  |  |2.7 (31)  Integer  digitDiv:neg:
[[                            |  |  |1.7000000000000002 (19)  LargePositiveInteger  printOn:base:
[[                            |  |4.800000000000001 (55)  LargePositiveInteger [LargeInteger]  *
[[                            |  |  |4.800000000000001 (55)  Integer  digitDiv:neg:
[[                            |  |1.6 (18)  LargePositiveInteger [LargeInteger]  -
[[                            |  |  1.6 (18)  Integer  digitMultiply:neg:
[[                            |4.6000000000000005 (52)  LargePositiveInteger [LargeInteger]  -
[[                            |  4.6000000000000005 (52)  Integer  digitMultiply:neg:
[[                          16.1 (183)  LargePositiveInteger [LargeInteger]  -
[[                            |16.1 (183)  Integer  digitMultiply:neg:
[[                          3.9000000000000004 (44)  SmallInteger [Number]  raisedToInteger:
[[                            3.9000000000000004 (44)  Integer  digitMultiply:neg:
guillep commented 2 years ago

One way to avoid computing this is to not print very large integers. However, computing the number of digits suffers from the same problem: it's slow for very large integers.

Alternatively, we can heuristically guess if a number is large by taking a look at the number of bytes it occupies in memory. We have seen there is a linear relationship between the number of digits and the number of bytes

(2 raisedTo: 100000000) sizeInMemory "12500024"
(2 raisedTo: 10000000) sizeInMemory "1250024"

(2 raisedTo: 1000000) sizeInMemory "125024"
(2 raisedTo: 1000000) numberOfDigitsInBase: 10 "301030"

(2 raisedTo: 100000) sizeInMemory "12520"
(2 raisedTo: 100000) numberOfDigitsInBase: 10 "30103"

(2 raisedTo: 10000) sizeInMemory "1264"
(2 raisedTo: 10000) numberOfDigitsInBase: 10 "3011"

(2 raisedTo: 1000) sizeInMemory "136"
(2 raisedTo: 1000) numberOfDigitsInBase: 10 "302"
guillep commented 2 years ago

Also, displayString has the same problem

tesonep commented 2 years ago

Maybe we should display it as in https://www.wolframalpha.com/input/?i=2%5E%2810000000%29

image

2^(10000000) - Wolfram|Alpha
Wolfram|Alpha brings expert-level knowledge and capabilities to the broadest possible range of people—spanning all professions and education levels.
guillep commented 2 years ago

We have tried adding a couple of guards in many of the integer printing methods like:

self sizeInMemory > 100
    ifTrue: [ s nextPutAll: 'Very large integer...' ]

And it seems to be working well. But it requires to review all the printing places dispersed in integers, which are a lot (including the ones called from the inspector, debugger, etc...)

astares commented 2 years ago

Maybe we implement and use a specific #logString for the logs?

logString
   "Returns a string representation of the receiver that is used for logs. Subclasses might want to override to provide a simplified representation."
   ^self printString

But this might be similar effort.

carolahp commented 2 years ago

We can easily and efficiently the number of digits of a given number for a given LargeInteger.

binaryDigits := (x size * 8) - (8 - ((x at: x size) numberOfDigitsInBase: 2)). base10Digits := (binaryDigits / ( 10 log: 2)) ceiling.

So we can print the N first digits. We can extract them with:

x // (10**(base10Digits - 20))

However, this division can be slow for large numbers. We need to continue thinking about this

guillep commented 2 years ago

8 should be word size? :)

tesonep commented 2 years ago

No, it is the byte size. It is the number of binary digits in each element of the large integer.

On Wed, Jan 5, 2022, 17:11 Guille Polito @.***> wrote:

8 should be word size? :)

— Reply to this email directly, view it on GitHub https://github.com/pharo-project/pharo/issues/10570#issuecomment-1005863122, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACWY77JNCFPYPIACYQKCODUURUSZANCNFSM5JLZPLLQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you commented.Message ID: @.***>

tesonep commented 2 years ago

To calculate the value we can transform it using logarithm.

aNumber := (2 raisedTo: 2 * 3 * 13 * 23 * 43 * 131).
precisionInBits := 64.

[
digitsInBase2 := aNumber size * 8 - (8 - ((aNumber at: aNumber size) numberOfDigitsInBase: 2) ).
digitsInBase10 := (digitsInBase2 * (2 log: 10)) ceiling.

temp := 0.
inc := 1/2.
((aNumber >> (digitsInBase2 - precisionInBits)) printStringBase: 2) "'1001001001'" do: [ :e |
    e = $1 ifTrue: [ temp := temp + inc ].
    inc := inc / 2   ].

"temp * (1<<digitsInBase2) / (10 ** ( digitsInBase10 - 1))."
decimalPart := 10 ** (temp log + (digitsInBase2 * 2 log) - ((digitsInBase10 - 1) * 10 log)).

 ] timeToRun