Closed maenu closed 5 years ago
Hi @maenu: Thank you for your feedback. I transferred Coder issues to the GToolkit repository as this is the place we mange GToolkit issues.
Hi @maenu, just responding to your last point:
There should be plenty of tools to track down these memory leaks. The best starting point used to be SystemReporter:
SystemReporter open
and then selecting the SpaceAnalysis pane which would have provided a breakdown of space allocated in the system by class. But someone much cleverer and smarter than I am decided it wasn't useful and deleted the facility. Had the facility existed one would have then chosen any class that seemed anomalous, such as
MCVersionName 1303 36194 1M 0.7
, inspected theClass allInstance, and in the Inspector chosen e.g. "chase pointers" which would have answered a path of references back to the roots, all inspectable:
globals: Environment declarations: IdentityDictionary
registry: Dictionary a MCPackage(VersionNumber) -> MCWorkingCopy ancestry: MCWorkingAncestry ancestors: Array 1: MCVersionInfo ancestors: Array 1: MCVersionInfo ancestors: Array 1: MCVersionInfo name: MCVersionName
but for reasons I can;'t understand the GToolkit inspector only provides "pointers to", which does one step in the process of finding the path back to the roots.
So good luck in finding the source of the memory leak. I can't find the tools that were there for doing precisely this. It is discoveries like this (someone asks a reasonable question to which there is a reasonable answer, I explore Pharo to check if the solution is there and what its particular syntax might be, but instead I discover the facility is either missing or obscured) that continually undermine my confidence in Pharo as a platform to support my work.
Well, there is SpaceTally printSpaceAnalysis
which prints an STspace.text
file that contains actually useful memory statistics per class. But indeed, something to inspect would be cooler. It should be faster to compute as well, I bet the VM keeps some statistics to derive those metrics.
Seems to be that there are too many prefix trees, the number of contexts is also surprising. The collections could be explained by those strings. But I do not yet know who hold references to these prefix trees.
Class | code_space | #_instances | inst_space | percent | inst_average_size |
---|---|---|---|---|---|
Array | 4249 | 2259171 | 155887240 | 19.50 | 69.00 |
GtPrefixTree | 4063 | 2460825 | 98433000 | 12.30 | 40.00 |
Context | 29251 | 1198023 | 76434656 | 9.50 | 63.80 |
ByteString | 2942 | 2733398 | 71439824 | 8.90 | 26.14 |
BlockClosure | 14727 | 1280245 | 41527136 | 5.20 | 32.44 |
SortedCollection | 1934 | 954952 | 38198080 | 4.80 | 40.00 |
CompiledMethod | 25805 | 179928 | 16645528 | 2.10 | 92.51 |
OrderedCollection | 6923 | 507280 | 16232960 | 2.00 | 32.00 |
Point | 9879 | 536220 | 12869280 | 1.60 | 24.00 |
ByteArray | 14586 | 9015 | 11867864 | 1.50 | 1316.46 |
MethodDictionary | 3460 | 30100 | 10181320 | 1.30 | 338.25 |
BlObjectProperty | 275 | 156805 | 8781080 | 1.10 | 56.00 |
Association | 1479 | 354617 | 8510808 | 1.10 | 24.00 |
Semaphore | 1711 | 243656 | 7796992 | 1.00 | 32.00 |
BlHandlerAnnouncementSubscription | 469 | 375895 | 6014320 | 0.80 | 16.00 |
BlLayoutCommonConstraints | 2796 | 57707 | 6001528 | 0.70 | 104.00 |
Bitmap | 4742 | 1770 | 5857448 | 0.70 | 3309.29 |
IdentitySet | 408 | 240022 | 5760528 | 0.70 | 24.00 |
ValueLink | 452 | 202952 | 4870848 | 0.60 | 24.00 |
BlElementChangeRecord | 482 | 200479 | 4811496 | 0.60 | 24.00 |
WeakArray | 1828 | 28459 | 4490560 | 0.60 | 157.79 |
BrTextElementWithCursors | 1712 | 15008 | 4442368 | 0.60 | 296.00 |
BlElement | 56052 | 18915 | 4388280 | 0.50 | 232.00 |
BlEventHandler | 867 | 157956 | 3790944 | 0.50 | 24.00 |
Total | 22427541 | 19485258 | 80138176096.80000000000001 |
Just a quick update on the memory leak. I ran some different kind of images to see who they evolve on memory consumption.
Maybe I was wrong to search within the image, because the image memory itself seems not to grow that high. There seems to be loads of memory allocated by the VM that are not used by the image. What's in that memory, how can I explore it? Could this be related to UFFI memory not freed? How to I debug this?
Edit I can also save the 8GB image and end up with a ~515MB file. Spinning it up again occupies ~554MB in the beginning. But as soon as I start using GtInspector/Playground, memory hogging starts again and is unstoppable.
Thanks for the input! It looks that magic is happening ...
@maenu Would it be possible to run in your image to run
GtSourceCoder withAllSubclasses collect: [:each | each -> each allInstances size]
?
I think you have a bunch of coder objects that aren't being released properly. Can you close all of your coder windows, do a Smalltalk garbageCollect
, and then do GtMethodCoder allInstances
. If you have any instances, you should be able to evaluate ReferenceFinder findPathToInstanceOf: GtMethodCoder
. This should return the shortest path from the Smalltalk
global to a GtMethodCoder
that doesn't go through a weak pointer.
@maenu to reply to your points:
First, thanks a lot for the time you spend on GT, and for the detailed input. This is really helpful.
1) We want GtMethodCoder
to be easily extensible. We focused on the parts in the editor, but you have more advanced needs that we did not tackle so far. Would it be possible to list concretely what you need to extend?
2) The highlighter and completion use the SmaCC AST. Is the problem you encounter due to the fact that you annotate types on the compiled AST and you'd like to use them in completion?
3) Do you have a concrete case that will help us reproduce the problem you are seeing with catching errors?
4) The same for this point. A way to look at a concrete problem would be really helpful. Any chance you can help us see when a code leads to a TaskIt swamp :)?
5) This one is already being discussed.
@maenu to reply to your points:
First, thanks a lot for the time you spend on GT, and for the detailed input. This is really helpful.
- We want
GtMethodCoder
to be easily extensible. We focused on the parts in the editor, but you have more advanced needs that we did not tackle so far. Would it be possible to list concretely what you need to extend?
See TypGtMethodCoder.
- The highlighter and completion use the SmaCC AST. Is the problem you encounter due to the fact that you annotate types on the compiled AST and you'd like to use them in completion?
No, the coder styler uses the RB AST, see GtCoderTextStyler >> #privateStyle:
.
- Do you have a concrete case that will help us reproduce the problem you are seeing with catching errors?
See #return senders select: [ :e | e methodClass name beginsWith: 'Gt' ]
. Again, see GtCoderTextStyler >> #privateStyle:
. It returns an exception without telling anybody else. This means that if one styler fails, the subsequent styler are not applied, the developer does not have any clue where and why his styler was not applied to the coder. Similar problems arise with many exception returns.
- The same for this point. A way to look at a concrete problem would be really helpful. Any chance you can help us see when a code leads to a TaskIt swamp :)?
Halt once.
in GtCoderTextStyler >> #privateStyle:
String >> #asString
to see a coder.TKTDebugSession >> #filteredMasterStack
, receiver of selector is contextHolder originalContext
- This one is already being discussed.
I used a fresh GT image from the 03.01.2019 for the following experiment. It seems like PetitParser memento keeps a strong reference to the document containing the play page and the coder. This could mean that everything that ever was in a Gt2Document will never be garbage collected, ew.
GtSourceCoder withAllSubclasses collect: [:each | each -> each allInstances size].
"GtBlockCoder->0
GtMethodCoder->6
GtPharoSnippetCoder->3
GtMethodContextCoder->0
GtSourceCoder->0"
Smalltalk garbageCollect.
GtSourceCoder withAllSubclasses collect: [:e | e -> (e allInstances collect: [ :f | ReferenceFinder findPathTo: f ]) ]
"see fueled traces or archive on dropbox"
Full macOS VM and image: https://www.dropbox.com/s/nkjz1snh76rf5ld/pharo-2019-01-03-22-26-47.zip?dl=0 Fueled traces from coders to root: https://www.dropbox.com/s/cbfdy59s84xi3be/traces.fuel?dl=0
The memory leak problem comes from issue #69.
The main part of the memory leak was addressed with a temporary fix consisting in moving the Gt2Document class>>PillarParser
class side variable to the instance side. It comes with a 100ms penalty for starting a document, but it limits the memory leaks to only a few cases (like inspecting the OrderedCollection
class).
Related to 4., it seems that the TaskIt debugger has a problem and that's why we cannot debug an error from inside a task. We opened an issue here: https://github.com/sbragagnolo/taskit/issues/47
@maenu #101 addresses point 4). You should now be able to debug errors in tasks.
@maenu #101 addresses point 4). You should now be able to debug errors in tasks.
Just checked this, debugging in TaskIt tasks works fine now. Thanks!
Regarding 1) extensibility: I moved most of my extensions to use extension methods on the coder using the addon pragma. This required me to normalize and move around some methods and variables in coder. I pushed the coder changes to https://github.com/maenu/gtoolkit-coder/tree/extensions, the actual extension to https://github.com/maenu/gtoolkit-contribution-typer. Shortcuts, actions, and stylers work quite well with this approach. One remaining problem is due to 2) RB vs. GT AST. I need a stateful AST, as this is currently the best option I found to store type information. This requires me to ensure to only parse to RB when the source changes, otherwise reuse the same RB AST. Then I intercept the parsing with a MetaLink to type the AST. Not super nice, but works.
If you intend to look at my changes to coder and would like to have some more insight on why I changed what how, let me know, I can comment the commit on GitHub.
The extensibility-related changes were now integrated. #104 holds the memory leak conversation.
I will close this issue. If there are remaining concerns, please open individual issues.
I just want to leave some feedback on the experience I gained by extending GtCoder with my Typer https://github.com/maenu/gtoolkit-contribution-typer. There are a few things that I struggled with, I thought this might be useful for you to drive changes regarding the extension mechanism for GtCoder.
It would be nice if you could comment on the points above to have an overview of which issues are known to you and which ones you are actively working on or do not plan to work on. This would give me the chance to focus on the things you are not focussing on. I was also thinking about spinning up an extension prototype by modifying the GToolkit sources, but I guess you are already ahead of me on that part.
Those are the "bad" parts of my experience. Otherwise, I have to say that extending coder is way easier than hacking into the code editor used by Nautilus/Calypso. The main driver here seems to be the moldable editor, provided by Bloc-Texts. Also, the domain model in GtCoder seems way less intertwined with its UI, while in the old editor, responsibilities are sometimes on the Morphs, while they are actually non-UI related. For me, I want to move away from the traditional browser and editor completely, only using method and class coders, as well as navigation in the inspector. The issues mentioned above currently prohibit me from doing so, but I see that you are also getting there more and more. So, thanks for all of that!