Open rakudrama opened 3 years ago
cc @jensjoha
I think I'm misunderstanding something (or confused or something)...
From what I see in the above (and from looking at the source), dart2js has a class called `CompilerSourceFileProvider` (`pkg/compiler/lib/src/source_file_provider.dart`), which is a `SourceFileProvider` (same file), which has an explicit map `Map<Uri, api.Input> binarySourceFiles`, where `api.Input` is from `pkg/compiler/lib/compiler_new.dart` with a field `T get data;` which (from the above) is the `_Uint8List` with lots of data in it.
All of it is actively retained by dart2js itself and has nothing to do with kernel or the front_end..?
Anyway, to be more concrete on the questions:
Yes, dart2js needs to release this structure, but doing so will not help because there are other references to the buffer. This is why I am adding the front-end to the issue.
This is the back-end of dart2js, where we do not need very much of the Kernel representation (but it is unclear exactly what, because there is lazy access in dart2js too that falls back on lazy access to the `.dill`).
We load too much of it, but not the bodies of resolution-time tree-shaken methods, which is a large proportion of the methods.
I see a lot of closures also retaining the 770MB buffer.
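A minimal sketch of how a lazy-loading closure can pin a large buffer, and how copying the region first avoids that. The function names here are hypothetical and are not taken from dart2js or the front-end; only `Uint8List.sublistView` and `Uint8List.sublist` are real `dart:typed_data` APIs.

```dart
import 'dart:typed_data';

/// Hypothetical lazy loader. The closure captures [whole], and the view it
/// returns shares the same backing buffer, so the entire buffer stays
/// reachable for as long as the closure or its result is reachable.
Uint8List Function() lazyRegionRetaining(Uint8List whole, int start, int end) {
  return () => Uint8List.sublistView(whole, start, end);
}

/// Alternative: copy the region up front. The closure captures only the
/// small copy, so it does not keep [whole] alive.
Uint8List Function() lazyRegionCopied(Uint8List whole, int start, int end) {
  final Uint8List region = whole.sublist(start, end); // copies the bytes
  return () => region;
}
```

Comparing `buffer.lengthInBytes` on the two results shows the difference: the view reports the size of the whole buffer, the copy only the size of the region. The copy costs time and a temporary increase in memory, so whether it pays off depends on how many regions are ever read lazily.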
What I meant by a 'stream or list of blocks':
If the input was segmented in some way (a list of blocks), could a block be removed from the list (i.e. list entry nulled-out) when converted to data?
For example, the source file byte buffers are in memory twice - as a sequence of the `.dill` bytes and their own bytes.
If the file was segmented into blocks, these could be moved to a list of blocks for the source, somewhat like transferring the ownership of blocks completely within the region.
Blocks could be fixed-size, or the `.dill` could contain a directory of fortuitous breaks that would allow subsequences of large strings and source data to be moved without fragmentation.
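To make the 'list of blocks' idea concrete, here is a rough sketch using fixed-size blocks. `BlockList`, `take` and `liveBlocks` are hypothetical names for illustration and are not an existing Kernel or dart2js API. A real reader would presumably fill the blocks directly from the file rather than split an in-memory buffer as this sketch does.

```dart
import 'dart:typed_data';

/// Hypothetical segmented input: the file is held as fixed-size blocks
/// instead of one large buffer.
class BlockList {
  final int blockSize;
  final List<Uint8List?> _blocks;

  BlockList._(this.blockSize, this._blocks);

  /// Splits [bytes] into independent copies of at most [blockSize] bytes.
  factory BlockList.fromBytes(Uint8List bytes, int blockSize) {
    final blocks = <Uint8List?>[];
    for (var start = 0; start < bytes.length; start += blockSize) {
      final end =
          start + blockSize < bytes.length ? start + blockSize : bytes.length;
      blocks.add(bytes.sublist(start, end)); // each block owns its bytes
    }
    return BlockList._(blockSize, blocks);
  }

  int get blockCount => _blocks.length;

  /// Number of blocks that have not yet been handed off.
  int get liveBlocks => _blocks.where((b) => b != null).length;

  /// Transfers ownership of block [index] to the caller and nulls out the
  /// entry, so this list no longer keeps the block alive.
  Uint8List take(int index) {
    final block = _blocks[index];
    if (block == null) throw StateError('block $index already taken');
    _blocks[index] = null;
    return block;
  }
}
```

Once a block has been taken and the consumer drops it, nothing else refers to its bytes, so the collector can reclaim that block while the rest of the input is still being read. Data that straddles a block boundary would still need to be copied or handled specially, which is where a directory of breaks would help.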
I'm open to other approaches that reduce the heap size and/or the address space size (both are constraints on our build).
I instrumented readStringTable for my example 770MB `.dill` file, and it shows that the string table is nearly 200MB.
It would be great if the string table could be freed after conversion to strings.
StringTable 572078172..764765399 = 192687227
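The range above works out to 764765399 - 572078172 = 192687227 bytes, roughly a quarter of the 770MB file. As an illustration of why the bytes are not needed after conversion, here is a sketch that eagerly decodes a simplified string table. The layout (a plain list of end offsets plus a separate UTF-8 byte array) and the name `decodeStringTable` are assumptions for this example, not the actual front-end reader.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Decodes every string of a simplified string table eagerly.
/// [endOffsets] holds the end position of each string within [utf8Bytes].
List<String> decodeStringTable(List<int> endOffsets, Uint8List utf8Bytes) {
  final strings = List<String>.filled(endOffsets.length, '');
  var start = 0;
  for (var i = 0; i < endOffsets.length; i++) {
    final end = endOffsets[i];
    // The view is only a temporary argument; utf8.decode builds a new
    // String that does not refer back to the byte buffer.
    strings[i] = utf8.decode(Uint8List.sublistView(utf8Bytes, start, end));
    start = end;
  }
  return strings;
}
```

After this returns, the result holds only `String` objects, so if the string table bytes lived in their own buffer or block, that memory could be released as soon as the caller drops its reference.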
This is the final state of modular dart2js with two code shards. The files are `m.dill`, `m.dill.data`, `m.code0` and `m.code1`. The buffers are 1.7GB out of a total heap of ~10GB. This is just before printing sizes.
/cc @johnniwinther I wonder if the Kernel deserialization could be made to work on a stream or list of blocks, so consumed bytes can be discarded, and lazy-parsed regions block-copied to smaller buffers.