godotengine / godot-cpp

C++ bindings for the Godot script API
MIT License

Binary bloat discussion #1627

Open jordo opened 1 month ago

jordo commented 1 month ago

Godot version

4.4-dev

godot-cpp version

4.4-dev

System information

wasm in particular

Issue description

This is intended primarily as a tracking issue for ways to reduce the binary size of these extension libraries, and as an open discussion to follow up on https://github.com/godotengine/godot-cpp/pull/1621

Steps to reproduce

Compiling the simplest of extensions, a single file including a single header, results in a 2MB library (Reviewer EDIT: macOS universal library, i.e. 1MB for each architecture). Example:

#include <godot_cpp/classes/node3d.hpp>

using namespace godot;

class Example : public Node3D {
    GDCLASS(Example, Node3D);

protected:
    Vector2 position2;

    // Registers test_position with ClassDB so the engine can call it.
    static void _bind_methods() {
        ClassDB::bind_method(D_METHOD("test_position"), &Example::test_position);
    }

    Vector2 test_position() const {
        return position2;
    }

    Example() {}
    virtual ~Example() {}
};

Minimal reproduction project

N/A

jordo commented 1 month ago

-- reserved --

jordo commented 1 month ago

Bloaty output from single class extension file, template_release, mach-o format:

Symbols, first 100:

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  24.3%   451Ki  47.0%   451Ki    __MergedGlobals
  23.1%   429Ki   0.0%       0    [__TEXT,__text]
  20.9%   388Ki   0.2%  2.18Ki    [__LINKEDIT]
  17.8%   331Ki  35.7%   342Ki    [2685 Others]
   1.7%  32.1Ki   0.0%      76    [__TEXT,__cstring]
   1.1%  20.2Ki   0.4%  3.76Ki    [__TEXT]
   1.0%  18.2Ki   0.0%       0    [Unmapped]
   0.8%  14.9Ki   0.4%  3.44Ki    [__DATA_CONST]
   0.8%  14.8Ki   0.0%       8    [__DATA]
   0.6%  11.9Ki   1.2%  11.9Ki    godot::Theme
   0.4%  7.22Ki   0.8%  7.22Ki    godot::Variant::Variant()
   0.3%  6.18Ki   0.6%  6.18Ki    godot::internal::EngineClassRegistration<>::callback()
   0.3%  5.49Ki   0.6%  5.49Ki    godot::CharStringT<>::operator wchar_t()
   0.3%  5.10Ki   0.5%  5.10Ki    std::__1::__hash_table<>::__emplace_unique_key_args<>()
   0.2%  4.52Ki   0.0%       0    [__DATA_CONST,__const]
   0.2%  4.33Ki   0.5%  4.33Ki    godot::String::operator%()
   0.2%  4.12Ki   0.4%  4.12Ki    ___cxx_global_var_init
   0.2%  3.73Ki   0.4%  3.64Ki    [__TEXT,__unwind_info]
   0.2%  3.38Ki   0.4%  3.38Ki    godot::Array::init_bindings()
   0.2%  3.11Ki   0.3%  3.11Ki    godot::ClassDB::_register_class<>()
   0.2%  3.00Ki   0.3%  3.00Ki    std::__1::__hash_table<>::__do_rehash<>()
   0.2%  2.93Ki   0.3%  2.93Ki    godot::PackedByteArray::init_bindings()
   0.2%  2.92Ki   0.3%  2.92Ki    std::__1::__hash_table<>::find<>()
   0.2%  2.80Ki   0.3%  2.80Ki    godot::PackedStringArray::init_bindings()
   0.1%  2.07Ki   0.3%  2.72Ki    godot::String::_method_bindings
   0.1%  2.71Ki   0.3%  2.71Ki    godot::StringName::operator%()
   0.1%  2.58Ki   0.3%  2.58Ki    godot::CowData<>::resize<>()
   0.1%  1.29Ki   0.3%  2.58Ki    godot::StringName::_method_bindings
   0.1%  2.25Ki   0.2%  2.25Ki    ___cxx_global_var_init.1
   0.1%  2.05Ki   0.2%  2.05Ki    std::__1::vector<>::__append()
   0.1%  1.95Ki   0.2%  1.95Ki    godot::StringName::init_bindings()
   0.1%  1.82Ki   0.2%  1.82Ki    godot::MethodInfo::MethodInfo()
   0.1%  1.70Ki   0.2%  1.70Ki    godot::Array::Array()
   0.1%  1.66Ki   0.2%  1.66Ki    std::__1::__hash_table<>::__node_insert_multi_prepare()
   0.1%  1.59Ki   0.2%  1.59Ki    godot::_err_print_error()
   0.1%  1.55Ki   0.2%  1.55Ki    godot::StringName::StringName()
   0.1%  1.54Ki   0.2%  1.54Ki    godot::ClassDB::bind_virtual_method()
   0.1%  1.45Ki   0.2%  1.45Ki    godot::PackedInt32Array::PackedInt32Array()
   0.1%  1.45Ki   0.2%  1.45Ki    godot::CowData<>::_copy_on_write()
   0.1%  1.42Ki   0.1%  1.42Ki    godot::NodePath::NodePath()
   0.1%  1.40Ki   0.1%  1.40Ki    godot::PackedVector3Array::PackedVector3Array()
   0.1%  1.37Ki   0.1%  1.37Ki    std::__1::vector<>::__push_back_slow_path<>()
   0.1%  1.35Ki   0.1%  1.35Ki    std::__1::unordered_map<>::unordered_map()
   0.1%  1.35Ki   0.1%  1.35Ki    ___cxx_global_var_init.2
   0.1%  1.35Ki   0.1%  1.35Ki    ___cxx_global_var_init.4
   0.1%  1.34Ki   0.1%  1.34Ki    godot::Basis::rotate()
   0.1%  1.30Ki   0.1%  1.30Ki    godot::String::String()
   0.1%  1.28Ki   0.1%  1.28Ki    godot::PackedInt64Array::PackedInt64Array()
   0.1%  1.26Ki   0.1%  1.26Ki    ___cxx_global_var_init.5
   0.1%  1.25Ki   0.1%  1.25Ki    godot::Projection::set_perspective()
   0.1%  1.24Ki   0.1%  1.24Ki    godot::PackedVector4Array::PackedVector4Array()
   0.1%  1.23Ki   0.1%  1.23Ki    godot::Array::make<>()
   0.1%  1.22Ki   0.1%  1.22Ki    godot::Dictionary::init_bindings()
   0.1%  1.20Ki   0.1%  1.20Ki    ___cxx_global_var_init.3
   0.1%  1.20Ki   0.1%  1.20Ki    godot::PackedColorArray::PackedColorArray()
   0.1%  1.18Ki   0.1%  1.18Ki    godot::CharStringT<>::operator<()
   0.1%  1.18Ki   0.1%  1.18Ki    godot::ClassDB::add_virtual_method()
   0.1%  1.18Ki   0.1%  1.18Ki    godot::Basis::get_axis_angle()
   0.0%     600   0.1%  1.17Ki    godot::Array::_method_bindings
   0.0%     592   0.1%  1.16Ki    godot::PackedByteArray::_method_bindings
   0.1%  1.14Ki   0.1%  1.14Ki    godot::Basis::get_euler()
   0.1%  1.10Ki   0.1%  1.10Ki    godot::PackedByteArray::PackedByteArray()
   0.1%  1.09Ki   0.1%  1.09Ki    godot::ClassDB::bind_methodfi()
   0.1%  1.09Ki   0.1%  1.09Ki    std::__1::__tree<>::__assign_multi<>()
   0.1%  1.09Ki   0.1%  1.09Ki    godot::helpers::append_all<>()
   0.1%  1.08Ki   0.1%  1.08Ki    ___cxx_global_var_init.6
   0.1%  1.07Ki   0.1%  1.07Ki    std::__1::__hash_table<>::__node_insert_multi()
   0.1%  1.04Ki   0.1%  1.04Ki    godot::Wrapped::Wrapped()
   0.1%  1.02Ki   0.1%  1.02Ki    __ZNSt3__16vectorIN5godot10StringNameENS_9allocatorIS2_EEE18__assign_with_sizeB8nn180100IPS2_S7_EEvT_T0_l
   0.1%  1.02Ki   0.1%  1.02Ki    godot::PackedFloat64Array::init_bindings()
   0.1%  1.01Ki   0.1%  1.01Ki    godot::Signal::Signal()
   0.1%    1024   0.1%    1024    godot::PackedFloat32Array::PackedFloat32Array()
   0.1%    1000   0.1%    1000    godot::ClassDB::_create_instance_func<>()
   0.1%     964   0.1%     964    godot::Basis::rotated()
   0.1%     957   0.0%       0    [__TEXT,__const]
   0.0%     952   0.1%     952    godot::Callable::Callable()
   0.0%     944   0.1%     944    godot::Dictionary::Dictionary()
   0.0%     928   0.1%     928    godot::create_method_bind<>()
   0.0%     916   0.1%     916    godot::PackedVector2Array::PackedVector2Array()
   0.0%     912   0.0%       0    [__DATA,__data]
   0.0%     900   0.1%     900    godot::_err_print_index_error()
   0.0%     896   0.1%     896    godot::ClassDB::add_signal()
   0.0%     884   0.1%     884    godot::PropertyInfo::from_dict()
   0.0%     881   0.1%     882    godot::MethodBind
   0.0%     860   0.1%     860    godot::RID::RID()
   0.0%     840   0.1%     840    godot::ClassDB::add_property_group()
   0.0%     840   0.1%     840    godot::GDExtensionBinding::InitObject::InitObject()
   0.0%     816   0.1%     816    godot::Basis::rotate_sh()
   0.0%     816   0.1%     816    godot::PackedFloat64Array::PackedFloat64Array()
   0.0%     796   0.1%     796    std::__1::__hash_table<>::__assign_multi<>()
   0.0%     780   0.1%     780    __ZN5godot7CowDataIDsE6resizeILb0EEENS_5ErrorEx
   0.0%     384   0.1%     768    godot::Dictionary::_method_bindings
   0.0%     748   0.1%     748    godot::EditorPlugins::remove_plugin_class()
   0.0%     732   0.1%     732    godot::Projection::Projection()
   0.0%     728   0.1%     728    std::__1::vector<>::reserve()
   0.0%     724   0.1%     724    godot::Basis::rotate_local()
   0.0%     716   0.1%     716    __ZN5godot7CowDataIDiE6resizeILb0EEENS_5ErrorEx
   0.0%     712   0.1%     712    godot::ClassDB::bind_method_godot()
   0.0%     708   0.1%     708    godot::CharStringT<>::CharStringT()
   0.0%     708   0.1%     708    godot::ClassDB::get_instance_binding_callbacks()
   0.0%     696   0.1%     696    __ZNSt3__13setIN5godot10StringNameENS_4lessIS2_EENS_9allocatorIS2_EEE6insertB8nn180100INS_21__tree_const_iteratorIS2_PNS_11__tree_nodeIS2_PvEElEEEEvT_SF_
 100.0%  1.82Mi 100.0%   960Ki    TOTAL

Compileunits, first 100:

   FILE SIZE        VM SIZE    
 --------------  -------------- 
  28.4%   528Ki  25.4%   243Ki    [__LINKEDIT]
  24.2%   450Ki   2.2%  21.1Ki    [__TEXT,__text]
   3.1%  58.5Ki   5.1%  49.4Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/canvas_item.cpp
   3.1%  56.8Ki   5.6%  53.5Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/node.cpp
   2.4%  45.0Ki   4.7%  45.0Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/window.cpp
   2.3%  43.0Ki   3.9%  37.0Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/string.cpp
   2.3%  42.9Ki   3.7%  35.9Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/string_name.cpp
   2.1%  39.4Ki   3.4%  32.7Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/core/class_db.cpp
   2.1%  38.2Ki   0.6%  6.19Ki    [__TEXT,__cstring]
   2.0%  37.2Ki   3.3%  31.4Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/char_string.cpp
   1.6%  29.7Ki   3.0%  28.6Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/godot.cpp
   1.4%  26.5Ki   2.1%  19.8Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/modules/game/register_types.cpp
   1.3%  24.9Ki   2.5%  23.9Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/quaternion.cpp
   1.2%  22.2Ki   1.7%  16.0Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/variant.cpp
   1.1%  21.0Ki   1.8%  17.7Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/class_db_singleton.cpp
   1.1%  20.9Ki   1.9%  18.5Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/basis.cpp
   1.1%  20.3Ki   2.1%  20.3Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/object.cpp
   1.1%  19.9Ki   1.8%  17.3Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/projection.cpp
   1.1%  19.9Ki   1.7%  16.7Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/array.cpp
   1.1%  19.9Ki   1.7%  16.5Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_byte_array.cpp
   1.0%  18.2Ki   0.0%       0    [Unmapped]
   1.0%  17.7Ki   0.2%  2.15Ki    [__DATA]
   0.9%  16.1Ki   0.2%  1.70Ki    [__TEXT]
   0.8%  14.9Ki   0.4%  3.44Ki    [__DATA_CONST]
   0.7%  12.3Ki   1.0%  9.77Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/mesh.cpp
   0.6%  12.0Ki   1.0%  9.92Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/dictionary.cpp
   0.6%  10.8Ki   0.9%  8.84Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_vector3_array.cpp
   0.6%  10.5Ki   0.9%  8.75Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_float32_array.cpp
   0.6%  10.4Ki   0.9%  8.44Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_vector2_array.cpp
   0.6%  10.3Ki   0.9%  8.67Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_int64_array.cpp
   0.5%  10.2Ki   0.9%  8.45Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_float64_array.cpp
   0.5%  10.1Ki   0.9%  8.31Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_string_array.cpp
   0.5%  10.1Ki   0.9%  8.90Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_color_array.cpp
   0.5%  10.0Ki   0.9%  8.37Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_int32_array.cpp
   0.0%       0   1.0%  9.55Ki    [__DATA,__common]
   0.5%  9.03Ki   0.5%  4.52Ki    [__DATA_CONST,__const]
   0.5%  8.88Ki   0.7%  6.99Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/packed_vector4_array.cpp
   0.5%  8.66Ki   0.8%  7.42Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/callable.cpp
   0.4%  8.31Ki   0.8%  7.25Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/node_path.cpp
   0.4%  8.05Ki   0.7%  7.17Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/core/object.cpp
   0.4%  7.93Ki   0.6%  5.87Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/modules/game/example2.cpp
   0.0%       0   0.7%  7.12Ki    [__DATA,__bss]
   0.4%  6.94Ki   0.7%  6.94Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/resource.cpp
   0.3%  6.50Ki   0.6%  5.56Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/signal.cpp
   0.3%  5.30Ki   0.6%  5.30Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/texture2d.cpp
   0.3%  5.04Ki   0.5%  5.04Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/style_box.cpp
   0.2%  4.16Ki   0.2%  2.05Ki    [Mach-O Headers]
   0.2%  4.10Ki   0.4%  3.69Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/classes/editor_plugin_registration.cpp
   0.2%  4.08Ki   0.4%  3.46Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/plane.cpp
   0.2%  3.94Ki   0.3%  3.20Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/vector3.cpp
   0.2%  3.73Ki   0.4%  3.64Ki    [__TEXT,__unwind_info]
   0.2%  3.73Ki   0.2%  2.05Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/variant/packed_arrays.cpp
   0.2%  3.54Ki   0.3%  2.94Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/variant/rid.cpp
   0.2%  3.42Ki   0.3%  2.90Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/material.cpp
   0.2%  2.98Ki   0.2%  2.21Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/classes/wrapped.cpp
   0.1%  2.61Ki   0.2%  1.80Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/core/method_bind.cpp
   0.1%  2.40Ki   0.2%  1.47Ki    [__TEXT,__const]
   0.1%  2.37Ki   0.2%  1.92Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/core/error_macros.cpp
   0.1%  1.78Ki   0.1%     912    [__DATA,__data]
   0.1%  1.30Ki   0.1%     906    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/main_loop.cpp
   0.1%  1.00Ki   0.1%     834    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/src/core/memory.cpp
   0.0%     568   0.0%       0    [__TEXT,__init_offsets]
   0.0%     464   0.0%     232    [__DATA,__la_symbol_ptr]
   0.0%     372   0.0%       0    [__TEXT,__stub_helper]
   0.0%     372   0.0%       0    [__TEXT,__stubs]
   0.0%      96   0.0%      48    [__DATA,__thread_vars]
   0.0%      96   0.0%      48    [__DATA_CONST,__got]
   0.0%       0   0.0%      16    [__DATA,__thread_bss]
   0.0%       4   0.0%       2    [__TEXT,__ustring]
 100.0%  1.82Mi 100.0%   960Ki    TOTAL
jordo commented 1 month ago

Looking at translation unit binary sizes, godot-cpp/gen/src/classes/canvas_item.cpp (which I'm not using at all in my extension class, since it inherits from Node3D) is 58.5Ki, so it seems to be a good candidate for further inspection. Let's take a look at a candidate function to see what's going on:

void CanvasItem::set_visible(bool p_visible) {
    static GDExtensionMethodBindPtr _gde_method_bind = internal::gdextension_interface_classdb_get_method_bind(CanvasItem::get_class_static()._native_ptr(), StringName("set_visible")._native_ptr(), 2586408642);
    CHECK_METHOD_BIND(_gde_method_bind);
    int8_t p_visible_encoded;
    PtrToArg<bool>::encode(p_visible, &p_visible_encoded);
    internal::_call_native_mb_no_ret(_gde_method_bind, _owner, &p_visible_encoded);
}

The above is the generated godot-cpp GDExtension glue for this function. At first glance, it seems reasonable: a costly execution path/lookup on the first hit, but the resulting method_bind pointer is stored in a static local, which ends up as a conditional branch on subsequent calls. The rest is setting up the arguments and calling through to the Godot endpoint. Let's take a look at the disassembly here:

00047010  55                 push    rbp {__saved_rbp}
00047011  4889e5             mov     rbp, rsp {__saved_rbp}
00047014  4157               push    r15 {__saved_r15}
00047016  4156               push    r14 {__saved_r14}
00047018  4155               push    r13 {__saved_r13}
0004701a  4154               push    r12 {__saved_r12}
0004701c  53                 push    rbx {__saved_rbx}
0004701d  4883ec18           sub     rsp, 0x18
00047021  4189f6             mov     r14d, esi
00047024  4889fb             mov     rbx, rdi
00047027  488b05eaaf0200     mov     rax, qword [rel ___stack_chk_guard]
0004702e  488b00             mov     rax, qword [rax]
00047031  488945d0           mov     qword [rbp-0x30 {var_38}], rax
00047035  0fb6053cfb0200     movzx   eax, byte [rel guard_variable_for_godot...::set_visible(bool)::_gde_method_bind]
0004703c  84c0               test    al, al
0004703e  7445               je      0x47085

00047040  448875c7           mov     byte [rbp-0x39 {var_41}], r14b
00047044  488b3d25fb0200     mov     rdi, qword [rel godot::CanvasItem::set_visible(bool)::_gde_method_bind]
0004704b  488b7310           mov     rsi, qword [rbx+0x10]
0004704f  488d45c7           lea     rax, [rbp-0x39 {var_41}]
00047053  488945c8           mov     qword [rbp-0x38 {var_40}], rax {var_41}
00047057  488d0532d80200     lea     rax, [rel _gdextension_interface_object_method_bind_ptrcall]
0004705e  488d55c8           lea     rdx, [rbp-0x38 {var_40}]
00047062  31c9               xor     ecx, ecx  {0x0}
00047064  ff10               call    qword [rax]  {_gdextension_interface_object_method_bind_ptrcall}
00047066  488b05abaf0200     mov     rax, qword [rel ___stack_chk_guard]
0004706d  488b00             mov     rax, qword [rax]
00047070  483b45d0           cmp     rax, qword [rbp-0x30 {var_38}]
00047074  7577               jne     0x470ed

00047076  4883c418           add     rsp, 0x18
0004707a  5b                 pop     rbx {__saved_rbx}
0004707b  415c               pop     r12 {__saved_r12}
0004707d  415d               pop     r13 {__saved_r13}
0004707f  415e               pop     r14 {__saved_r14}
00047081  415f               pop     r15 {__saved_r15}
00047083  5d                 pop     rbp {__saved_rbp}
00047084  c3                 retn     {__return_addr}

00047085  488d3decfa0200     lea     rdi, [rel guard_variable_for_godot...::set_visible(bool)::_gde_method_bind]
0004708c  e8c1160200         call    ___cxa_guard_acquire
00047091  85c0               test    eax, eax
00047093  74ab               je      0x47040

00047095  488d059cd80200     lea     rax, [rel _gdextension_interface_classdb_get_method_bind]
0004709c  4c8b28             mov     r13, qword [rax]  {_gdextension_interface_classdb_get_method_bind}
0004709f  e8fcfeffff         call    godot::CanvasItem::get_class_static
000470a4  4989c7             mov     r15, rax
000470a7  488d35a8790200     lea     rsi, [rel data_6ea56]  {"set_visible"}
000470ae  4c8d65c8           lea     r12, [rbp-0x38 {var_40}]
000470b2  4c89e7             mov     rdi, r12 {var_40}
000470b5  31d2               xor     edx, edx  {0x0}
000470b7  e88420feff         call    godot::StringName::StringName
000470bc  bac276299a         mov     edx, 0x9a2976c2
000470c1  4c89ff             mov     rdi, r15  {godot::CanvasItem::get_class_static()::string_name}
000470c4  4c89e6             mov     rsi, r12 {var_40}
000470c7  41ffd5             call    r13
000470ca  4989c7             mov     r15, rax
000470cd  4c89e7             mov     rdi, r12 {var_40}
000470d0  e83baefeff         call    godot::StringName::~StringName
000470d5  4c893d94fa0200     mov     qword [rel godot::CanvasItem::set_visible(bool)::_gde_method_bind], r15
000470dc  488d3d95fa0200     lea     rdi, [rel guard_variable_for_godot...::set_visible(bool)::_gde_method_bind]
000470e3  e870160200         call    ___cxa_guard_release
000470e8  e953ffffff         jmp     0x47040

000470ed  e872160200         call    ___stack_chk_fail

The compiler has moved the static method_bind guard check, lookup and assignment to the end of the subroutine, so at a slightly higher-level IL this logically ends up looking something like this:

[Image: Screenshot 2024-10-16 at 4.47.17 PM, higher-level IL view of CanvasItem::set_visible]
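For readers without the screenshot, the control flow visible in the disassembly above can be sketched in plain C++ roughly as follows. This is a hand-written approximation, not the actual IL output: slow_lookup, fake_ptrcall, s_guard, s_guard_mutex and s_bind are all hypothetical stand-ins, and the atomic flag plus mutex only imitate what the guard variable and ___cxa_guard_acquire/___cxa_guard_release do for a function-local static.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// All names below are hypothetical stand-ins, not godot-cpp or GDExtension API.
using MethodBindPtr = const void *;

static int g_lookups = 0;  // how many times the slow (first-hit) path ran
static int g_ptrcalls = 0; // how many times the call-through ran
static const int g_dummy_bind = 0;

// Stands in for get_class_static(), the StringName temporary and the
// classdb_get_method_bind call that make up the first-hit path.
static MethodBindPtr slow_lookup(const char *method, int64_t hash) {
    (void)method;
    (void)hash;
    ++g_lookups;
    return &g_dummy_bind;
}

// Stands in for the object_method_bind_ptrcall call-through.
static void fake_ptrcall(MethodBindPtr bind, const void *const *args) {
    (void)bind;
    (void)args;
    ++g_ptrcalls;
}

static std::atomic<bool> s_guard{false}; // role of the guard variable
static std::mutex s_guard_mutex;         // role of guard acquire/release
static MethodBindPtr s_bind = nullptr;   // role of _gde_method_bind

void set_visible_sketch(bool p_visible) {
    // Subsequent calls: one load and one branch.
    if (!s_guard.load(std::memory_order_acquire)) {
        // First hit: take the lock, look up the bind, publish it.
        std::lock_guard<std::mutex> lock(s_guard_mutex);
        if (!s_guard.load(std::memory_order_relaxed)) {
            s_bind = slow_lookup("set_visible", 2586408642LL);
            s_guard.store(true, std::memory_order_release);
        }
    }
    assert(s_bind != nullptr); // stands in for CHECK_METHOD_BIND
    int8_t encoded = p_visible ? 1 : 0;
    const void *args[1] = { &encoded };
    fake_ptrcall(s_bind, args);
}
```

Calling set_visible_sketch twice runs slow_lookup once and fake_ptrcall twice, which is the behaviour the static local gives the generated code.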

The first thing I notice about the implementation here is that the first-hit (initialization) path is already heavy: there are 4 calls plus the construction and destruction of a StringName (global lock). The assembly for the if statement is below:

[Image: Screenshot 2024-10-16 at 5.00.06 PM, assembly for the if statement]

So the first-hit path is already slow (relatively speaking), and there's really no reason to inline all of this into every single function in godot-cpp. Replacing this:

static GDExtensionMethodBindPtr _gde_method_bind =  gdextension_interface_classdb_get_method_bind(ZIPReader::get_class_static()._native_ptr(), StringName("open")._native_ptr(), 166001499);

With something say like this:

static GDExtensionMethodBindPtr _gde_method_bind = GlobalHelperFunction(this, "open", 166001499);

Something like the above should help with binary bloat and have effectively no performance cost, as the only added cost is a single extra level of indirection on the first hit (which is always slow already). The static class pointer lookup and the StringName construction and destruction follow exactly the same execution path for all of these checks, so they could be moved into a global helper function instead of being inlined everywhere.
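As a minimal sketch of that idea, assuming a helper that takes the class name as a C string (the proposal above passes this instead), the pattern could look like the following. engine_get_method_bind, get_method_bind_helper and ZipReaderSketch are hypothetical names used only for illustration; the first one merely imitates the engine-side lookup with a map, and example_method with hash 1 is made up.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins, not godot-cpp or GDExtension API.
using MethodBindPtr = const void *;

static int g_engine_lookups = 0;

// Imitates the engine-side lookup: one stable address per (class, method).
MethodBindPtr engine_get_method_bind(const std::string &cls,
                                     const std::string &method,
                                     int64_t hash) {
    static std::map<std::pair<std::string, std::string>, int64_t> table;
    ++g_engine_lookups;
    auto it = table.emplace(std::make_pair(cls, method), hash).first;
    return &it->second; // std::map nodes do not move, so this stays valid
}

// Single out-of-line helper: string construction, lookup and string
// destruction are emitted once here instead of once per generated method.
MethodBindPtr get_method_bind_helper(const char *cls, const char *method,
                                     int64_t hash) {
    return engine_get_method_bind(cls, method, hash);
}

struct ZipReaderSketch {
    // Each call site now only loads two constants and a hash, then calls.
    MethodBindPtr open_bind() {
        static MethodBindPtr bind =
                get_method_bind_helper("ZIPReader", "open", 166001499);
        return bind;
    }

    // Made-up second method, to show that every method keeps its own cache.
    MethodBindPtr example_bind() {
        static MethodBindPtr bind =
                get_method_bind_helper("ZIPReader", "example_method", 1);
        return bind;
    }
};
```

The static local still caches the result per method, so repeated calls never reach the helper again; only the first hit pays for the extra call.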

I'll try to put together a simple PR to evaluate whether there are any wins here.

jordo commented 1 month ago

OK, so a quick refactor of some of the generated code to call a single common global helper function to retrieve the GDExtensionMethodBindPtr resulted in ~15% file size savings for call-heavy classes. Results for canvas_item.cpp:

Current size: 3.1% 58.5Ki 5.1% 49.4Ki /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/canvas_item.cpp

Updated size:

2.8%  49.2Ki   4.4%  40.8Ki    /Users/jordanschidlowsky/WINTERPIXEL/PROJECTS/ballz/godot-cpp/gen/src/classes/canvas_item.cpp

So 58.5Ki -> 49.2Ki with a small refactor here.
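As a quick check of the arithmetic, the relative savings can be computed from the two Bloaty rows above. percent_saved is a hypothetical helper name; the inputs are the file size and VM size figures quoted for canvas_item.cpp.

```cpp
#include <cmath>

// Relative size reduction in percent, given sizes before and after a change.
double percent_saved(double before, double after) {
    return (before - after) / before * 100.0;
}

// File size: percent_saved(58.5, 49.2) is roughly 15.9
// VM size:   percent_saved(49.4, 40.8) is roughly 17.4
```

Both figures are in line with the ~15% estimate for call-heavy classes.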

Existing Generator:

static GDExtensionMethodBindPtr _gde_method_bind = internal::gdextension_interface_classdb_get_method_bind({class_name}::get_class_static()._native_ptr(), StringName("{method["name"]}")._native_ptr(), {method["hash"]});

New generator and global helper function, takes const char* and int64_t:

static GDExtensionMethodBindPtr _gde_method_bind = GDExtensionBinding::get_method_bind_static_helper("{method["name"]}", {method["hash"]});

Generated asm before, higher level IL:

[Image: Screenshot 2024-10-16 at 7.58.04 PM, higher-level IL before the change]

New asm, higher level IL:

[Image: Screenshot 2024-10-16 at 7.58.25 PM, higher-level IL after the change]

Loading the const char* and the magic-number hash ends up encoding to two instructions:

[Image: Screenshot 2024-10-16 at 8.02.04 PM, the two load instructions]
Faless commented 1 month ago

Compiling the simplest of extensions, a single file single including a single header results in a 2MB library. Example:

I tried compiling that extension (plus the proper registration, see the attached project) using godot-cpp 4.3 branch (1cce4d15abc3afb22724f9bf083ed7769330b43e), and emcc 3.1.64 using the emsdk installation:

# emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.64 (a1fe3902bf73a3802eae0357d273d0e37ea79898)
# wasm32-clang --version
clang version 19.0.0git (https://github.com/llvm/llvm-project 4d8e42ea6a89c73f90941fd1b6e899912e31dd34)
Target: wasm32

This resulted in binaries way smaller than the 2MiB mentioned in the OP:

# scons platform=web
872K    project/bin/libgdexample.web.template_debug.wasm32.wasm

# scons platform=web target=template_release
876K    project/bin/libgdexample.web.template_release.wasm32.wasm

I also tried building with a dedicated build profile (also attached in the project):

# scons platform=web build_profile=build_profile.json 
184K    project/bin/libgdexample.web.template_debug.wasm32.wasm

# scons platform=web target=template_release build_profile=build_profile.json 
184K    project/bin/libgdexample.web.template_release.wasm32.wasm

This is as much as I can do to test these claims, given you did not provide a full MRP, nor the relevant toolchain information.

MRP (to be extracted as godot-cpp subfolder, or you should adjust the godot-cpp include to match the proper location): wasm-test.zip

wasms.zip

Note: Reported results are for threads builds, but nothreads results in similar sizes.

EDIT:

Since in the OP you mentioned godot-cpp 4.4-dev, I'm re-running the builds with current master (a98d41f62bdb8b7aa903e8e37c1faa48fe8fdae8). (I wanted to be able to run the wasm files and didn't have a 4.4-dev build handy.)

Here are the results:

Regular:

1,1M    project/bin/libgdexample.web.template_debug.wasm32.wasm
1,1M    project/bin/libgdexample.web.template_release.wasm32.wasm

Profile:

228K    project/bin/libgdexample.web.template_debug.wasm32.wasm
228K    project/bin/libgdexample.web.template_release.wasm32.wasm
dsnopek commented 1 month ago

@jordo Thanks for creating this issue! Finding ways to reduce the binary size of extensions, especially for the web or mobile, is a worthy goal :-)

Replacing this:

static GDExtensionMethodBindPtr _gde_method_bind =  gdextension_interface_classdb_get_method_bind(ZIPReader::get_class_static()._native_ptr(), StringName("open")._native_ptr(), 166001499);

With something say like this:

static GDExtensionMethodBindPtr _gde_method_bind = GlobalHelperFunction(this, "open", 166001499);

A change like this makes sense to me.

Something else @Faless and I discussed while we were at GodotCon was the possibility of replacing the function name with a hash. In this case, it probably wouldn't make much difference (if we used uint32_t for the hash, that'd be 4 bytes, which is the same as "open" not counting the null terminator), but for a class like RenderingServer it could make a bigger difference. However, this would require changes in upstream Godot, so we'd need to measure if the difference would really be worth it before going through all that trouble.
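To get a feel for the per-call-site numbers, here is a small sketch comparing the bytes needed for a method name stored as a C string literal with a fixed 4-byte hash. FNV-1a is used purely as an illustrative 32-bit hash; it is an assumption swapped in for the example, not what Godot or godot-cpp uses, and fnv1a_32, name_bytes and bytes_saved are hypothetical names. Unlike the comparison above, name_bytes counts the null terminator.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// 32-bit FNV-1a over a NUL-terminated string (illustrative hash only).
uint32_t fnv1a_32(const char *s) {
    uint32_t h = 2166136261u; // FNV offset basis
    while (*s) {
        h ^= static_cast<uint8_t>(*s++);
        h *= 16777619u; // FNV prime
    }
    return h;
}

// Bytes a method name occupies as a string literal, including the NUL.
std::size_t name_bytes(const char *name) {
    return std::strlen(name) + 1;
}

// Bytes saved per name by storing a uint32_t hash instead of the string.
// Can be zero or negative for very short names.
long bytes_saved(const char *name) {
    return static_cast<long>(name_bytes(name)) -
           static_cast<long>(sizeof(uint32_t));
}
```

For "open" the saving is a single byte, while a longer name such as "set_visible" saves 8, which is why classes with many long method names are the more interesting case to measure.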

jordo commented 1 month ago

@Faless Sorry, can you build and post your wasm artifacts with debug symbols? I would like to take a look at the disassembly for those. Specifically the two you posted at the end of your comment:

jordo commented 1 month ago

This resulted in binaries way smaller than the 2MiB mentioned in the OP:

The 2MiB size I mentioned in the OP is for a Mach-O universal binary, so it includes two architectures. A single-architecture Mach-O build, or a build targeting other binary formats, results in a ~1MB file, which is similar to what you have posted.

Faless commented 1 month ago

~So @dsnopek there seem to be a considerable size regression between 4.3 and 4.4 (see my updated comment above).~ See below, WASM-only, due to WASM bigint support (which is required for proper compatibility with uint64_t)

One thing to note is that, unlike Godot itself, we don't build with optimize=size by default when building for Web, so I guess that should also be tested (I'll run a whole bunch of new builds to test with that option).

dsnopek commented 1 month ago

@Faless Thanks for catching that!

As just a random guess, I wonder if the TypedDictionary changes are the cause of the size regression? They also led to a build time regression - see https://github.com/godotengine/godot-cpp/issues/1610

Faless commented 1 month ago

Results with optimize=size (4.4-dev, a98d41f62bdb8b7aa903e8e37c1faa48fe8fdae8):

Regular:

# scons platform=web target=template_debug optimize=size
960K    project/bin/libgdexample.web.template_debug.wasm32.wasm
# scons platform=web target=template_release optimize=size
956K    project/bin/libgdexample.web.template_release.wasm32.wasm

Build profile:

216K    project/bin/libgdexample.web.template_debug.wasm32.wasm
212K    project/bin/libgdexample.web.template_release.wasm32.wasm
Faless commented 1 month ago

As just a random guess, I wonder if the TypedDictionary changes are the cause of the size regression? They also led to a build time regression - see #1610

@dsnopek well, never mind, it seems that the regression is due to #1603 (which is needed), and in fact I can't reproduce it on Linux. I guess emscripten just adds more code to handle (u)int64_t via BigInts.