Open liunianzmj opened 9 months ago
That is strange, as the exact same command line passed CI... The once_cell crate didn't need to specify alloc before...
I'll look into it. Perhaps it's a new version of once_cell.
In the meantime you can specifically include once_cell/alloc in your dependencies.
Thank you very much, that problem has been solved. But now there is a new problem: I want to run Rhai on an STM32 microcontroller with 128KB of memory. Is that possible? So far the following error has occurred
The standard Rhai with all libraries is too large to fit inside 128KB.
You'd want to use a minimal build. Check this out: https://rhai.rs/book/start/builds/minimal.html
It still doesn't work
Well, did you use Engine::new_raw?
Otherwise you'd be pulling in the entire standard library which will be very large.
I would also say that, if you want to keep arrays and maps, it would probably be difficult to fit it inside 128KB, possibly together with your own code as well...
I used Engine::new_raw and the features are also enabled, but it still does not work. What is the minimum memory required?
How do I get rhai to use this memory?
Hhhmmm... No idea. I'm not an embedded programmer myself...
Maybe you can search on the net... Probably you'll need a custom allocator that can allocate from specific locations.
BTW I think I'd also need to turn off memory-heavy stuff like string interning and function resolution caching... Maybe I'll add that into Rhai.
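To make the "custom allocator that can allocate from specific locations" idea concrete, here is a minimal sketch of a bump allocator that only hands out memory from its own fixed arena. It is not what Rhai or embedded_alloc do internally; the names BumpArena and ARENA_SIZE are made up for illustration, and the arena size is an arbitrary value. On a real part the static would have to be placed in the desired RAM region (for example with a #[link_section] attribute matching your linker script) and registered with #[global_allocator].

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::ptr;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Size of the region handed to the allocator (hypothetical value).
const ARENA_SIZE: usize = 1024;

/// Hypothetical bump allocator: memory comes only from `mem`, never from
/// anywhere else, so placing this static decides where the heap lives.
pub struct BumpArena {
    mem: UnsafeCell<[u8; ARENA_SIZE]>,
    /// Offset of the first free byte inside `mem`.
    next: AtomicUsize,
}

// Safety: offsets are claimed atomically, so two allocations never overlap.
unsafe impl Sync for BumpArena {}

impl BumpArena {
    pub const fn new() -> Self {
        Self {
            mem: UnsafeCell::new([0; ARENA_SIZE]),
            next: AtomicUsize::new(0),
        }
    }

    /// Bytes consumed so far, including alignment padding.
    pub fn used(&self) -> usize {
        self.next.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for BumpArena {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base_ptr = self.mem.get() as *mut u8;
        let base = base_ptr as usize;
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let start = (base + cur + layout.align() - 1) & !(layout.align() - 1);
            let offset = start - base;
            let end = match offset.checked_add(layout.size()) {
                Some(e) if e <= ARENA_SIZE => e,
                _ => return ptr::null_mut(), // arena exhausted
            };
            match self
                .next
                .compare_exchange_weak(cur, end, Ordering::AcqRel, Ordering::Relaxed)
            {
                Ok(_) => return unsafe { base_ptr.add(offset) },
                Err(seen) => cur = seen, // another caller moved `next`; retry
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never reuses memory, so freeing is a no-op.
    }
}
```

A bump allocator never frees, so it is only a starting point; a scripting engine that allocates and drops values continuously would probably want a real free-list heap pointed at the same region instead.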
Thank you very much. I have solved this problem, but I found that about 160KB of FLASH was added to the packaged program after adding rhai. How can I solve this problem?
about 160KB of FLASH was added to the packaged program
As I mentioned, it is quite difficult to fit inside 64K if you don't take out language features. I remember some user successfully packed it under 64K, but he had to disable arrays and objects support (no_index and no_object). However, that was quite a long time ago for a very old version, and much more code has been added since then.
If you want to pack under 128K, then I think it is possible...
Are you sure you're making a release build with optimization for size?
Also, have you stripped your binary? Unix symbol tables can be quite huge.
Yes, I'm making an optimized size version now. How to reduce the size? Currently I have disabled no_index, no_object, but it still needs 143KB.
Have you stripped the binary?
It's stripped, but it only got 1KB smaller
Try to do a strip my_bin.obj etc. to see if it gets smaller...
Just curious how the status is.
Did you succeed to squeeze it down further?
another curious user here :smile:
Since it seems like folks are still curious, here are my initial results on a Cortex-M7 (substantially similar to the M4). I'm not sure what the original poster was trying to do and whether this will be at all helpful to them, since they're asking about code size and then highlighting an SRAM, which is not typically where code lives on these parts.
Try to do a strip my_bin.obj etc. to see if it gets smaller...
FWIW stripping the binary doesn't help it go into flash, in general, because these systems don't get the debug symbols in flash. Only the text and data initialization image is actually written.
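As a rough illustration of that point, the flash and RAM footprints can be estimated from the section sizes a tool such as size reports. This is only a sketch: the Sections type is made up for illustration, and it assumes the usual convention that the text column already includes read-only data. Debug symbols do not appear in the calculation at all, which is why stripping leaves the flash image unchanged.

```rust
/// Section sizes in bytes, as a tool such as `size` would report them.
/// Hypothetical helper type, not part of any real crate.
#[derive(Clone, Copy, Debug)]
pub struct Sections {
    /// Code plus read-only data.
    pub text: u32,
    /// Initialised read-write data (its initial image is stored in flash).
    pub data: u32,
    /// Zero-initialised data (occupies RAM only).
    pub bss: u32,
}

impl Sections {
    /// Bytes actually written to flash: code/constants plus the `.data` image.
    pub fn flash(&self) -> u32 {
        self.text + self.data
    }

    /// Bytes of RAM reserved statically: `.data` plus `.bss`.
    pub fn ram(&self) -> u32 {
        self.data + self.bss
    }
}
```

For example, with made-up values of text = 140,000, data = 1,024 and bss = 16,384, the image needs 141,024 bytes of flash and 17,408 bytes of statically reserved RAM.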
All measurements are taken at opt-level = "z" and lto = true in release, on Rhai 1.18, using Engine::new_raw with no additional packages loaded except where noted. Rust 1.77.2.
Default no_std build: 421,692 bytes
Turning on size reduction features individually:
Turning them all on gets the build down to 148,876, though it produces a language that doesn't meet my needs (no functions, for instance).
Turning back off only no_function (I like functions), no_index (I wanted arrays), and only_i32 (arrays of bytes, specifically): 240,008.
Adding CorePackage to that: 432,756
Adding StandardPackage instead: 693,800
Both get significantly smaller if you enable only_i32 but, for my use case, I kinda need u8.
I looked into what was responsible for the significant size increase when adding CorePackage to the image, and it turns out that the various packages' init routines are generating really big code. A typical one consists of repeated sequences like the one below, over and over, always ending in set_into_module_raw:
8039a38: f642 3003 movw r0, #11011 @ 0x2b03
8039a3c: e9cd b603 strd fp, r6, [sp, #12]
8039a40: f8ad 0088 strh.w r0, [sp, #136] @ 0x88
8039a44: f240 2002 movw r0, #514 @ 0x202
8039a48: f8ad 00a0 strh.w r0, [sp, #160] @ 0xa0
8039a4c: f44f 7080 mov.w r0, #256 @ 0x100
8039a50: 46da mov sl, fp
8039a52: f244 0b20 movw fp, #16416 @ 0x4020
8039a56: e9cd 440a strd r4, r4, [sp, #40] @ 0x28
8039a5a: f2c2 0b00 movt fp, #8192 @ 0x2000
8039a5e: f8cd 408a str.w r4, [sp, #138] @ 0x8a
8039a62: 2108 movs r1, #8
8039a64: f8cd 408e str.w r4, [sp, #142] @ 0x8e
8039a68: f8ad 4092 strh.w r4, [sp, #146] @ 0x92
8039a6c: f8ad 0098 strh.w r0, [sp, #152] @ 0x98
8039a70: 9425 str r4, [sp, #148] @ 0x94
8039a72: 9420 str r4, [sp, #128] @ 0x80
8039a74: 464c mov r4, r9
8039a76: f8cd 8014 str.w r8, [sp, #20]
8039a7a: f8cd 9008 str.w r9, [sp, #8]
8039a7e: f89b 0000 ldrb.w r0, [fp]
8039a82: 2004 movs r0, #4
8039a84: f7cd fd53 bl 800752e <<embedded_alloc::Heap as core::alloc::global::GlobalAlloc>::alloc>
8039a88: 2800 cmp r0, #0
8039a8a: f005 8677 beq.w 803f77c <<rhai::packages::arithmetic::ArithmeticPackage as rhai::packages::Package>::init+0x60b2>
8039a8e: f643 2134 movw r1, #14900 @ 0x3a34
8039a92: e9c0 5500 strd r5, r5, [r0]
8039a96: f6c0 0106 movt r1, #2054 @ 0x806
8039a9a: ad0a add r5, sp, #40 @ 0x28
8039a9c: e9cd 012b strd r0, r1, [sp, #172] @ 0xac
8039aa0: aa02 add r2, sp, #8
8039aa2: 9901 ldr r1, [sp, #4]
8039aa4: ab2a add r3, sp, #168 @ 0xa8
8039aa6: f04f 0803 mov.w r8, #3
8039aaa: 4628 mov r0, r5
8039aac: 46cb mov fp, r9
8039aae: f88d 80a8 strb.w r8, [sp, #168] @ 0xa8
8039ab2: f005 fe67 bl 803f784 <rhai::module::FuncRegistration::set_into_module_raw>
I haven't looked at the code generator, but this sort of thing is pretty common in programs that haven't been written with text size in mind -- my guess is that you've got a code generator producing these init routines as a long series of unique Rust statements, instead of a compact routine driven by a table (which tends to be much smaller), or setting up the datastructures entirely at compile time so they can go into ROM (which tends to be dramatically smaller).
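To illustrate the difference between the two styles, here is a small host-side sketch. It is not Rhai's actual registration code: Registry, BinaryOp and the three arithmetic functions are stand-ins invented for the example, and real Rhai functions have many different signatures. The unrolled version emits one call sequence per function, while the table-driven version keeps a read-only table and a single loop.

```rust
use std::collections::HashMap;

/// Signature shared by every function in this sketch (an assumption made
/// to keep the example small).
type BinaryOp = fn(i32, i32) -> i32;

/// Stand-in for a module's function registry; not Rhai's `Module` type.
#[derive(Default)]
pub struct Registry {
    funcs: HashMap<&'static str, BinaryOp>,
}

impl Registry {
    pub fn register(&mut self, name: &'static str, f: BinaryOp) {
        self.funcs.insert(name, f);
    }

    pub fn call(&self, name: &str, a: i32, b: i32) -> Option<i32> {
        self.funcs.get(name).map(|f| f(a, b))
    }
}

fn add(a: i32, b: i32) -> i32 { a.wrapping_add(b) }
fn sub(a: i32, b: i32) -> i32 { a.wrapping_sub(b) }
fn mul(a: i32, b: i32) -> i32 { a.wrapping_mul(b) }

/// Unrolled style: one registration statement per function, so the
/// generated code grows with every entry.
pub fn init_unrolled(reg: &mut Registry) {
    reg.register("+", add);
    reg.register("-", sub);
    reg.register("*", mul);
}

/// Table-driven style: the entries live in read-only data and a single
/// loop performs all registrations.
static OPS: &[(&str, BinaryOp)] = &[("+", add), ("-", sub), ("*", mul)];

pub fn init_from_table(reg: &mut Registry) {
    for &(name, f) in OPS {
        reg.register(name, f);
    }
}
```

Both functions leave the registry in the same state; with three entries the size difference is negligible, and the saving would only show up with the hundreds of registrations a real package performs.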
All in all, Package::init routines like this account for 105,782 bytes added when including CorePackage.
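For scale, the figures quoted above can be put side by side with a little arithmetic. This sketch assumes that the 240,008-byte build is the baseline for both package measurements and that the 105,782 bytes of init routines are part of the CorePackage increase; the constant and function names are invented for the example.

```rust
/// Image sizes in bytes, copied from the measurements in this post.
const BASE: u32 = 240_008; // functions, arrays and full-width INT, no packages
const WITH_CORE: u32 = 432_756;
const WITH_STANDARD: u32 = 693_800;
const CORE_INIT_ROUTINES: u32 = 105_782;

/// Bytes added by CorePackage over the baseline.
pub fn core_delta() -> u32 {
    WITH_CORE - BASE
}

/// Bytes added by StandardPackage over the baseline.
pub fn standard_delta() -> u32 {
    WITH_STANDARD - BASE
}

/// Share of the CorePackage increase taken by init routines, in parts per
/// thousand (integer division, rounded down).
pub fn init_share_permille() -> u32 {
    (u64::from(CORE_INIT_ROUTINES) * 1000 / u64::from(core_delta())) as u32
}
```

Under those assumptions CorePackage adds 192,748 bytes and StandardPackage adds 453,792 bytes, and the init routines make up roughly 55% of the CorePackage increase.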
So, the system as it stands can fit into larger STM32 parts (I'm building these tests for the STM32H753 with 1MiB of flash) but the codebase doesn't appear to have been written with size (or startup time) in mind. (Which is fine! You haven't claimed otherwise. But I wanted to post these numbers for the next person who tries to fit this into a small microcontroller.)
In case you're curious, here's the test program. I derived it from the no_std example.
#![no_std]
#![no_main]

extern crate alloc;

use core::{mem::MaybeUninit, ptr::addr_of};

use panic_halt as _;
use rhai::{packages::Package, Engine, INT};
use stm32_metapac as _;

use embedded_alloc::Heap;

#[global_allocator]
static HEAP: Heap = Heap::empty();

#[cortex_m_rt::entry]
fn main() -> ! {
    {
        const HEAP_SIZE: usize = 16384;
        static mut HEAP_MEM: [MaybeUninit<u8>; HEAP_SIZE] = [MaybeUninit::uninit(); HEAP_SIZE];
        unsafe {
            HEAP.init(addr_of!(HEAP_MEM) as usize, HEAP_SIZE);
        }
    }

    let mut engine = Engine::new_raw();

    // this bit gets commented out to test size without Core
    let std = rhai::packages::CorePackage::new();
    std.register_into_engine(&mut engine);

    loop {
        // Evaluate a simple expression: 40 + 2
        let _ = engine.eval_expression::<INT>("40 + 2").unwrap() as isize;
        cortex_m::asm::nop();
    }
}
Both get significantly smaller if you enable only_i32 but, for my use case, I kinda need u8.
That's a very interesting observation!
my guess is that you've got a code generator producing these init routines as a long series of unique Rust statements, instead of a compact routine driven by a table (which tends to be much smaller),
You're absolutely correct. That's what the code generator does: generates a bunch of individual function registration calls. And yes, they most probably can go into a table instead...
I'll experiment with that and report back.
That's a very interesting observation!
Yeah, I'm specifically looking at scripting options for doing embedded handling of arrays of bytes. I felt like no_index and only_i32 would make that difficult -- but perhaps I don't totally understand the features (or Rhai, which I freely admit).
There is a builtin data type called Blob which is an array of bytes.
I write device drivers with Rhai so I added that into Rhai many versions ago. Give that a spin.