CensoredUsername / dynasm-rs

A dynasm-like tool for rust.
https://censoredusername.github.io/dynasm-rs/language/index.html
Mozilla Public License 2.0
716 stars 52 forks source link

Instruction cache not flushed after calling Assembler::alter on AArch64 #50

Open losfair opened 3 years ago

losfair commented 3 years ago

It seems that Assembler::alter (which uses mprotect) does not flush the instruction cache after code has changed, and subsequent execution sometimes still uses the old code.

This issue can only be reproduced on a real AArch64 CPU (qemu-aarch64 on x86-64 machines works fine).

Adding explicit ic ivau instructions after calling .alter fixes this.

CensoredUsername commented 3 years ago

I knew that aarch64 has incoherent instruction caches but for some reason I assumed that the relevant mprotect calls would handle those flushes already. I'll add this to the default behaviour on aarch64 👍

CensoredUsername commented 3 years ago

Do you know by any chance if there's a stable interface in the rust stdlib for performing this operation? I can't find it and I'd rather not have to drop to unsafe behaviour just for this one part of the library.

CensoredUsername commented 3 years ago

Also, for portability reasons it'll probably have to be an IC IALLU instruction for simplicity, as IC IVAU needs to be performed on a cache-line basis which for larger alter ops would get very complex.

CensoredUsername commented 3 years ago

Scratch that, I've got a cache-line granularity fix architected but I'm investigating the best way of getting this running on stable rust.

losfair commented 3 years ago

I'm not aware of any stdlib interfaces for flushing caches.

Maybe generate the cache flushing instructions with dynasm itself for stable rust?

CensoredUsername commented 3 years ago

There's a bunch of options:

CensoredUsername commented 3 years ago

I've done some preliminary work for this at 206cc7b . I'm trying to find a way to test this properly but my VM provider stopped offering aarch64 instances last month for some reason so this might take a bit longer. It should work but no promises of not just getting a SIGILL. It also may still need some dc civac instructions before the I$ invalidation, I have to recheck the docs for that. For now I'm invalidating the cache on a patch by patch basis of the alter API but it might be easier to switch to just invalidating the whole buffer when it's altered.

losfair commented 3 years ago

Looks nice to me! Will try it. Thanks!

CensoredUsername commented 2 years ago

With inline asm stabilized in rust for aarch64 this can now be implemented in a less hacky way, so I'll get back on this soon.

CensoredUsername commented 1 year ago

update: don't have the time to finish this for 2.0, but it shouldn't be a breaking change once we're in there. Postponing this for now.

MarkMcCaskey commented 1 year ago

I came across this problem on my own on an M1 mac and ended up finding out about sys_icache_invalidate which seems to be working well for me so far. I don't have enough context to compare or contrast it with ic ivau, but I'm mentioning it here in case it helps anyone in the future.