Open shinmao opened 5 months ago
There is a lot of unsafe stuff happening all over the place in Scryer. It's kind of scary that a lot of them are behind macros which aren't easy to debug. I contributed initial support for running the unit tests with Miri in #2281, which would make debugging such things much easier, but there are a lot of things in streams.rs
, arena.rs
and atom_table.rs
that need to be fixed before Miri can go deeper into these issues. I think most of the current Miri blockers aren't really anything wrong, but just provenance issues that make it so that Miri can't do it's job (relevant: #2019).
The macros seem to be a really foundational part of Scryer, so changing them to be safer will probably be really hard and/or introduce a big performance regression. Relevant quote from the Nomicon:
[...] unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.
Unsafe should be completely encapsulated at module level, but using unsafe in macros like Scryer does ends up putting unsafe everywhere even in modules that could and should probably be 100% safe Rust, which is a big nightmare for debugging Undefined Behavior.
What I can do here is to list out potential issues I considered to be unsafe (also related to misalignment)
machine::loader::<impl machine::Machine>::use_module
In this function, transmute
was wrapped in macro cell_as_load_state_payload
, which will convert a u8
raw pointer into raw pointer of machine::term_stream::LoadStatePayload<machine::term_stream::LiveTermStream>
in line 1472. This could lead to misalinged pointer since LoadStatePayload
has aligned bytes larger than 1 byte (e.g., The struct could be aligned to the field of retraction_info
). heap_print::HCPrinter::<‘a Outputter>::handle_heap_term
In this function, misaligned pointers were created while transmuting u8
pointer into the pointer of std::net::TcpListener
and dashu::rational::RBig
. @bakaq I agreed with your point. However, after we compile the code to MIR, all the macros have been expanded. For example, it would be clear to find that transmute
is used in the function. This method could help us debug more efficiently.
@shinmao I tried to address some of these criticisms in https://github.com/mthom/scryer-prolog/pull/2393 if you don't mind taking a look and perhaps making further suggestions for improvement.
Sorry, mentioned the wrong PR Number
Unsoundness
Hi, it seems that three are several misaligned pointer created from
transmute
are used in the library. For example, in the modulemachine::system_calls::<impl machine::Machine>::socket_server_accept/close
https://github.com/mthom/scryer-prolog/blob/9837187183cf2d7071dd94601c530afc2e280321/src/machine/system_calls.rs#L6563-L6594 At line 6593, the macromatch_untyped_arena_ptr
will matchArenaHeaderTag
and transmute theu8
raw pointer to the raw pointer ofTcpListener
, which is aligned to 4 bytes. The undefined behavior caused by the misaligned pointer dereference can lead to unexpected behaviors that we should avoid. The runtime panic when the system doesn't tolerate the misalignment can even cause the socket operation to fail.The similar issues could also occur in
machine::system_calls::<impl machine::Machine>::http_accept/http_answer