mthom / scryer-prolog

A modern Prolog implementation written mostly in Rust.
BSD 3-Clause "New" or "Revised" License
2k stars 117 forks source link

Compiling and running scryer as a WebAssembly binary? #615

Open rla opened 4 years ago

rla commented 4 years ago

Hello @mthom! Would it be possible to compile and run scryer as a WebAssembly binary? I worked some time ago on SWI-Prolog WebAssembly port but it turned out to be rather difficult due to:

Also, it was quite inconvenient to provide API for JavaScript. It takes hundreds of lines of handwritten code to bind JS functions to low-level FLI, a SWI-Prolog's FFI.

More info: https://github.com/SWI-Prolog/roadmap/issues/43

It seems like Rust has some support for WebAssembly (https://developer.mozilla.org/en-US/docs/WebAssembly/Rust_to_wasm).

What do you think:

cduret commented 4 years ago

I have compiled scryer without signal (nix package) under windows OS

tniessen commented 4 years ago

Would it be possible to build scryer as a WebAssembly binary?

I did that one or two months ago, based on c342d18f926a5e5778204e35f80e44844482ae45. It seems possible, but it certainly isn't trivial.

Compilation problems

  1. The rug dependency doesn't work with WebAssembly. I exchanged a few emails with @tspiteri, and it seemed challenging to cross-compile GMP. Workaround: Remove the rug dependency and change the default features to ["num"], which uses num-rug-adapter.
  2. I couldn't get the crossterm crate to work, which is not really a surprise given the lack of a terminal in WebAssembly, but there might be a solution that I missed. Workaround: Remove crossterm and adapt get_single_char in src/prolog/machine/system_calls.rs to use an imported function raw_mode_single_char instead (which needs to be implemented in JavaScript):

    extern "C" {
      pub fn raw_mode_single_char() -> u32;
    }
    
    pub fn get_single_char() -> char {
      unsafe {
        return char::from_u32(raw_mode_single_char()).unwrap();
      }
    }
  3. ProcessTime won't work in WebAssembly. Workaround: Replace or remove ProcessTime::now() in src/prolog/machine/system_calls.rs.
  4. Signal handling doesn't work in WebAssembly. Workaround: Remove signal handling logic in src/main.rs. (Probably not necessary if you are building the project as a library instead of an executable binary.)

Cross-compiling

I don't remember the exact command I used to compile it, but I used rustup's nightly toolchain for the wasm32-wasi target.

Syscalls in the browser

I implemented necessary syscalls to an extent that allowed to load a prolog source file from a virtual in-browser file system, and to see output written to the stdout file descriptor. Apart from that, I left most of the wasi syscalls unimplemented, but someone with more time could probably get most of them to work.

Result

I got this tiny "Hello world" to work eventually:

:- initialization(hello_world).

hello_world :-
    write('Hello, World!'), nl.

Partial screenshot of the web page (with some debugging info):

prolog-webassembly

I didn't try much more after that, this had already taken me many hours and I mostly did it to gain some experience with Rust → WebAssembly.

rla commented 4 years ago

@tniessen, thank you! That's awesome. Despite these issues, it seems like porting scryer would be easier.

@tniessen, do you happen to still have your changes somewhere? It would be a nice starting point.

tniessen commented 4 years ago

Unless using threads, terminal input cannot work anyway since the blocking read call would also block the browser event loop including input events to read user input. SWI-Prolog wasm version has exactly the same issue.

@rla It might be possible to avoid this in some browsers with only minimal modifications in scryer-prolog. I wrote synchronous-channel a while back to allow WebAssembly in worker threads to retrieve information (such as terminal input) from the main thread. I did not test the library in browsers, but it should work. So the idea would be for the main thread to write terminal input to a SynchronousChannel, and a worker thread running the WebAssembly code could read it from the SynchronousChannel. The library supports both blocking and non-blocking read and write operations.

do you happen to still have your changes somewhere? It would be a nice starting point.

I'll try to find and commit them, but it might take a few days.

jacobfriedman commented 3 years ago

Very interested in a wasm module with TCP/UDP bindings. I'm trying to build the SWI-Prolog WASM just to see where I can find those calls... any idea if I'll be able to sniff those calls out with a nodeJS WASI wrapper?

jacobfriedman commented 3 years ago

@tniessen when you're ready to dedicate some time to get a WASM going, I'm right here with @rla . I'd like to get an environment up... 'one to bind them all' with prolog - scryer's the most lean implementation I've seen. Please let me know if you're willing to work together :)

tniessen commented 3 years ago

Sorry @jacobfriedman and @rla, I am super busy with university, OSS, and everything else right now, but it's still on my list. I'd love to collaborate on this, and I'll try to find time soon-ish, I hope.

tniessen commented 3 years ago

@jacobfriedman @rla I am trying to get it to work again but recent changes have only made it more difficult.

I am not sure why there are so many different crypto libraries in the list of dependencies.

TCP (the sockets library) won't work in WebAssembly, assuming it's supposed to run in a browser.

jacobfriedman commented 3 years ago

Well, a red flag for me: "I am not sure why there are so many different crypto libraries in the list of dependencies." @mthom is there a workaround?

What I've been seeing is that many libraries just wrap C instead of implementing in the native language. I wonder if these dependencies can be replaced with native (rust) drop-ins.

Are you using emscripten to compile, or compiling straight to WASM?

triska commented 3 years ago

Thank you a lot for resuming work on this project!

The cryptographic crates are used to support various cryptographic predicates in library(crypto), and also TLS connections in library(sockets).

The big bouquet of crates is needed because none of the existing crates covers all the functionality that library(crypto) exposes. Some of these crates are more convenient to use than others, and all of them have unique functionality that only they provide, at least in this simplicity.

If you do not need this functionality, you can, in your WASM branch, simply remove all cryptographic predicates from system_calls.rs, starting from https://github.com/mthom/scryer-prolog/blob/10e92eec32ac043fb9e398482ae20f9921c7eee4/src/machine/system_calls.rs#L4749 up to and including https://github.com/mthom/scryer-prolog/blob/10e92eec32ac043fb9e398482ae20f9921c7eee4/src/machine/system_calls.rs#L5197

and then simply remove the inclusion of these crates. This could be useful to start with this project. If at all possible, it would be great to preserve the cryptographic functionality as far as practical also in the WASM port.

Please let me know if you have any questions about this.

jacobfriedman commented 3 years ago

Thank you for the help! @tniessen will you take the wheel on this, as I'm relocating this week? If so please fork/branch & post the URL here :)

triska commented 3 years ago

One additional remark: If you plan to get one of the cryptographic crates working, I suggest to focus on ring. This is the most useful and well designed of the cryptographic crates, and porting it would enable a lot of the cryptographic functionality, most notably secure hashes, encryption and authentication.

The other crates are used for more rarely used hashes, and low-level reasoning over elliptic curves. I plan to get rid of specifically the openssl dependency as soon as a native Rust crate provides the needed functionality for elliptic curves.

infogulch commented 3 years ago

simply remove all cryptographic predicates from system_calls.rs ...

and then simply remove the inclusion of these crates

I wonder if this would be a good use-case for a cfg flag (crypto perhaps?), enabled by default, that if disabled omits the crypto-related system calls and their dependencies. This way, if the project is being built for WASM the cfg flag could simply be disabled.

tniessen commented 3 years ago

@tniessen will you take the wheel on this, as I'm relocating this week?

Sure, I'll try to get a minimal version to work.

jacobfriedman commented 3 years ago

Cool. I can work on it first thing next week. Also- are you working with emscripten?

On Thu., Apr. 8, 2021, 1:06 p.m. Tobias Nießen, @.***> wrote:

@tniessen https://github.com/tniessen will you take the wheel on this, as I'm relocating this week?

Sure, I'll try to get a minimal version to work.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mthom/scryer-prolog/issues/615#issuecomment-815990637, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFKHXWQRJBPYMWT2E5N6NDTHXPBZANCNFSM4OJ3LNXQ .

tniessen commented 3 years ago

@jacobfriedman No, I am trying to get it to work with cargo wasi. Emscripten might come in handy when we need to provide the WASI environment in the browser, but I'd personally prefer a "clean" build without Emscripten first.

tniessen commented 3 years ago

Oh, and best of luck on your move :)

jacobfriedman commented 3 years ago

Thanks!

Ok, good to know. I'll start digging into the docs.

On Thu., Apr. 8, 2021, 1:26 p.m. Tobias Nießen, @.***> wrote:

Oh, and best of luck on your move :)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mthom/scryer-prolog/issues/615#issuecomment-816004157, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFKHXT5QLPOHRBGWQQUZB3THXRNNANCNFSM4OJ3LNXQ .

tniessen commented 3 years ago

I made the crypto and terminal dependencies optional, but getting rid of slice-deque seems like a bigger challenge here. That will probably become a large diff.

jacobfriedman commented 3 years ago

There must be a better way than a huge diff. In my past experiences the problems always trickled down over time.

Maintainers, what do you say?

On Fri., Apr. 9, 2021, 9:57 a.m. Tobias Nießen, @.***> wrote:

I made the crypto and terminal dependencies optional, but getting rid of slice-deque seems like a bigger challenge here. That will probably become a large diff.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mthom/scryer-prolog/issues/615#issuecomment-816701523, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFKHXWMX2FZ7ULKRAPQHKLTH4BU7ANCNFSM4OJ3LNXQ .

tniessen commented 3 years ago

There must be a better way than a huge diff.

If I figure out how to redefine the sdeq! macro and ~if this crate doesn't actually use any properties of SliceDeque that aren't provided by VecDeque~ (it does), it might be quite small. (I'm very new to Rust, so we'll see.)

Edit: the APIs of SliceDeque and VecDeque have subtle differences and are not interchangeable. make_contiguous can be used to access a VecDeque as a slice. I suppose the only way to solve this without causing an even bigger mess is to define another Deque implementation that either uses SliceDeque or VecDeque internally, but I'm struggling with the implementation.

triska commented 3 years ago

Is it worth trying to ask the maintainers of the affected crates for the implementation of the needed functionality to simplify this change? You can link to this issue to explain the motivation for the needed features.

jacobfriedman commented 3 years ago

Ah, the tangled web of dependency maintenance. Who's going to maintain the maintainers?

@triska, I remember in one of your PoP videos you spoke of depending on systems for not months, not years, but decades. I left the node/npm ecosystem because of careless package dependency.

How is it possible to lessen the dependencies so the installation is unaffected by external forces?

tniessen commented 3 years ago

Is it worth trying to ask the maintainers of the affected crates for the implementation of the needed functionality to simplify this change?

According to the maintainers, SliceDeque cannot work in WebAssembly because of the assumptions SliceDeque makes about memory management within the host system.

We can replace SliceDeque with VecDeque for cross-compilation to WebAssembly, but the APIs are not compatible. For example, the return type of ::remove is different, and, because VecDeque isn't backed by a slice, it doesn't deref into a slice, so the only way to access it as a slice is to call make_contiguous first.

It would be great if the APIs were compatible, but that seems impossible at this point, and would mean breaking changes for the maintainers of those crates.

So the best option might still be our own Deque implementation that only provides the features we need, and that either uses SliceDeque or VecDeque internally. That's the approach I went with so far, but I am struggling with a proper implementation. (Also, it might end up being super slow in WebAssembly.)

mpabst commented 3 years ago

@tniessen et al: Would you like some help finishing this? I'm interested in a good WASM Prolog too and have free time to contribute.

triska commented 2 years ago

@mpabst: I only want to add that I would greatly appreciate you looking into targeting WASM, and please let me know if I can help in any way by ensuring that the cryptographic libraries can be ported! It seems that simply outlining the steps and issues you encountered would already be a great contribution! Thank you a lot!

triska commented 2 years ago

@tniessen: Please have a look at the latest developments in the rebis-dev branch:

https://github.com/mthom/scryer-prolog/tree/rebis-dev

Notably, db20cda27aff6a13dc7f60b87549359b59b36f6e changes many occurrences of SliceDeque to VecDeque! Does that help for the WASM port?

triska commented 2 years ago

@tniessen: Please take a look at the latest commit, 68b4951bc97d3f6e7abc4d97b69dde519196b21e, which removes all remaining occurrences of SliceDeque! I hope it makes the WASM port a lot more feasible?

triska commented 2 years ago

Update: I have filed #1600 to eliminate the OpenSSL dependency. I hope this helps with the WASM port?

triska commented 1 year ago

As of #1907, the dependency on rug is gone. I hope this helps with the WASM port?

rujialiu commented 1 year ago

Hi all,

I've just tried today using the latest commit, both pure wasi (using cargo build --target wasm32-wasi) and the web (using wasm-pack build --target web). Thanks to @triska, there are only few "blocking" crates left:

Since my own goal is to run my constraint programming codes (which don't use ffi, no networking and don't need terminal) in the browser and a pure wasi environment, I would start by simply comment out irrelavent parts of code. Any suggestions would be appreciated!

rujialiu commented 1 year ago

Took a deeper look. It looks ok to restrict tokio to features that support wasm:

tokio = { version = "1.30", features = ["sync", "macros", "io-util", "rt", "time"] }

But wasm32-wasi target still doesn't compile because tokio::runtime::Runtime::new() is wrapped in cfg_not_wasi. Looking at mock_wam.rs, I changed it to tokio::runtime::Builder::new_current_thread() then it at least compiles (but don't know whether there are better ways).

I've commented uncompilable codes using ctrlc, ffi, tls, hyper, reqwest etc, but I got the following error:

error[E0080]: evaluation of constant value failed
  --> src\atom_table.rs:23:1
   |
23 | const_assert!(mem::size_of::<Atom>() == 8);
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ attempt to compute `0_usize - 1_usize`, which would overflow
   |
   = note: this error originates in the macro `const_assert` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0080]: evaluation of constant value failed
   --> src\arena.rs:772:1
    |
772 | const_assert!(mem::size_of::<AllocSlab>() == 16);
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ attempt to compute `0_usize - 1_usize`, which would overflow
    |
    = note: this error originates in the macro `const_assert` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0080]: evaluation of constant value failed
   --> src\machine\machine_indices.rs:147:1
    |
147 | const_assert!(std::mem::align_of::<CodeIndex>() == 8);
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ attempt to compute `0_usize - 1_usize`, which would overflow
    |
    = note: this error originates in the macro `const_assert` (in Nightly builds, run with -Z macro-backtrace for more info)

That's because these three types have different size in wasm32 because wasm32 is 32bit:

#[derive(Clone, Copy, Debug)]
struct AllocSlab {
    next: *mut AllocSlab, // 4 bytes for wasi32
    header: ArenaHeader,
}

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Atom {
    pub index: usize, // 4 bytes for wasi32
}

#[derive(Debug)]
pub struct TypedArenaPtr<T: ?Sized>(ptr::NonNull<T>); // 4 bytes for wasi32

After I corrected the const_assert!s, I got these:

error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
   --> src\macros.rs:182:44
    |
182 |         HeapCellValue::from_bytes(unsafe { std::mem::transmute($ptr) })
    |                                            ^^^^^^^^^^^^^^^^^^^
    |
   ::: src\forms.rs:746:35
    |
746 |             Number::Integer(n) => typed_arena_ptr_as_cell!(n),
    |                                   --------------------------- in this macro invocation
    |
    = note: source type: `*const arena::ArenaHeader` (32 bits)
    = note: target type: `[u8; 8]` (64 bits)
    = note: this error originates in the macro `untyped_arena_ptr_as_cell` which comes from the expansion of the macro `typed_arena_ptr_as_cell` (in Nightly builds, run with -Z macro-backtrace for more info)

Here is the code triggering the error:

#[derive(Debug, Copy, Clone)]
pub enum Number {
    Float(OrderedFloat<f64>),
    Integer(TypedArenaPtr<Integer>),
    Rational(TypedArenaPtr<Rational>),
    Fixnum(Fixnum),
}

impl ArenaFrom<Number> for HeapCellValue {
    #[inline]
    fn arena_from(value: Number, arena: &mut Arena) -> HeapCellValue {
        match value {
            Number::Fixnum(n) => fixnum_as_cell!(n),
            Number::Integer(n) => typed_arena_ptr_as_cell!(n),
            Number::Float(OrderedFloat(n)) => HeapCellValue::from(float_alloc!(n, arena)),
            Number::Rational(n) => typed_arena_ptr_as_cell!(n),
        }
    }
}

The macro typed_arena_ptr_as_cell! calls HeapCellValue::from_bytes(unsafe { std::mem::transmute($ptr) }) but unfortunately, HeapCellValue is 64-bit but TypedArenaPtr is 32-bit in wasm32.

This issue starts from Nov 2021 in the commit use new heap term representation from @mthom, so previous wasm-building attempts didn't see this.

Is there a reliable (and relatively simple) way to address this problem?

triska commented 1 year ago

A WAM memory cell in Scryer must have 8 octets. On 64-bit platforms, this corresponds to usize. But on 32-bit platforms, usize corresponds to 4 octets and this breaks internal assumptions about the size of memory cells of Scryer's WAM implementation.

I think conceptually, the necessary changes are thus simple, even though they may require comparatively many (small) textual changes in the source code: Systematically change usize to u64 throughout for memory cells and types that are supposed to occupy exactly one cell.

This was also discussed and suggested previously: https://github.com/mthom/scryer-prolog/issues/1509#issuecomment-1152287964

An example of such a use of usize is in:

https://github.com/mthom/scryer-prolog/blob/2e811de0f5b4ddc09a50660df50105ecdba7a9f6/src/atom_table.rs#L19

On 32-bit systems, where usize corresponds to u32 and hence 4 octets (instead of 8), this breaks the assertion:

https://github.com/mthom/scryer-prolog/blob/2e811de0f5b4ddc09a50660df50105ecdba7a9f6/src/atom_table.rs#L22C24-L22C24

If u64 were used throughout, then Rust would compile Scryer as intended on all platforms, including 32-bit systems, to use 8 octets for each cell and related type. One additional complication is that several API calls, including the Rust Standard Library, expect values of type usize.

rujialiu commented 1 year ago

Thank you very much @triska ! I'll try to change usize ot u64 systematically, but I'm still unsure what to do with TypedArenaPtr. Currently it's ptr::NonNul<T>, but looking at UntypedArenaPtr, it's already a "hand-crafted" 64-bit so maybe it's less error-prone to rewrite TypedArenaPtr to mimic UntypedArenaPtr? Are there any things I need to especially careful?

rujialiu commented 1 year ago

Fixing struct Atom seems easy, only < 10 lines in atom_table.rs and one place in static_string_indexing.rs used to generate out/static_atoms.rs:

    let indices = (0..visitor.static_strs.len()).map(|i| (i << 3) as u64);

And it looks like skip_max_list in system_call.rs and default() of BranchNumber defined in disjuncts.rs should also be changed (1usize -> 1u64).

However, I'm a bit lost with those arena pointers and HeapCellValue's. There are many mem::transmute calls and from_bytes()/into_bytes() pairs which I don't feel like touching without asking. Here's what I've observed:

It looks like HeapCellValue should remain 64-bits, but what about area pointer's? Should they be 64-bits or 32-bits (then UntypedArenaPtr will only have 29-bits of ptr?).

It's very tempting to keep the definition of TypedArenaPtr to keep it "compatible with raw 32-bit pointers" but I feel like that will make type coersions harder and may cause other serious problems, so I guess the preferred thing to do is keep all these types 64-bit and implement `TypedArenaPtr as a struct raw pointer and a 32-bit padding?

Another small issue: should AllocSlab also keep 64-bits? If yes, where to insert the 32-bit paddings? before header?

#[derive(Clone, Copy, Debug)]
struct AllocSlab {
    next: *mut AllocSlab,
    header: ArenaHeader,
}

This might be the last obstacle because once I replaced all the mem::transmute with some compilable codes, the whole scryer-prolog compiled successful (both wasi and web).

BTW: I'm a beginner in prolog and haven't read WAM spec. Do I need to read wambook to tackle this problem, or this is more specific to scryer-prolog's implementation.

mthom commented 1 year ago

Hi @rujialiu , I'll try to answer some of your questions.

  • TypedArenaPtr defined in arena.rs as pub struct TypedArenaPtr<T: ?Sized>(ptr::NonNull): typed pointer allocated from areana.

This type should not need to change. T is one of the types satisfying the ArenaAllocated trait. The arena exists apart from the heap. HeapCellValue cells are only in the heap and stack. The Cons tagged cells point into the arena. As you've noted, they're often converted to ConsPtr and UntypedArenaPtr values.

  • UntypedArenaPtr defined in types.rs as a custom struct with 61-bits of ptr, a mark bit m and 2-bits of padding (don't know why ptr is not 63-bits), which allows mem::transmuteing between const ArenaHeader and const IndexPtr.

The reason UntypedArenaPtr has a 61-bit ptr field is because all ArenaAllocated types are aligned to 8-byte boundaries, freeing the marking and forwarding bits for use by the future GC (these are the first two bits of padding, the third bit of padding is for the 0 as the final tag bit, all other tags have 1 as their final bit, so non-Cons tags can be distinguished by the pattern matcher). On a 32-bit system, it might be possible to scale this down somehow? or use a cfg conditional compile flag to use a 29-bit ptr field to keep the 8-byte alignment (or 4-byte alignment? I'm not sure if that would also scale down on a 32-bit system.. most of the 90s era literature on GC I've read in the past few months suggests they were using 8-byte alignment in the 32-bit era too).

  • It looks like HeapCellValue should remain 64-bits, but what about area pointer's? Should they be 64-bits or 32-bits (then UntypedArenaPtr will only have 29-bits of ptr?).

Arena pointers should use the full address space of their host system architecture, so, 64 bits (again, 8-byte aligned so we get the three 0 bits) on 64-bit systems, 32 bits on 32.

BTW: I'm a beginner in prolog and haven't read WAM spec. Do I need to read wambook to tackle this problem, or this is more specific to scryer-prolog's implementation.

None of this is part of the WAM architecture, so no, you don't need to read the wambook.

rujialiu commented 1 year ago

Thanks to @mthom, I've managed to make it compiled successfully. Here is a summary of what I've done (besides commenting out unwanted and uncompilable codes):

 #[derive(Copy, Clone, Debug, PartialEq, Eq)]
 pub struct Atom {
-    pub index: usize,
+    pub index: u64,
 }

     #[inline(always)]
     pub fn is_static(self) -> bool {
-        self.index < STRINGS.len() << 3
+        (self.index as usize) < STRINGS.len() << 3
     }

     #[inline(always)]
@@ -165,19 +165,19 @@ pub fn as_ptr(self) -> *const u8 {
         if self.is_static() {
             ptr::null()
         } else {
-            (get_atom_tbl_buf_base() as usize + self.index - (STRINGS.len() << 3)) as *const u8
+            (get_atom_tbl_buf_base() as usize + (self.index as usize) - (STRINGS.len() << 3)) as *const u8
         }
     }

     #[inline(always)]
     pub fn from(index: usize) -> Self {
-        Self { index }
+        Self { index: index as u64 }
     }

     #[inline(always)]
     pub fn len(self) -> usize {
         if self.is_static() {
-            STRINGS[self.index >> 3].len()
+            STRINGS[(self.index >> 3) as usize].len()
         } else {
             unsafe { ptr::read(self.as_ptr() as *const AtomHeader).len() as _ }
         }
@@ -209,7 +209,7 @@ pub fn as_str(&self) -> &str {
             let ptr = self.as_ptr();

             if ptr.is_null() {
-                return STRINGS[self.index >> 3];
+                return STRINGS[(self.index >> 3) as usize];
             }

             let header = ptr::read::<AtomHeader>(ptr as *const _);
@@ -340,7 +340,7 @@ pub fn build_with(&mut self, string: &str) -> Atom {
             write_to_ptr(string, len_ptr);

             let atom = Atom {
-                index: (STRINGS.len() << 3) + len_ptr as usize - ptr_base,
+                index: ((STRINGS.len() << 3) + len_ptr as usize - ptr_base) as u64,
             };

-    let indices = (0..visitor.static_strs.len()).map(|i| i << 3);
+    let indices = (0..visitor.static_strs.len()).map(|i| (i << 3) as u64); // affects out/static_atoms.rs
     let indices_iter = indices.clone();
 #[derive(Clone, Copy, Debug)]
 struct AllocSlab {
     next: *mut AllocSlab, // 4 bytes for wasi32
+    _padding: u32, // 4 bytes
     header: ArenaHeader, // 8 bytes
 }
 impl Default for BranchNumber {
     fn default() -> Self {
         Self {
-            branch_num: Rational::from(1usize << 63),
+            branch_num: Rational::from(1u64 << 63),
             delta: Rational::from(1),
         }
     }

             if let Some(max_steps) = max_steps_n {
-                if max_steps.abs() as usize <= 1 << 63 {
+                if max_steps.abs() as u64 <= 1 << 63 {
                     if max_steps >= 0 {
                         max_old = max_steps;
                     } else {
 #[derive(Debug, Clone, Copy, Ord, Hash, PartialOrd, Eq, PartialEq)]
 pub struct CodeIndex(TypedArenaPtr<IndexPtr>);

-const_assert!(std::mem::align_of::<CodeIndex>() == 8);
+const_assert!(std::mem::align_of::<CodeIndex>() == 4);
 const_assert!(mem::size_of::<HeapCellValue>() == 8);

 #[bitfield]
-#[repr(u64)]
+#[repr(u32)] // was u64 for 64-bit system
 #[derive(Copy, Clone, Debug, Hash, PartialEq, Eq)]
 pub struct UntypedArenaPtr {
-    ptr: B61,
+    ptr: B29, // was B61 for 64-bit system
     m: bool,
     #[allow(unused)] padding: B2,
+    // HeapCellValue: val(56-bits), f(1-bit), m(1-bit), tag(6-bits)
 }

-const_assert!(mem::size_of::<UntypedArenaPtr>() == 8);
+const_assert!(mem::size_of::<UntypedArenaPtr>() == 4);

I decided that the raw pointers should be the higher 32-bits because they need tags too. So I added two auxiliary functions for HeapValueCell:

+    #[inline]
+    pub fn from_untyped_arena_ptr_bytes(ptr_bytes: [u8; 4]) -> Self {
+        HeapCellValue::from_bytes([0, 0, 0, 0, ptr_bytes[0], ptr_bytes[1], ptr_bytes[2], ptr_bytes[3]])
+    }
+
+    #[inline]
+    pub fn to_untyped_arena_ptr_bytes(self) -> [u8; 4] {
+        let bytes = self.into_bytes();
+        [bytes[4], bytes[5], bytes[6], bytes[7]]
+    }
+
     #[inline]
     pub fn to_untyped_arena_ptr(self) -> Option<UntypedArenaPtr> {
         match self.tag() {
-            HeapCellValueTag::Cons => Some(UntypedArenaPtr::from_bytes(self.into_bytes())),
+            HeapCellValueTag::Cons => Some(UntypedArenaPtr::from_bytes(self.to_untyped_arena_ptr_bytes())),
             _ => None,
         }
     }

and changed some macros accordingly:

  macro_rules! cell_as_untyped_arena_ptr {
     ($cell:expr) => {
-        UntypedArenaPtr::from(u64::from($cell) as *const ArenaHeader)
+        UntypedArenaPtr::from_bytes($cell.to_untyped_arena_ptr_bytes())
     };
 }

 macro_rules! untyped_arena_ptr_as_cell {
     ($ptr:expr) => {
-        HeapCellValue::from_bytes(unsafe { std::mem::transmute($ptr) })
+        HeapCellValue::from_untyped_arena_ptr_bytes(unsafe { std::mem::transmute($ptr) })
     };
 }

There's one thing I need confirmation for @mthom because I can't compile the following codes on 32-bit system. 64-bit system can compile because both HeapValueCell and arena pointer's are 64-bits. But it looks like in the following code, the actual parameters are ptr's, not cells:

@@ -250,13 +253,17 @@ macro_rules! match_untyped_arena_ptr_pat_body {
         #[allow(unused_braces)]
         $code
     }};
-    ($cell:ident, OssifiedOpDir, $n:ident, $code:expr) => {{
-        let $n = cell_as_ossified_op_dir!($cell);
+    ($ptr:ident, OssifiedOpDir, $n:ident, $code:expr) => {{
+        //let $n = cell_as_ossified_op_dir!($cell);
+        let payload_ptr = unsafe { std::mem::transmute::<_, *mut OssifiedOpDir>($ptr.payload_offset()) };
+        let $n = TypedArenaPtr::new(payload_ptr);
         #[allow(unused_braces)]
         $code
     }};
-    ($cell:ident, LiveLoadState, $n:ident, $code:expr) => {{
-        let $n = cell_as_load_state_payload!($cell);
+    ($ptr:ident, LiveLoadState, $n:ident, $code:expr) => {{
+        //let $n = cell_as_load_state_payload!($cell);
+        let payload_ptr = unsafe { std::mem::transmute::<_, *mut LiveLoadState>($ptr.payload_offset()) };
+        let $n = TypedArenaPtr::new(payload_ptr);
         #[allow(unused_braces)]
         $code
     }};
wasmtime --mapdir .::. target\wasm32-wasi\debug\scryer-prolog.wasm
Error: failed to run main module `target\wasm32-wasi\debug\scryer-prolog.wasm`

Caused by:
    0: WebAssembly failed to compile
    1: WebAssembly translation error
    2: Invalid input WebAssembly code at offset 3476618: locals exceed maximum

According to RustPython/RustPython#3054, I changed Cargo.toml to give some optimization in debug build:

[profile.dev]
opt-level = 1

Then I don't get the error message above but nothing comes out. Unfortunately, I'm getting busy and don't have time to look into it now.

triska commented 1 year ago

@rujialiu: Thank you so much for all this work! Could you please consider filing a pull request for these changes? They can then be tried and reviewed more easily.

rujialiu commented 1 year ago

@rujialiu: Thank you so much for all this work! Could you please consider filing a pull request for these changes? They can then be tried and reviewed more easily.

Yeah, I'd love to, though the current changes are not meant to be merged because they're 32-bit only (i.e. will break 64-bit system), and it's based on the messy codes after commented out a lot of codes. But you're right, I'll be easier to review. So I'll make a "discussion-only" PR anyway. Maybe after I made everything work, I'll try to clean them up, split them (one 32-bit specific, then one to make some features optional, then a final one to make wasm32 work).

triska commented 1 year ago

Thank you a lot! Seeing all these changes as a diff will also be of great value for all interested contributors, because it shows which areas of the code are affected and need to be adapted.

A 32-bit specific patch (so that Scryer also runs on 32-bit systems) would in itself already be an invaluable contribution, and would solve #1509! If it helps, you can introduce and use a custom data type where needed, so that the difference between 32- and 64-bit compilation is then confined to conditional (automatic) changes of this type.

rujialiu commented 1 year ago

Ok! I've made a PR. @triska It looks like there are still a lot of things left to do to make it truly usable, but I'll try :)

rujialiu commented 1 year ago

I have changed all the commented codes to use guards like cfg(feature = "ffi") and cfg(feature = "http"). I guess it's better to make a separate PR from these changes? BTW: I found there is a ring-wasi crate which works for wasi and the original ring already works with wasm32 web so I'm leaving ring a mandatory dependency, otherwise I'll write a lot more cfg guards :)

I'll hopefully push the changes tomorrow in the existing PR first.

triska commented 1 year ago

Thank you a lot, this sounds awesome! It is great to hear that ring and thus all the cryptographic functionality of Scryer Prolog will be available in the WASM build!

While you are making so fast progress, you can also simply force-push to the branch so that always the latest changes are easily visible for everyone interested in this topic! And yes, for self-contained changes that are immediately applicable and which do not cause any regressions (on any platform), please do file separate PRs if possible, so that they can be easily merged as soon as possible. I look forward to all improvements in these areas!

rujialiu commented 1 year ago

I'm made another PR for the optional features, most of my changes are there. After that is merged, the more "hardcore" 32-bit related changes will be much easier to review. There is one thing though: exactly one crypto function is implemented using sodiumoxide, which has to be marked as an optional feature. It will be great if that can be implemented with ring instead, so we can remove that dependency. Is it possible? @triska

triska commented 1 year ago

@rujialiu: Thank you so much for working on this!

Yes, it seems possible that we can use the excellent crrl crate by @pornin to get rid of the sodiumoxide dependency. crrl is currently already used by Scryer for secp256k1. I will look into this, please do leave this functionality as optional for now so that the compilation works and this work can continue. Thank you a lot!

triska commented 1 year ago

I have filed #1970 to eliminate the sodiumoxide dependency by using crrl!

rujialiu commented 1 year ago

Anyone interested in supporting wasix? It's relatively new, wasmer-only (currently), but the featureset is too attractive. Here is a list from https://wasmer.io/posts/announcing-wasix

So for example hyper should work out-of-box. But personally I don't need these in near future :)

rujialiu commented 1 year ago

I decided to tackle i686 32-bit platform before wasi because it's easier to debug :) Here is current output of cargo test:

running 24 tests

test parser::char_reader::tests::plain_string ... ok
test parser::char_reader::tests::greek_string ... ok
test parser::char_reader::tests::russian_string ... ok
test machine::stack::tests::stack_tests ... ok
test parser::char_reader::tests::armenian_lorem_ipsum ... ok
test arena::tests::float_ptr_cast ... ok
test parser::char_reader::tests::greek_lorem_ipsum ... ok
test machine::mock_wam::tests::is_cyclic_term_tests ... ok
test arena::tests::heap_cell_value_const_cast ... ok
test machine::mock_wam::tests::test_term_compare ... ok
test machine::copier::tests::copier_tests ... ok
test machine::mock_wam::tests::test_unify_with_occurs_check ... ok
test machine::partial_string::test::pstr_iter_tests ... ok
test machine::arithmetic_ops::tests::arith_eval_by_metacall_tests ... ok
test heap_iter::tests::heap_stackful_post_order_iter ... ok
test machine::mock_wam::tests::unify_tests ... ok
test heap_iter::tests::heap_stackful_iter_tests ... ok
test parser::char_reader::tests::russian_lorem_ipsum ... ok
test heap_print::tests::term_printing_tests ... ok
test atom_table::atomtable_is_not_concurrency_safe - should panic ... ok
test heap_iter::tests::heap_stackless_post_order_iter ... FAILED
test machine::gc::tests::heap_marking_tests ... FAILED
test arena::tests::heap_put_literal_tests ... FAILED
test heap_iter::tests::heap_stackless_iter_tests ... FAILED

failures:

---- arena::tests::heap_put_literal_tests stdout ----
thread 'arena::tests::heap_put_literal_tests' panicked at 'value contains invalid bit pattern for field ArenaHeader.tag: InvalidBitPattern { invalid_bytes: 0 }', src\arena.rs:167:5

---- machine::gc::tests::heap_marking_tests stdout ----
thread 'machine::gc::tests::heap_marking_tests' panicked at 'assertion failed: `(left == right)`
  left: `HeapCellValue { tag: Atom, name: "f", arity: 0, m: false, f: false }`,
 right: `HeapCellValue { tag: Atom, name: "f", arity: 2, m: false, f: false }`', src\machine\gc.rs:376:9

---- heap_iter::tests::heap_stackless_post_order_iter stdout ----
thread 'heap_iter::tests::heap_stackless_post_order_iter' panicked at 'assertion failed: `(left == right)`
  left: `Some(HeapCellValue { tag: Atom, name: "f", arity: 0, m: false, f: false })`,
 right: `None`', src\heap_iter.rs:2522:13

---- heap_iter::tests::heap_stackless_iter_tests stdout ----
thread 'heap_iter::tests::heap_stackless_iter_tests' panicked at 'assertion failed: `(left == right)`
  left: `Some(HeapCellValue { tag: Atom, name: "f", arity: 0, m: false, f: false })`,
 right: `None`', src\heap_iter.rs:540:13

failures:
    arena::tests::heap_put_literal_tests
    heap_iter::tests::heap_stackless_iter_tests
    heap_iter::tests::heap_stackless_post_order_iter
    machine::gc::tests::heap_marking_tests

test result: FAILED. 20 passed; 4 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.08s

Could @mthom or @triska make some guesses about potential causes? Since some tests already passed, but some are not, maybe you can guess from "what's different between those tests", which I'm totally clueless... Or maybe some more low-level tests that I can understand easier? Thanks a lot!

rujialiu commented 1 year ago

Getting closer! Only 3 tests failing.

It looks like UntypedArenaPtr points to an ArenaHeader (instead of a pointer to it) which is stored at the end of AllocSlab, so I changed this:

    fn header_offset_from_payload() -> usize {
        mem::size_of::<*const ArenaHeader>()
    }

to:

    fn header_offset_from_payload() -> usize {
        mem::size_of::<ArenaHeader>()
    }

However, for some reason, in target i686-pc-windows-msvc, ArenaHeader is NOT at the end of AllocSlab, so I added a #[repr(C)] to AllocSlab.

After fixing some other minor issue that I forgot, only three tests are failing: heap_stackless_iter_tests, heap_marking_tests and heap_stackless_post_order_iter.

And after skipping these test, all remaining 23 test can also pass:

running 23 tests
test src_tests::facts ... ok
test issues::handle_residual_goal ... ok
test issues::do_not_duplicate_path_components ... ok
test issues::display_constraints ... ok
test issues::occurs_check_flag ... ok
test issues::op3 ... ok
test issues::occurs_check_flag2 ... ok
test issues::compound_goal ... ok
test issues::multiple_goals ... ok
test issues::ignored_constraint ... ok
test issues::no_stutter ... ok
test src_tests::builtins ... ok
test src_tests::setup_call_cleanup_process ... ok
test src_tests::syntax_error ... ok
test src_tests::hello_world ... ok
test src_tests::clpz_load ... ok
test issues::call_0 ... ok
test src_tests::call_with_inference_limit ... ok
test src_tests::iso_conformity_tests ... ok
test src_tests::predicates ... ok
test issues::atomtable_is_not_concurrency_safe - should panic ... ok
test src_tests::setup_call_cleanup_load ... ok
test src_tests::rules ... ok

test result: ok. 23 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 40.32s

So I guess we're pretty close to getting i686-pc-windows-msvc to work! Any hunch/suggestions are welcome! For example, are there any other fixed-layout structs that needs [repr(C)] as well? @mthom @triska

BTW: It's a bit surprising that failing these 3 tests did not prevent us from succeeding these "prolog program tests" :)

rujialiu commented 1 year ago

Ah, I forgot that dashu depends on his own num-order which in turn depends on published num-modular 0.5.1 which doesn't compile on i686-pc-windows-msvc. However, Scryer-prolog's forked num-modular (which is marked as 0.6.0 internally) actually fixed it. How can I make num-order (which isn't forked) to use the new fork? (Locally I just copied 0.6.0 codes to overwrite 0.5.1 :P)