bytecodealliance / cranelift-jit-demo

JIT compiler and runtime for a toy language, using Cranelift
Apache License 2.0

How can I get type of Value? #65

Closed lechatthecat closed 2 years ago

lechatthecat commented 2 years ago

This demo is great, and I'm enjoying building software with it, but I'm currently struggling to get the type of a Cranelift "Value". How can I check whether a Value holds a string or a number? Or can we use an immediate instead of a Value?

bjorn3 commented 2 years ago

A Value can't contain a string. It can only contain one of the integer, boolean, float, vector and reference types. You can use builder.func.dfg.value_type() to get the type of a value.

lechatthecat commented 2 years ago

@bjorn3 Thank you very much for your quick response :) I checked your suggestion, and it worked.

let lhs_type = self.builder.func.dfg.value_type(lhs);
let is_int = lhs_type.is_int(); // true

A Value can't contain a string.

Sorry for the vague question. What I meant was: if all data is stored as an int, as in this demo, is there a way to check its data type? For example, in this demo, I think a String could be converted to a boxed slice and then stored as in this function: https://github.com/bytecodealliance/cranelift-jit-demo/blob/main/src/jit.rs#L90 I think this data is treated as an int here: https://github.com/bytecodealliance/cranelift-jit-demo/blob/main/src/jit.rs#L378

As another example of an implementation, I looked at the lust language, which uses Cranelift-jit, and it uses an immediate to check the data type: https://github.com/ezekiiel/lust/blob/main/lustc/src/conversions.rs Though I am not the author, my understanding of the code above is that it adds some bits to an immediate, and those bits represent the data type of the immediate, so that the immediate can carry a value and its data type together, even though both are always an int to Cranelift-jit. Does Cranelift-jit natively support something like this?

I am sorry if my question is very basic and doesn't make sense.

bjorn3 commented 2 years ago

Sorry for the vague question. What I meant was: if all data is stored as an int, as in this demo, is there a way to check its data type?

Pointers on 64-bit systems are represented as i64. Cranelift doesn't store any information about what a pointer points to. You, as the user of Cranelift, are responsible for storing this information if you need it.

Though I am not the author, my understanding of the code above is that it adds some bits to an immediate, and those bits represent the data type of the immediate, so that the immediate can carry a value and its data type together, even though both are always an int to Cranelift-jit.

I think that is pointer tagging. Pointer tagging is something done at runtime when the type of a value isn't known at compile time.

Does Cranelift-jit natively support something like this?

No, everyone does pointer tagging in their own way as everyone has their own set of types.
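
As an illustration of what such a scheme can look like, here is a minimal sketch in plain Rust. It is not taken from Cranelift or from lust: the tag layout (low 3 bits), the `Tag` enum, and every function name are assumptions made for this example, and it assumes a 64-bit target where heap pointers are at least 8-byte aligned.

```rust
// Hypothetical scheme: the low 3 bits of a 64-bit word hold a type tag.
// None of these names come from Cranelift; they are illustrative only.
const TAG_BITS: u32 = 3;
const TAG_MASK: i64 = (1 << TAG_BITS) - 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tag {
    Int = 0,
    Str = 1,
}

/// Packs a small integer and the `Int` tag into one word.
/// The top 3 bits of `n` are shifted out, so only 61-bit integers survive.
pub fn tag_int(n: i64) -> i64 {
    (n << TAG_BITS) | Tag::Int as i64
}

/// Packs a pointer and the `Str` tag into one word.
/// Relies on the pointer being 8-byte aligned so its low 3 bits are zero.
pub fn tag_str(ptr: *const String) -> i64 {
    let addr = ptr as usize as i64;
    debug_assert_eq!(addr & TAG_MASK, 0, "pointer must be 8-byte aligned");
    addr | Tag::Str as i64
}

/// Reads the tag back out of a word; unknown tags yield `None`.
pub fn tag_of(word: i64) -> Option<Tag> {
    match word & TAG_MASK {
        0 => Some(Tag::Int),
        1 => Some(Tag::Str),
        _ => None,
    }
}

/// Recovers the integer; the arithmetic shift preserves the sign.
pub fn untag_int(word: i64) -> i64 {
    word >> TAG_BITS
}

/// Recovers the pointer by clearing the tag bits.
pub fn untag_str(word: i64) -> *const String {
    (word & !TAG_MASK) as usize as *const String
}
```

In JIT-compiled code the same checks would be emitted as ordinary integer instructions (a mask followed by a compare) on an I64 value, since Cranelift itself only sees an integer.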

lechatthecat commented 2 years ago

@bjorn3 Thank you very much again for your quick response! OK, I understand now; your answer cleared up my questions. I really appreciate your help. 🙇

lechatthecat commented 2 years ago

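One idea might be to first create a struct:

#[derive(Debug)]
struct LangValue {
    // `type` is a reserved word in Rust, so the field needs another name.
    type_tag: usize,
    value: SomeValue,
}

We can turn this struct into an int with:

let l = LangValue {
    type_tag: 1,
    value,
};
// Hand ownership to a raw pointer so its address can travel as a plain i64.
let intval = Box::into_raw(Box::new(l)) as i64;
let jit_intval = self.builder.ins().iconst(types::I64, intval);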

This jit_intval can then be passed to Rust functions that are made available to the JIT-compiled code, like:

#[no_mangle]
pub extern "C" fn test1(ptr: *mut LangValue) {
    // Takes ownership back; the LangValue is freed when `r` goes out of scope.
    let r = unsafe { Box::from_raw(ptr) };
    println!("test: {:?}", r);
}

This way we can pass more information to Rust functions called from JIT-compiled code, even if we always use pointer_type, which is i64 on 64-bit targets. One warning: every pointer created by Box::into_raw(Box::new(l)) must be passed to Box::from_raw exactly once. If it is never passed, the memory leaks; if it is passed more than once, the value is freed twice.
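
The ownership rule above can be sketched without Cranelift at all. In the following example the field names, the `i64` payload, and the helper functions `into_handle`, `peek_type`, and `from_handle` are all assumptions made for illustration; the point is only to separate "look at the value" from "take it back and free it".

```rust
// Hypothetical runtime value; the field names and payload type are assumptions.
#[derive(Debug, PartialEq)]
pub struct LangValue {
    pub type_tag: usize, // e.g. 1 = integer in this sketch
    pub value: i64,
}

/// Moves the value to the heap and returns its address as an i64 handle.
/// From here on the caller is responsible for eventually freeing it.
pub fn into_handle(v: LangValue) -> i64 {
    Box::into_raw(Box::new(v)) as usize as i64
}

/// Reads the type tag without taking ownership, so nothing is freed.
///
/// # Safety
/// `handle` must come from `into_handle` and must not have been freed yet.
pub unsafe fn peek_type(handle: i64) -> usize {
    unsafe { (*(handle as usize as *const LangValue)).type_tag }
}

/// Takes ownership back and frees the heap allocation.
///
/// # Safety
/// `handle` must come from `into_handle`, and this must be called at most
/// once per handle; a second call would be a double free.
pub unsafe fn from_handle(handle: i64) -> LangValue {
    unsafe { *Box::from_raw(handle as usize as *mut LangValue) }
}
```

A function that only needs to inspect the value can borrow through the raw pointer as `peek_type` does, leaving a single, clearly identified place where `Box::from_raw` reclaims the memory.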

lechatthecat commented 2 years ago

Alternatively, using ArgumentPurpose::StructReturn or returning multiple values from a function might also be a good way to pass more information.
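
As a rough picture of the struct-return idea, the sketch below shows the underlying convention in plain Rust: the caller reserves space for the result and passes a pointer to it, and the callee writes the tag and the payload through that pointer. It does not use the Cranelift API; `LangResult` and `make_int` are hypothetical names, and how such a pointer parameter would be declared in a Cranelift signature is left out.

```rust
use std::mem::MaybeUninit;

// Hypothetical two-field result; #[repr(C)] fixes the field layout so that
// generated code and Rust agree on where each field lives.
#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct LangResult {
    pub type_tag: u64, // e.g. 1 = integer in this sketch
    pub value: i64,
}

/// Writes an integer result into caller-provided storage.
///
/// # Safety
/// `out` must be non-null, properly aligned, and valid for writes.
pub unsafe extern "C" fn make_int(out: *mut LangResult, n: i64) {
    unsafe { out.write(LangResult { type_tag: 1, value: n }) };
}

/// Caller side: reserve uninitialized space, let the callee fill it in.
pub fn call_make_int(n: i64) -> LangResult {
    let mut slot = MaybeUninit::<LangResult>::uninit();
    unsafe {
        make_int(slot.as_mut_ptr(), n);
        slot.assume_init()
    }
}
```

Compared with boxing, nothing is heap-allocated here, so there is no pointer that has to be freed later.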