jethrogb / rust-cexpr

A C expression parser and evaluator
Apache License 2.0
45 stars 19 forks source link

Integer suffixes discarded #16

Open DK-DARKmatter opened 5 years ago

DK-DARKmatter commented 5 years ago

Generally we want to have a precise type map between C and Rust. The problem we encountered is when we try to convert a positive number in C, we want to have a signed integer, i32 or i64, but we get a u32 directly.

Reference to the issue:https://github.com/rust-lang/rust-bindgen/issues/1594

I'm not sure yet but will this be the part that generate the final result?

named!(c_int<i64>,
    map!(terminated!(alt_complete!(
        map_opt!(preceded!(tag!("0x"),many1!(complete!(hexadecimal))),|v|c_int_radix(v,16)) |
        map_opt!(preceded!(tag!("0b"),many1!(complete!(binary))),|v|c_int_radix(v,2)) |
        map_opt!(preceded!(char!('0'),many1!(complete!(octal))),|v|c_int_radix(v,8)) |
        map_opt!(many1!(complete!(decimal)),|v|c_int_radix(v,10)) |
        force_type!(IResult<_,_,u32>)
    ),opt!(take_ul)),|i|i as i64)
);
jethrogb commented 5 years ago

It's the take_ul you're interested in I think.

mbestavros commented 5 years ago

@jethrogb, I'm interested in taking this on, as the downstream issue referenced above is becoming a major blocker.

From my understanding, the issue is mainly that the c_int macro doesn't pass along the correct type information (according to the C reference here), even though that information is available.

If we were to pass that information downstream, we'd want to return more than just the raw value -- perhaps a richer return type. I was thinking a struct like this:

struct Integer {
  neg: bool,
  val: u128,
  u: bool,
  l: usize,
}

What are your thoughts? I figure if we can kick off by agreeing on a return type now, it'll make the actual programming easier.

jethrogb commented 5 years ago

That looks good, except I wonder about the compilation-target compatibility of u128.

Minor comments: since you're using the signed magnitude representation, let's name the fields appropriately. Also, I think using an enum for the type length is going to make writing operations easier.

struct Integer {
  sign: bool,
  magnitute: u128,
  unsigned: bool,
  len: IntegerLength,
}

enum IntegerLength {
    NotLong,
    Long,
    LongLong
}
npmccallum commented 5 years ago

@jethrogb We could probably just select u128 or u64 based on target support.

npmccallum commented 5 years ago

@jethrogb / @mbestavros : We should also probably have a radix field. We should also impl From<Integer> for String.

jethrogb commented 5 years ago

What is the purpose of the radix?

npmccallum commented 5 years ago

So that we can recreate the exact bytes that were used in the source. This can help in various debugging scenarios. If we're going to avoid discarding information we shouldn't discard any.

jethrogb commented 5 years ago

That doesn't really make sense, the same type needs to be able to represent both a literal and an evaluated expression. What is the radix of 10 + 0xa?

npmccallum commented 5 years ago

What is the Integer of 10 + 0xa?

Combining parsing and evaluation is bad mojo.

jethrogb commented 5 years ago

What is the Integer of 10 + 0xa?

sign: false, magnitude: 20, unsigned: false, length: IntegerLength::NotLong

Combining parsing and evaluation is bad mojo.

Feel free to rewrite the entire crate.

npmccallum commented 5 years ago

Sorry, didn't mean to come across as cheeky. We can do without the radix.