tomlokhorst / language-cil

Manipulating Common Intermediate Language AST in Haskell
Other
20 stars 8 forks source link

Unaligned prefix #2

Closed dmcclean closed 13 years ago

dmcclean commented 13 years ago

Tom, Here's my attempt at supporting the unaligned instruction prefix. I tried to follow the style and such, but let me know if there's something you want done differently. It's a bit weird that the supportsUnaligned helper in Build.hs case-analyzes over so many things. I might have done:

supportsUnaligned oc = oc `elem` supported
                               where supported = [Ldind_i, ...]
supportsUnaligned (Ldfld _ _ _ _) = True
-- more cases like that for others that have parameters
supportsUnaligned _ = False

But there's no Eq instance for OpCode and I figured it would be too disruptive to add one.

tomlokhorst commented 13 years ago

Hi dmcclean,

The commit looks fine at first glance, it indeed follows the style of the rest of the code nicely. I hope to merge in the changes later today, once I've figured out how to do that with github.

dmcclean commented 13 years ago

I think I accidentally added some of the unrelated commits I did after the unaligned business. I have a bunch more small ones adding this and that instruction, and a big one adding all of the array manipulation instructions, but we should probably look at those separately.

tomlokhorst commented 13 years ago

I've merged in your first commit.

Also, I've renamed the opcode with the nested opcode to UnalignedPtr and added a single instruction Unaligned (commit). This is similar to the way the Tail and Tailcall instructions work. In practice I like to use Tailcall, but I think it better reflects CIL to have a single, stand-alone, Tail instruction. This allows, for example, to parse (incorrect) CIL, where a Tail (or Unaligned) instruction exists on its own.

tomlokhorst commented 13 years ago

Btw, your other commits all look fine. If you think they're good, I'll also merge them.

Unless you plan on adding more instructions in the near future, I'll push a new release to Hackage.

dmcclean commented 13 years ago

Sounds good. All the other commits are good in my book.

The only other one I need in the near term is ldtoken, but it's a bit thorny because the current way of representing type (and hence method, and even field) names is incomplete. Having an AssemblyName and a TypeName like there are in a bunch of places right now gets you the common cases, but there are a bunch more: array types (single and multi-dimensional), pointer types, and nested types. Compare the grammar in Syntax.PrimitiveType with the grammar in Partition II, section 7.1.

It seems like many of the places that presently have an AssemblyName together with a TypeName (and possibly a method or field after that) need to instead be a slightly-beefed-up PrimitiveType (which isn't really so primitive, since it includes all the cases and not just the simple pre-defined ones). The ldtoken support I need isn't useful to me without that feature, but since it is a pretty disruptive change I was wondering where you were planning on going with it to decide if I should just go do it or if you'd like me to contribute it back.

(I'm fine with ignoring the pinned, managed pointer, typedref, modopt, and modreq productions in II, 7.1. They cover obscure things for which I have no need, and especially no short-term need. I also don't actually need the field flavor of ldtoken instructions, but after I've done the other two it's very little extra work.)

Examples for concreteness: valuetype [mscorlib]System.Environment/SpecialFolder class [mscorlib]System.Predicate[]

tomlokhorst commented 13 years ago

Commit merged, for further discussion of AssemblyName / TypeName, see https://github.com/tomlokhorst/language-cil/issues/issue/5

dmcclean commented 13 years ago

The problem with renaming to supportsPrefix is that there are other prefixes support by different subsets.

Tail, for one, but I could see just ignoring that for this purpose, but there is more thorniness.

Volatile doesn't exactly match unaligned. Volatile is valid before ldsfld and stsfld instructions, but unaligned is not (presumably because the rules already require the runtime to locate static fields at aligned addresses) (see Partition III, Section 2.6).

Constrained is an important one for any generic code, and it applies to callvirt instructions only (unlike tail, which is also valid before call and calli instructions).

There's another prefix they made when they introduced generics named readonly, which is valid only before ldelema instructions which gets around the need to introduce a type check at a certain part of the intersection between arrays and generic callvirts. I don't quite understand it, which leaves me not in a position to say whether it is as important as constrained or not.

There's also a "no" prefix allowing the runtime to skip null-pointer checks, type checks, and index-in-range checks, obviously at the cost of giving up verifiability. This is theoretically interesting -- a Haskell-to-CIL (or your-functional-language-of-choice-to-CIL) compiler might be in a position to prove the validity of its non-null-ness and index-in-range requirements -- it seems from a few google searches that the both the MS JIT and the mono JIT ignore these prefixes.

Counter-proposal: supportsUnalignedPrefix = ... as it is now ...

supportsVolatilePrefix (Ldsfld _) = True
supportsVolatilePrefix (Stsfld _) = True
supportsVolatilePrefix oc = supportsUnalignedPrefix oc

supportsTailPrefix = ... -- cases for call, calli, callvirt

and so forth