B-Lang-org / bsc

Bluespec Compiler (BSC)
Other
954 stars 146 forks source link

BSV: Pretty-print module return types when possible #666

Open RyanGlScott opened 10 months ago

RyanGlScott commented 10 months ago

The BSV pretty-printer would print a function like this:

module [Module] helloWorld#(Module#(Empty) m)(Empty);
  let e <- m;
endmodule

Without the [Module] bit, resulting in a function declaration that wouldn't typecheck. This patch makes a best-effort attempt to pretty-print this bit whenever possible. More specifically:

Fixes #663.

quark17 commented 10 months ago

Thanks!

The IsModule was being intentionally dropped, though. The following BSV code (without any IsModule proviso):

module [m] mkMod (Ifc);

is equivalent to this BH code:

mkMod :: (IsModule m _) => m Ifc

The IsModule context is implied by module..endmodule. I guess it's legal to also print it as an explicit proviso, but it would be a duplicate, and we'd like to avoid it.

Also, if the body of the module doesn't use m anywhere, then [m] can be left off, which is why this works in BSV:

module mkMod (Ifc);

and it's equivalent to the above with explicit m. Again, it doesn't hurt to mention [m], but I think in an ideal world, we'd want the pretty-printer to not include it if not needed, since probably 99% of users have never seen the [m] syntax, and we don't want to confuse them. However, I don't know if there's a good way to check for this -- but it's probably safe for your purposes to leave out [m], since it's unlikely that it's ever referenced. The more important thing is to include it when it's not a single variable (particularly when it's a constructor).

This code:

(mModTC, ity) =
  case x of
    TAp (TCon modTC) y -> (Just modTC, y)
    TAp (TVar _) y -> (Nothing, y)
    z -> (Nothing, z)

should probably be this:

(mty, ity) =
  case x of
    TAp a b -> (a, b)
    _ -> _

Because the module type could be a constructor (or variable) partially applied to its other arguments, leaving just the interface argument. For example, see src/Libraries/Base3-Contexts/ModuleContext.bsv where the type in brackets is ModuleContext#(st1), or bsc.bsv_examples/configbus/CBus.bsv where there type is ModWithCBus#(sa,sd).

I also think there needs to be better coordination between the isModule, findModId, and case of x code: the existence of an IsModule proviso in ps is not enough to conclude that this definition is a module; the type in the IsModule proviso must also be the same mty from the return type. (The proviso could be applying to the type of an argument, or other parameter, for example.) Also, there may not be an IsModule proviso, when the mty is a constructor (like Module or ModuleContext or ModuleCollect). To account for that, it seems that VPrint's isModule calls likeModule to see if the constructor is "module-like", specifically that the name starts with Module -- this is hackish, but I guess it's good enough. If someone defines their own module type that doesn't start with Module, then such module definitions will get printed as functions (not module..endmodule), but the user is presumably an advanced user who will be OK with that. (I guess the worse situation is if someone defines a type starting with Module that isn't a module, but takes an argument and is the return type of a function, then it might get pretty-printed weird. Oh well.)

Anyway, I would restructure p2defs to something like this:

let non_module_output = (pProps d props $+$ ppValueSign d i [] qt cs)
in case x of
     TAp mty ity ->
         let (ismod, ps') = findModId mty ps
         in if ismod || (isModuleTC mty)
         then let
                  ... all the defs for constructing the module output ...
                  ... if mty matches (TVar _) then don't print it, otherwise do print "[mty]" ...
                 in ...
         else non_module_output
     _ -> non_module_output

Note that findModId has a new argument (mty) which it uses to match any IsModule proviso that it finds (and if the type in the proviso doesn't match, it keeps moving). Also note that isModuleTC is just likeModule but adapted to take a type.

Does that seem reasonable?

RyanGlScott commented 10 months ago

Thanks!

The IsModule was being intentionally dropped, though. The following BSV code (without any IsModule proviso):

module [m] mkMod (Ifc);

is equivalent to this BH code:

mkMod :: (IsModule m _) => m Ifc

Is it? I tried this:

module [m] mkMod (Empty);
endmodule

But that fails to compile:

$ bsc Top.bsv
Error: "Top.bsv", line 1, column 12: (T0030)
  The provisos for this expression are too general.
  Given type:
    m#(Empty)
  The following provisos are needed:
    Monad#(m)
      Introduced at the following locations:
    "Top.bsv", line 1, column 1

Granted, you don't need IsModule per se, but you do need some proviso to get this to typecheck.

However, I don't know if there's a good way to check for this -- but it's probably safe for your purposes to leave out [m], since it's unlikely that it's ever referenced.

This gives me pause, since I'm imagining a program like this:

module [m] mkMod#(m#(Empty) mod)(Empty)
  provisos (IsModule#(m, c));
  let e <- mod;
endmodule

If you dropped the [m] part, then it will fail to compile (with or without the provisos) with:

$ bsc Top.bsv
Error: "Top.bsv", line 3, column 12: (T0020)
  Type error at:
    bind

  Expected type:
    function _m__#(Empty) f(m#(Empty) x1, function _m__#(Empty) x2(Empty x1))

  Inferred type:
    function a__#(c__) f(a__#(b__) x1, function a__#(c__) x2(b__ x1))
quark17 commented 10 months ago

This gives me pause, since I'm imagining a program like this:

module [m] mkMod#(m#(Empty) mod)(Empty)
  provisos (IsModule#(m, c));
  let e <- mod;
endmodule

In BSV, you can use the module keyword in a type context to refer to the implicit module variable (m). So the above can be written like this:

module mkMod #(module#(Empty) mod) (Empty);
  let e <- mod;
endmodule

Someone could write it explicitly, as you've shown, and it would be good if the pretty-printer displayed that as correct code. But I'm suggesting that it's not your responsibility to get it perfect, if you want to just get working the part you need. If we have to cut a corner, I think it's ok for the example as you wrote to not parse back in.

My original suggestion was to just not print bare variables (like m); but if we want to be more correct, what about: only print bare variables that aren't referenced elsewhere. That would require looking through the entire body of the module, which might be too much work? But in your example above, it just requires looking in the rest of the type -- and that shouldn't be much work.

However, I can think of a simpler idea: Detect when the module type variable is exactly the one that the BSV parser inserts implicitly, and only don't print in that case. IDs can carry properties (see IdPParserGenerated in Id.hs) and src/comp/Parser/BSV/CVParserCommon.lhs could be updated so that defaultModuleMonadInfo adds the property to the m ID (with addIdProp).

However, this would only improve the printing of BSV code; pretty-printing BH code as BSV would still insert unnecessary variables and provisos. So I would probably favor one of the above approaches, such as to look for whether m occurs elsewhere in the type (or in the body, if you're brave). Or, actually, we could do both: if the ID has the property (which will be the majority of the cases), we can short circuit and just not print it; otherwise, we can do the work of looking in the rest of the type or body.

There's a package CFreeVars.hs that defines utilities for getting the free variables of an expression, so you could use that to get the free vars of the body, and then check if m is among them. It also has functions for getting the variables from a qualified type; but you want to make sure that it doesn't pick up m from the return type or the IsModule proviso, so you would either need to remove them first or apply a function to the pieces -- or it's easy enough to roll your own function for walking types.

Is it? I tried this:

module [m] mkMod (Empty);
endmodule

But that fails to compile

Hm. I will have to think about this. Would it hurt to change the BSV parser to insert an IsModule proviso if the user hasn't specified one, but has specified something in brackets.

Anyway, the rest of my statement still stands, which is that most users write no brackets and no IsModule and the pretty-printing should probably not print them explicitly, if it can help it.

RyanGlScott commented 10 months ago

Sorry for taking so long to respond to this.

But I'm suggesting that it's not your responsibility to get it perfect, if you want to just get working the part you need. If we have to cut a corner, I think it's ok for the example as you wrote to not parse back in.

Fair enough. My use case only concerns code that uses capital-M Modules, so I can live without getting everything perfect. Indeed, it sounds like pretty-printing code like module [m] mkMod#(m#(Empty) mod)(Empty) correctly would require a non-trivial amount of additional complexity, so I'll refrain from doing this for now.

Note that findModId has a new argument (mty) which it uses to match any IsModule proviso that it finds (and if the type in the proviso doesn't match, it keeps moving).

This is an interesting suggestion. Changing the type of findModId reveals two other uses of this function here and there that would also need to be updated. I took a look at the surrounding code, and it strikes me that the surrounding code here and there, respectively, is remarkably similar to the code in p2defs. Like the code in p2defs, the surrounding code also pretty-prints modules, and they both exhibit the same (TAp (TCon _) y) -> y; ... pattern that the code in p2defs does. This suggests that we'd need to refactor this surrounding code in a similar way to what you suggest in https://github.com/B-Lang-org/bsc/pull/666#issuecomment-1897806132. (I'm not entirely sure how to test these code paths yet, however.)