clash-lang / clash-compiler

Haskell to VHDL/Verilog/SystemVerilog compiler
https://clash-lang.org/
Other
1.44k stars 151 forks source link

Readability VHDL #139

Open flip111 opened 8 years ago

flip111 commented 8 years ago

I have my first look on the clash compiler and about the readibility of the vhdl i'm raising some issues. I would like to know whether these issue are:

  1. a shared concern for other developers
  2. whether it could be improved within the clash compiler or if it's better to do it with a seperate program
  3. if it can be improved in the clash compiler, where can i find the logic that's relevant for the issue.
  4. maybe there is a solution for this already?

The vhdl code that the clash-compiler outputs is very different from the way a human would write it. There are some reasons why having the vhdl code more human-like could be important.

  1. It's very very hard to figure out what part of the vhdl code maps back to what in clash. On component name it's not that hard because some names were preserved. Mainly when it comes down to the gates.
  2. Some simulators can debug the vhdl code (set breakpoints, continue). But because of point 1 that's not gonna help you much fixing the clash code.
  3. VHDL code can not be verified by reading the source, only by wave of test setup.
  4. VHDL code can not be modified because it's just too hard to understand and starting a component from scratch would be faster.
  5. Because of previous mentioned points it will be nearly impossible to work together with people who just know VHDL, and can't or don't want or are not able to get into clash/haskell. Okay clash generated components can interface with "regular"-VHDL components. But still both components will be part of one system and sometimes it's required for someone else to look at your part.
  6. Having one or a few people who can read and write clash in a team of more people who write VHDL increases the bus-factor. If the people who know clash leave then the vhdl-only developers can not continue with the generated vhdl-code.

Below are some of the things i found that might make the VHDL more readable for other people. Not every issue here is "causing" the issues mentioned above. For example formatting (code beautifying) has nothing to do with the actual logic of the code, but would still help to make things readable.

(most of the code here is taken from the blink example)

  1. Formatting / pretty printer / code beautifier. For example make all vhdl keywords capital. On first sight i would suspect this would be a good candidate to place in a separate tool. But on second thought ... some information about the original clash code would sure come in handy. A (totally) seperate tool would not have access to the information the compiler has.
  2. Simple transformations of source code. For example clash generates std_logic_vector(0 downto 0) and i would just expect std_logic in this case. I suspect because the compiler was written in such a way that it always tries to access bits on an index, so even for a single bit it prefers to do (0).
  3. While that might be easy for compiler consistency to have the same "style" applied everywhere it makes for a lot of noise in the source code. Below is a part of an overloaded function toSLV

    function toSLV (slv : in std_logic_vector) return std_logic_vector is begin return slv; end;

  4. "value inference" (for i can not think of a better term to express this). In the blink example a counter runs up to the value 16650000 there are 24bits needed to express this value. (max 24 bits = 16777216). However since the supplied value was 0 without any further type qualifier, type Int was chosen by default, which is 64bits on my system so i end up with std_logic_vector(63 downto 0). I'm not even sure if synthesis can optimize that away, in that case it would also hurt performance (in terms of power and size)
  5. About that same 64bits vector .. it's actually declared as signed. A human would use unsigned when counting from 0 to 16650000.
  6. Just .. crazy logic, below is the combinatoric circuit from the blinker. And below that how someone would have to read that.
  altLet_0 <= std_logic_vector(rotate_left(unsigned(ww_i1),to_integer(to_signed(1,64))));

  altLet_1 <= not ww_i1;

  tmp_tte_rhs_16 <= wild_4;

  tmp_tte_16 <= false when tmp_tte_rhs_16 = 0 else true;

  subjLet_2 <= tmp_tte_16;

  altLet_3 <= altLet_1 when ww1_i2 else
              altLet_0;

  tmp_17 <= to_signed(1,64) when ww2_i3 = to_signed(0,64) else to_signed(0,64);

  wild_4 <= tmp_17;

  bodyVar_5 <= altLet_3 when subjLet_2 else
               ww_i1;

  altLet_6 <= not ww1_i2;

  repANF_7 <= altLet_6 when w_i4 else
              ww1_i2;

  tmp_tte_rhs_19 <= wild_10;

  tmp_tte_19 <= false when tmp_tte_rhs_19 = 0 else true;

  subjLet_8 <= tmp_tte_19;

  altLet_9 <= ww2_i3 + to_signed(1,64);

  tmp_21 <= to_signed(1,64) when ww2_i3 = to_signed(16650000,64) else to_signed(0,64);

  wild_10 <= tmp_21;

  bodyVar_11 <= to_signed(0,64) when subjLet_8 else
                altLet_9;

  repANF_12 <= (product1_sel0 => bodyVar_5
               ,product1_sel1 => repANF_7
               ,product1_sel2 => bodyVar_11);

  topLet_o <= (product0_sel0 => repANF_12
              ,product0_sel1 => ww_i1);
altLet_0 used once, remove. modify statement for `altLet_3`:

  IF ww1_i2 THEN
    altLet_3 <= altLet_1;
  ELSE
    altLet_3 <= std_logic_vector(rotate_left(unsigned(ww_i1),to_integer(to_signed(1,64))));
  END IF;

altLet_1 used once, remove, modify statement for `altLet_3`:

  IF ww1_i2 THEN
    altLet_3 <= not ww_i1;
  ELSE
    altLet_3 <= std_logic_vector(rotate_left(unsigned(ww_i1),to_integer(to_signed(1,64))));
  END IF;

tmp_tte_rhs_16 used once, remove, modify statement for `tmp_tte_16`:

  tmp_tte_16 <= false when wild_4 = 0 else true;

If you keep doing that eventually you will end up with something like this:

  -- perform change
  IF cntr = to_signed(0,64) THEN
    IF mode THEN
      bodyVar_5 <= not leds;
    ELSE
      bodyVar_5 <= std_logic_vector(rotate_left(unsigned(leds),to_integer(to_signed(1,64))));
    END IF;
  ELSE
    bodyVar_5 <= leds;
  END IF;

  -- remember state of keypress to change mode
  IF key1R THEN
    repANF_7 <= NOT mode;
  ELSE
    repANF_7 <= mode;
  END IF;

  -- increment counter
  IF cntr = to_signed(16650000,64) THEN
    bodyVar_11 <= to_signed(0,64);
  ELSE
    bodyVar_11 <= cntr + to_signed(1,64);
  END IF;

  -- output record for the mealy statemachine
  repANF_12 <= (product1_sel0 => bodyVar_5
               ,product1_sel1 => repANF_7
               ,product1_sel2 => bodyVar_11);

  topLet_o <= (product0_sel0 => repANF_12
              ,product0_sel1 => leds);

Now usually the statemachine would also be described in the same component. Then the blinker would probably only have to output the leds and not a record type anymore.

christiaanb commented 8 years ago

128 is related.

Thanks for taking the time for writing this up, I'll take a look at it tomorrow and respond in depth.

Two things:

flip111 commented 8 years ago

I'm on clash version 0.6.8.

I left out the surrounding code like process. I was thinking having a combinatoric process that doesn't respond to clk. I think you can also have IF statements outside of processes. These would then run as concurrent statements. I couldn't find a really good reference on the IF statement. However this page says Whats New in '93 In VHDL-93, the if may have an optional label. The version before that was The initial version of VHDL, designed to IEEE standard IEEE 1076-1987. Concurrent statements have some limitations on reusing the signal, i found this answer explaining it well.

christiaanb commented 8 years ago

I'll answer your points in order:

Issues

1

a shared concern for other developers

Yes, I care, and as you can see in #128, also other care about what the generated VHDL looks. In version 0.6.11 I added changes that tried to generate fewer intermediate variables, and also try to preserve more Haskell names in the VHDL output.

2

whether it could be improved within the clash compiler or if it's better to do it with a seperate program

Well, as 0.6.11 shows, some parts can certainly be improved within the CLaSH compiler.

3

if it can be improved in the clash compiler, where can i find the logic that's relevant for the issue.

So that depends on what you want to solve, for example:

For example make all vhdl keywords capital.

Has to be changed in both the VHDL code-gen: https://github.com/clash-lang/clash-compiler/blob/master/clash-vhdl/src/CLaSH/Backend/VHDL.hs But also in all the primitive/blackbox definitions: https://github.com/clash-lang/clash-compiler/tree/master/clash-vhdl/primitives

I'll try to point out other places later on in my response

4

maybe there is a solution for this already?

There aren't any implemented solutions that I know of, then again, I never really looked for them. Many of the issues you raised can be solved, but it requires engineering effort. If people want the current situation to be improved, then they need to help out by providing patches, as I cannot do everything myself.

More human like

1

It's very very hard to figure out what part of the vhdl code maps back to what in clash. On component name it's not that hard because some names were preserved. Mainly when it comes down to the gates.

OK, so in 0.6.11 I tried to preserve Haskell names in the generated VHDL to make it easier to map back generated VHDL to the original Haskell code. Now, some names are lost due to the GHC Haskell compiler on which CLaSH is based. However, I created a patch for GHC, that will be included in version 8.0, that preserves more names. Once GHC 8 is released, I can base CLaSH on top of GHC 8, and hence have even more Haskell names preserved.

2

Some simulators can debug the vhdl code (set breakpoints, continue). But because of point 1 that's not gonna help you much fixing the clash code.

Well, normally debugging VHDL would only be needed when CLaSH generates incorrect VHDL... which would be a bug in the CLaSH compiler. So you shouldn't ever need to do that as a normal CLaSH user only as a CLaSH developer to locate/fix bugs.

3

VHDL code can not be verified by reading the source, only by wave of test setup.

Again, this is only a problem if the CLaSH compiler is buggy... which it shouldn't be. If CLaSH generates incorrect VHDL, then being able to read the generated VHDL is not going to be of much help.

4 - 6

Yes, those are all valid points which make it somewhat hard to gradually introduce CLaSH in a project. The CLaSH compiler originally treated VHDL as a C-compiler treats assembly: not to be looked at by the avarage user. Also, all the parts of Haskell that don't map straightforwardly to VHDL, such as pattern matching, will remain odd-looking in the generated VHDL. So I don't think we'll ever end up in an ideal situation.

Making things more readable

1

Formatting / pretty printer / code beautifier. For example make all vhdl keywords capital. On first sight i would suspect this would be a good candidate to place in a separate tool. But on second thought ... some information about the original clash code would sure come in handy. A (totally) seperate tool would not have access to the information the compiler has.

I already mentioned earlier where we could change the capitalisation of words. Now, in terms of layout: we need to update our primitive/blackbox template system to be column-position preserving. What I mean by that is, that if we splice in a multi-line piece of VHDL into a template, everything from the 2nd line on shouldn't be flushed to the left as it happens now. This can probably be fixed somewhere in: https://github.com/clash-lang/clash-compiler/blob/master/clash-lib/src/CLaSH/Netlist/BlackBox/Util.hs#L190

2

Simple transformations of source code. For example clash generates std_logic_vector(0 downto 0) and i would just expect std_logic in this case. I suspect because the compiler was written in such a way that it always tries to access bits on an index, so even for a single bit it prefers to do (0).

This stems from the fact that Bit is specified as:

type Bit = BitVector 1

So that all the operations working on BitVector also work for Bit. And so the reason we use std_logic_vector(0 downto 0) is indeed so that all the primitives/blackboxes defined in https://github.com/clash-lang/clash-compiler/blob/master/clash-vhdl/primitives/CLaSH.Sized.Internal.BitVector.json only have to be specified to work on std_logic_vector. If we translated Bit to std_logic, we would have to update all those primitives to work on both std_logic_vector and std_logic.

3

While that might be easy for compiler consistency to have the same "style" applied everywhere it makes for a lot of noise in the source code. Below is a part of an overloaded function toSLV function toSLV (slv : in std_logic_vector) return std_logic_vector is begin return slv; end;

Yeah, that is just laziness on my part I guess, and could be avoided. Code like that is generated because toSLV for vector and product types call toSLV for their elements, but the code-generator doesn't check if the elements of product and vector types are already bit-vector. The code for fixing this could be found in: https://github.com/clash-lang/clash-compiler/blob/master/clash-vhdl/src/CLaSH/Backend/VHDL.hs#L306 and https://github.com/clash-lang/clash-compiler/blob/master/clash-vhdl/src/CLaSH/Backend/VHDL.hs#L325

4

"value inference" (for i can not think of a better term to express this). In the blink example a counter runs up to the value 16650000 there are 24bits needed to express this value. (max 24 bits = 16777216). However since the supplied value was 0 without any further type qualifier, type Int was chosen by default, which is 64bits on my system so i end up with std_logic_vector(63 downto 0). I'm not even sure if synthesis can optimize that away, in that case it would also hurt performance (in terms of power and size)

Yeah... this is a very hard problem. So the thing is, is that numbers default to Integer if there are no other constraints to force them to be another type. And indeed, Integer is 64-bit. Now, trying to figure out that we will never need more than 24-bits for the counter, because that is its cut-off point, is extremely hard to do. So yes, I can see that 16650000 needs only 24-bits, but I also see somewhere x' = x + 1, and it's very hard to figure out that x' wont become larger than 24-bits.

Now, in the blinker example, what you could do to solve this as a uses, is to specify:

cnt_max = 16650000 :: Unsigned 24

And all that type information will propagate. In conclusion, I don't think I can solve the value inference problem.

5

About that same 64bits vector .. it's actually declared as signed. A human would use unsigned when counting from 0 to 16650000.

As I said, unconstrained numbers default to Integer, which is signed.

6

Just .. crazy logic, below is the combinatoric circuit from the blinker. And below that how someone would have to read that.

Ok, some things from that I've solved since 0.6.11, other I havent. One of the problems is for example that Haskell has case-expression (i.e. they can occur to the right of the =), where VHDL only hase case/if/select-statements. It is one of the reasons why case-statements are pulled apart as they are in the generated code. The next thing is, is that Haskell where-clauses and let-expression map to concurrent assignments in VHDL, and not to processes.

Now, maybe some parts could be transformed to processes, but the problem then are the primitives/blackboxes: they expect to be spliced into the concurrent statements world, not the process world.

I guess some parts can be solved as a pre-processing step in the VHDL code-gen, where we consolidate pulled-apart conditional assignments into a single process. This probably also means extending our Netlist type https://github.com/clash-lang/clash-compiler/blob/master/clash-lib/src/CLaSH/Netlist/Types.hs#L108 to allow for nested case/if-statements.

To comment on:

Now usually the statemachine would also be described in the same component. Then the blinker would probably only have to output the leds and not a record type anymore.

Yes, it has been on my todo list to inline higher-order functions such as mealy, instead of specialising them. In that case, the state-machine and the registers would end up in the same component. Currently I prefer specialising, because it allows caching (i.e. when a higher-order function is given the same argument twice, I only have to do the translation once). I have to figure out how I can also do caching for when I inline higher-order functions.

Finally, just to show, this is the VHDL being generated, starting version 0.6.11 for blinkerT:

-- Automatically generated VHDL-93
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use IEEE.MATH_REAL.ALL;
use std.textio.all;
use work.all;
use work.blinker_types.all;

entity blinker_blinkert is
  port(ww     : in std_logic_vector(7 downto 0);
       ww1    : in boolean;
       ww2    : in signed(63 downto 0);
       w      : in boolean;
       result : out blinker_types.tup2);
end;

architecture structural of blinker_blinkert is
  signal case_alt     : std_logic_vector(7 downto 0);
  signal case_alt_0   : std_logic_vector(7 downto 0);
  signal case_scrut   : boolean;
  signal case_alt_1   : std_logic_vector(7 downto 0);
  signal wild         : signed(63 downto 0);
  signal result_0     : std_logic_vector(7 downto 0);
  signal case_alt_2   : boolean;
  signal app_arg      : boolean;
  signal case_scrut_0 : boolean;
  signal case_alt_3   : signed(63 downto 0);
  signal wild_0       : signed(63 downto 0);
  signal result_1     : signed(63 downto 0);
  signal app_arg_0    : blinker_types.tup3;
begin
  case_alt <= std_logic_vector(rotate_left(unsigned(ww),to_integer(to_signed(1,64))));

  case_alt_0 <= not ww;

  case_scrut <= tagToEnum(wild);

  case_alt_1 <= case_alt_0 when ww1 else
                case_alt;

  wild <= to_signed(1,64) when ww2 = to_signed(0,64) else to_signed(0,64);

  result_0 <= case_alt_1 when case_scrut else
              ww;

  case_alt_2 <= not ww1;

  app_arg <= case_alt_2 when w else
             ww1;

  case_scrut_0 <= tagToEnum(wild_0);

  case_alt_3 <= ww2 + to_signed(1,64);

  wild_0 <= to_signed(1,64) when ww2 = to_signed(16650000,64) else to_signed(0,64);

  result_1 <= to_signed(0,64) when case_scrut_0 else
              case_alt_3;

  app_arg_0 <= (tup3_sel0 => result_0
               ,tup3_sel1 => app_arg
               ,tup3_sel2 => result_1);

  result <= (tup2_sel0 => app_arg_0
            ,tup2_sel1 => ww);
end;
flip111 commented 8 years ago

Thanks for your reply

If people want the current situation to be improved, then they need to help out by providing patches, as I cannot do everything myself.

And it's possible with OSS :) I'm a beginner in both VHDL and haskell. I'll start browsing around in the clash code first before thinking of what to do.

In conclusion, I don't think I can solve the value inference problem.

Is it possible to warn for values that were automatically infered as type Integer?

I guess some parts can be solved as a pre-processing step in the VHDL code-gen, where we consolidate pulled-apart conditional assignments into a single process.

What does the compiler pipeline look like? With what you said i think it looks like this:

Templates as string (primitives) -> parse into VHDL AST (using clash source code after it has been optimized) -> pre-processing can happen here -> clash optimizations (on vhdl stage) -> flatten AST to string -> write to file

Maybe it would also be interesting to do post-processing on the VHDL AST after the clash optimizations have run?


I am i correct to say that you would prefer to keep things in the compiler rather than a seperate program? Would you accept compiler flags for formatting, inlining and that sort of stuff?

christiaanb commented 8 years ago

Is it possible to warn for values that were automatically infered as type Integer?

Yes, by using the -fwarn-type-defaults flag

What does the compiler pipeline look like?

There is not VHDL AST, there are only the Netlist data types:

These either get pretty-printed as VHDL, Verilog, or SystemVerilog. The primitives are fully filled in string templates and exist in the Netlist type: https://github.com/clash-lang/clash-compiler/blob/master/clash-lib/src/CLaSH/Netlist/Types.hs#L151

Also, the string templates are not parsed into a VHDL AST, they are just string templates.

Anyhow, I think we can perform the Netlist optimisation just before https://github.com/clash-lang/clash-compiler/blob/master/clash-lib/src/CLaSH/Netlist.hs#L174, where we return the component.

And yes, I would like to keep things in the compiler. With regards to needing a flag for the formatting, inlining, etc. As long as the synthesis results of the generated code do not degrade (i.e. chip surface/luts used increases, clock speed decreases) I don't think there is a need for a flag.

Ericson2314 commented 8 years ago

Templates as string (primitives) -> parse into VHDL AST

I'd like to go to backend AST rather than just straight pretty printing. Now a long time ago, CLaSH did just that, but Christiaan decided it was overkill/overly complex. I see that for CLaSH alone, but if these AST pkgs were something that CLaSH + other projects could use, it might start paying off.

I started doing some work here https://github.com/Ericson2314/verilog https://github.com/Ericson2314/wl-pprint-text-class https://github.com/Ericson2314/clash-compiler/tree/ast when working on a very early version of the verilog back end. https://github.com/christiaanb/vhdl is from the old attempt to do the same with VHDL.

If you would be interested in trying to incorporate those packages for backend ASTs, @flip111, I think it would be a comparatively easy way to get started hacking on CLaSH, and I'd be willing to help / clean up my old attempt :).

flip111 commented 8 years ago

@Ericson2314

I do like to think big and help the discussion along in which directions things could go (and should go?). However as i mentioned before my skillset is fairly limited at the moment. And on top of that some things are not really trivial. But even the more trivial things are hard when not fluent in haskell (for example i never even touched Lens).

Having said that .. i do really like to idea of AST. I've worked with some projects that do code generation, most of them used strings as well. The thing with strings is that it gets you going rather quickly, but you loose the flexibility to transform things later on. AST can help a lot if you want to provide hooks, so things can be made optional, either as flag or as plugin. Usually this is done using some kind of visitor pattern (in OO-talk anyway). Also if you stay with strings and you want to do multiple transformation passes then you would have to manually introduce them in the function that generates the string, so there is no separation of concerns in that case.

Since clash has different backends it would be most rewarding to do any kind of improvement in the clash-AST (or whatever you want to call the layer on the netlist/component/declaration/expression). Though some things would have to be expressed on a per-backend level since they are specific for the target language (VHDL/verilog).

At my job we use a formatter called fmvt (or fvmt, don't know exactly at the moment). It looks pretty old and it mangles some code and comments in a way i'm not happy with. A lot of people just don't use it it all and end up spending a fair amount of time just getting the aligment right. Let it be clear that this does nothing to the functionality and is purely to make code easier to read and to maintain.

So i got to the idea of writing a new formatter (probably using attoparsec). This would help me personally by better understanding both VHDL and haskell (as a learning experience). I would probably start with this before even touching some of the clash code. Diving into a big codebase is an extra difficulty. It's also very rewarding to learn and make a tool that can be used to fix an immediate problem.

Anyway later on i could share my experience with the formatter. Which i probably could extend to change code constructs as well (for example replace "when-else" with "if endif" statements). So then the formatter would have two parts:

  1. Transformations/formatting which can be done only at the VHDL level. Possibly the formatter can be merged into the clash code, distributed with the clash code or just "recommended". Or not even any of those 3, but still good to avoid duplicate code that does the same thing twice.
  2. Transformations which could be done on the level of the clash-AST. Now the formatter can serve as a blueprint of how to do those transformations.

It can help to analyze whether it would really pay of to be using vhdl-AST, or if prettyprinting is sufficient.

@christiaanb

As long as the synthesis results of the generated code do not degrade (i.e. chip surface/luts used increases, clock speed decreases) I don't think there is a need for a flag.

Do you do some kind of regression testing on a (fixed) set of tests to compare compiler versions?

Ericson2314 commented 8 years ago

@flip111 The various backend-agnostic CLaSH IR's are quite nice and I don't see a need to mess with them. It might be possible to do some basic concision optimizations on them, but I am not sure what is to be gained there either.

I agree (atto)parsec is a great place to start---you could fork the VHDL and Verilog repos I linked and work on Parsers/pretty-printers for both of them (while filling in gaps in the AST, there's a lot!). I'm 90% sure neither of those packages use lens or anything too fancy, and the pretty-printing infrastructure is even easier than parsec. In fact you could give each package a demo executable which does just what you say, parse---cleanup---pretty-print. (Cabal allows packages to be 0-1 libraries + n executables.)

If you do that, that will easily boost your Haskell skills, without actually diving into CLaSH itself. Only afterwords---once you are all warmed up---would you need to worry about integrating everything to avoid pretty-print-parsing round trips.

Ericson2314 commented 8 years ago

Oh another neat thing is I can imagine is some sort of "inline assembly" thing were along side functions to be compiled by CLaSH, one can write expressions of one of the backend AST types, which are instead evaluated by CLaSH and spliced in with the other results of compilation.

Any attoparsec parser can also be used by TH allow one to just write normal VHDL/Verilog which is then parsed and turned into such an expression. This could replace the black-boxes jsons perhaps.

christiaanb commented 8 years ago

@flip111 I know there's a big difference between starting a big project or jumping into a big project, but Matthijs and I made the first CLaSH compiler prototype having completed only a single functional programming course (which didn't even use Haskell) 3 years before we started working on CLaSH. So don't worry about how much Haskell you do or do not know.

Now, with regards to parsing, I'm not sure if attoparsec is the right choice. I don't think VHDL has the nicests grammar to work with a parser combinator tool that was meant for speed. I've never build a VHDL parser myself, but I know a friend of mine did (in Java though) and it took him a long time. So investigate more into the different parser options in Haskell before investing too much time into a parser based on attoparsec.

And finally, with regards to regression testing for synthesis results.... no, there are no such regression tests. I sometimes just pick a few circuits that are out there:

@Ericson2314 Incidentally, we have student at our group who is working on haskell-verilog co-simulation for his master's thesis project. Where the ultimate goal would be to have something like http://hackage.haskell.org/package/inline-c, but for Verilog

Ericson2314 commented 8 years ago

@christiaanb Ah excellent! I was thinking of inline-c or ghcjsi when I wrote that comment.