mbuesch / crcgen

Generator for CRC HDL code (VHDL, Verilog, MyHDL)
https://bues.ch/h/crcgen
GNU General Public License v2.0
28 stars 8 forks source link

wider data input - feature request #1

Closed s3bs closed 3 years ago

s3bs commented 4 years ago

found this page via https://bues.ch/cms/hacking/crcgen

seems like all the generated code always provides a 8bit data input and N bit crc input.

It would be super useful to be able to also specify the width of the data input as varible. A lot of times using verilog you'll process 32 bits at a time or even 64 bits at a time.

mbuesch commented 4 years ago

Thanks for your feature request.

Yes indeed. Crcgen currently does not generate code for multi-byte generators. It would be possible to add that.

However you can chain the generated 8 bit Verilog function in your user code to get combinatorial multi-byte crc generation. Or you can write a wrapper function that does exactly that.

paul-demo commented 3 years ago

A very helpful website @mbuesch ! I found your CRC generator to work while the one over at outputlogic.com gave incorrect results (probably due to some user error, I just can't figure it out). For @s3bs, here's how to chain it to get wider words (verified using 5G-NR CRC codes against Matlab's nrCRCEncode function).

I do however want to ask @mbuesch if you can support below 8b CRCs while retaining 8b data words. This is because 5G/NR (and I think 4G/LTE) specify a CRC-6 and your generator only goes down to 8b width unfortunately. I would use outputlogic.com's generator but after many frustrating hours I can't figure out how to get it to produce the same result as the Matlab reference implementation.

I stripped down this example from my highly parameterized code to make it more tractable. The "_impl" cells are just what @mbuesch's code spits out. It's untested in this form, but should give you the idea of how to chain the 8b CRCs together to make wider ones. This implementation also allows arbitrary length (as long as a multiple of one byte) due to the tkeep at the input. The revbitorder cell just reverses the bits in a word; this may be skipped depending on how you are interpreting your input data, but in my case, the 32b input words are assumed to be {[a31, a30, ..., a1, a0], [a63, a62, ..., a33, a32], ...} on the input AXI stream.

module crc_5g (

  input logic clk,
  input logic reset,

  // Input
  input logic [3:0][7:0] din_tdata,
  input logic [3:0] din_tkeep,
  input logic din_tlast,
  input logic din_tvalid,

  // Output
  output logic [23:0] crc_tdata,
  output logic crc_tvalid
  );

  logic [3:0][23:0] crcIn, crcOut, crcOut_rev;
  logic [23:0] crc_d = '0;
  assign crcIn[0] = crc_d;

  generate
    genvar i;
    // Connect individual 8b-wide instances
    for (i = 1; i < 4; i++) begin
      assign crcIn[i] = crcOut[i-1];
    end

    for (i = 0; i < 4; i++) begin
      revbitorder #(.WIDTH(24)) crcOut_reverser (.in(crcOut[i]), .out(crcOut_rev[i]));
      crc_5g24a_impl crc_5g24_impl_inst (
        .crcIn(crcIn[i]),
        .data(din_tdata[i]),
        .crcOut(crcOut[i])
      );
    end
  endgenerate

  // Keep track of state
  always_ff @(posedge clk) begin
    if (reset | (din_tvalid & din_tlast))
      crc_d <= '0;
    else if (din_tvalid)
      crc_d <= crcOut[3];
  end

  // Drive output in final beat (otherwise zero)
  always_comb begin
    if (din_tvalid & din_tlast & ~reset) begin
      crc_tvalid = '1;
      case (din_tkeep)
        1: begin crc_tdata = crcOut_rev[0]; end
        3: begin crc_tdata = crcOut_rev[1]; end
        7: begin crc_tdata = crcOut_rev[2]; end
        default: begin crc_tdata = crcOut_rev[3]; end
      endcase
    end else begin
      crc_tvalid = '0;
      crc_tdata = '0;
    end
  end

endmodule

EDIT: I ended up removing the revbitorder in my subsequent version, just as a note to anyone using this as a reference.

mbuesch commented 3 years ago

Thanks @paul-demo for the example code. Chaining is a good approach to process multiple bytes at once.

About the 6-bit CRC: It certainly is possible to modify the generator to support less than 8 bits CRC. I think it currently has some hard coded assumptions built in that currently require 8 bits minimum. But that could be fixed. The generator logic itself actually is quite simple (although I admit it might look complex). It basically works like this: The generator runs the common bitwise 8-bit loop and collects all XOR terms. It then chains all those terms. And finally it removes all redundant terms that would cancel out each other. Therefore the generator is nothing more than the simple bitwise CRC implementation that we all know. With just one minor modification: It replaces the usual if-condition with XOR logic.

So, do you have a reference implementation of the 6-bit CRC at hand? It would be nice to have a bitwise implementation (e.g. in C or any other language) and also maybe an optimized (e.g. table based) implementation for verification.

mbuesch commented 3 years ago

I started by adding a branch https://github.com/mbuesch/crcgen/tree/flexiblesizes to experiment with flexible data word (!= 8) and crc word (<8) sizes. For first commit just extends the reference implementation used for testing the generator to work with data words != 8 bits. I think that the reference implementation should now work with 6-bit CRC. However, I didn't verify that against a real world 6-bit CRC implementation, yet.

paul-demo commented 3 years ago

Thanks! I will try out the <8 CRC size one. I actually prefer the byte-size-data restriction since that works nicely with AXI RTL designs which are byte-addressable (ex: using tkeep in AXI4-S or tstrb in AXI4). It isn't clear how to "short circuit" the final transfer if you're using a wider implementation like 32b, so I think that chaining byte-size blocks method I showed above is required if your data may be non-word-sized, ie if it has some straggler bytes.

For FPGA/ASIC I don't think it is necessary to optimize the number of terms in the expressions by building an optimized table, because the synthesis tool will mangle the terms and may split / combine them however it likes.

For verification, I have been comparing against nrCRCEncode with a variety of data lengths that are a multiple of a single byte. I will pull that flexiblesizes branch and verify that the CRC-6 (3GPP / ETSI 38.212) (x^6 + x^5 + 1) matches Matlab's implementation in my 32b wrapper above. I'm using Verilator for unit tests, so it's pretty quick to do that.

I'm sure there are a thousand other reference implementations, but as is apparent on Wikipedia, every CRC implementation and paper author has made their own assumptions for 50 years and so there is not a single standard convention to compare against. The 3GPP one makes the most sense to me and is basically this definition:

gf([a0, a1, ..., a_A-1, p0, p1, ... p_P-1]) / gf([x0, x1, ..., x_P-1]) -> gives a remainder of zero when parity bits {p} are computed correctly for data bits {a} and CRC polynomial {x}. In the case of Matlab's crcNREncode function, it is just a wrapper for that Galois division that picks the polynomials in the 3GPP standard, and pads with zero bits [a0, a1, ..., a_A-1, 0, ..., 0] to perform the division (24b of padding in the case of a 24th-order polynomial x^24 + x^23 + ... + 1). I'm sure this is a common convention, but I've found that it is not universal.

mbuesch commented 3 years ago

Yes, sure. The optimization is not strictly necessary. The synthesized result will probably be similar or even identical. However it makes the generated source code much nicer.

The code generator currently does not support CRC < 8. I just changed the reference implementation, which is used for unit testing the code generator. So there's currently no need for you to test anything.

But I'll see if I can also extend the code generator to support CRC < 8 and data != 8. I'll keep you updated.

paul-demo commented 3 years ago

Ah, I'm seeing that now. Looks like generator.CrcGen.__gen() would have to be significantly rewritten.

mbuesch commented 3 years ago

Ok, so I rewrote the generator and pushed it to the branch https://github.com/mbuesch/crcgen/tree/flexiblesizes . The generator should now support arbitrary CRC bit lengths and arbitrary input data word lengths.

Please feel free to test it. If you think this is fine, I'll push this to the master branch and also deploy it to the online generator on the website.

paul-demo commented 3 years ago

Thanks so much, that was an impressive turnaround time. I have confirmed that it works as expected for 8b data, 6b crc width in the context of my usage. I tested using x^6 + x^5 + 1 and all data lengths 1..8 bytes using my 32-bit wrapper of the 8b calculation.

mbuesch commented 3 years ago

Cool. Thanks for testing. I'll get it released asap.

mbuesch commented 3 years ago

crcgen-2.0 has been released. All features discussed in this issue have been implemented.