drom / LEB128

Little Endian Base 128 converters
MIT License

unpack interface #3

Open drom opened 7 years ago

drom commented 7 years ago

@piranna I was thinking about a useful Verilog interface for the unpack modules.

Most probably the byteStream would be located in some sort of memory with word access. Practical word widths (NDI in the interface below) are 4, 8, 16, or 32 bytes.

The stream would have an initial pointer ai of width WA (2, 3, 4, or 5 bits, matching those word widths) that points to the first byte of the LEB128-encoded number.

The module would produce a single word of fixed width (NDO bytes) and the updated pointer ao.

Here is the proposed Verilog interface:

module unpack #(
    parameter NDI = 8, // 8 Byte input bus
    parameter WA  = 3, // 3 bit pointer
    parameter NDO = 4  // 4 Byte output bus
)(
    input  [8 * NDI - 1 : 0] di, // input stream data bus
    input  [WA - 1 : 0]      ai, // input pointer
    output [8 * NDO - 1 : 0] do, // unpacked data
    output [WA - 1 : 0]      ao, // updated pointer
    output                gluon  // one more word is needed to finish unpack
);

Example 1 (unpack_36_64_u32):

// cycle 1
di = 64'h xx_xx_26_8E_E5_xx_xx_xx;
ai = 3'b011;
...
do = 32'h 00098765;
ao = 3'b110;
gluon = 0;

Example 2 (unpack_36_64_u32):

A two-cycle unpack operation because of misalignment.

// cycle 1
di = 64'h 8E_E5_xx_xx_xx_xx_xx_xx;
ai = 3'b110;
...
do = 32'h 00000765;
ao = 3'b000;
gluon = 1;

// cycle 2
di = 64'h xx_xx_xx_xx_xx_xx_xx_26;
ai = 3'b000;
...
do = 32'h 00098765;
ao = 3'b001;
gluon = 0;
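
A minimal sketch of how one pass over a single input word could behave, in plain Verilog-2001 (do is a reserved word in SystemVerilog). It is not the project's implementation: the module name unpack_word_sketch is hypothetical, it assumes NDI == 2**WA so that the pointer wraps at the word boundary, and it treats ao as the index of the next unread byte, as in example 2. It only unpacks the bytes visible in the current word; merging the partial value into the next cycle (cycle 2 of example 2) would need state that the proposed port list does not have, so that part is left out.

```verilog
// Hypothetical sketch: one combinational pass over a single input word.
// Assumes NDI == 2**WA and an unsigned value.
module unpack_word_sketch #(
    parameter NDI = 8, // 8 Byte input bus
    parameter WA  = 3, // 3 bit pointer
    parameter NDO = 4  // 4 Byte output bus
)(
    input      [8 * NDI - 1 : 0] di, // input stream data bus
    input      [WA - 1 : 0]      ai, // input pointer
    output reg [8 * NDO - 1 : 0] do, // bits unpacked from this word only
    output reg [WA - 1 : 0]      ao, // pointer to the next unread byte
    output reg                gluon  // one more word is needed to finish unpack
);
    integer i, b, n;
    reg     active;

    always @* begin
        do     = 0;
        n      = 0;    // bytes consumed so far
        active = 1'b1; // still inside the number
        for (i = 0; i < NDI; i = i + 1) begin
            if (active && (i >= ai)) begin
                // copy the 7 payload bits, dropping any that do not fit in do
                for (b = 0; b < 7; b = b + 1)
                    if (7 * n + b < 8 * NDO)
                        do[7 * n + b] = di[8 * i + b];
                active = di[8 * i + 7]; // MSB set: another byte follows
                n      = n + 1;
            end
        end
        ao    = ai + n; // truncated to WA bits, so 6 + 2 wraps to 0
        gluon = active; // the word ended in the middle of the number
    end
endmodule
```

Driven with cycle 1 of example 2 (ai = 3'b110), this sketch gives do = 32'h00000765, ao = 3'b000 and gluon = 1.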

piranna commented 7 years ago

This would force it to take several cycles... I don't like that. It would save some transistors, but I think it's better to use a single decoder as wide as the largest number you are going to receive and decode it in just one cycle. I find yours useful, but it complicates the interface, so maybe it would be a good idea to have a collection of decoders in the project instead?

drom commented 7 years ago

What is your use case? How do you get your input stream? How do you align the data before unpacking?

drom commented 7 years ago

Actually, if you tie the ai pointer to 0 and make the input data bus di wider than the output do, you can guarantee single-cycle execution. gluon in this case is an error (or carry) flag indicating that there is more data outside the input window. That reminds me that we may need another flag, overflow, to indicate that the input byte sequence is longer than fits into the output word format.
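
A sketch of that single-cycle configuration, in plain Verilog-2001 and under stated assumptions: the module name unpack_1cycle is hypothetical, the pointer is fixed at byte 0 so ai and ao are dropped, the value is unsigned, and overflow is interpreted here as "a non-zero payload bit did not fit into do", which is only one possible reading of the flag described above.

```verilog
// Hypothetical sketch: single-cycle unpack with the pointer tied to byte 0.
module unpack_1cycle #(
    parameter NDI = 8, // input window, in bytes
    parameter NDO = 4  // output word, in bytes
)(
    input      [8 * NDI - 1 : 0] di,       // window; first LEB128 byte is di[7:0]
    output reg [8 * NDO - 1 : 0] do,       // unpacked unsigned value
    output reg                   gluon,    // number continues past the window
    output reg                   overflow  // a set payload bit did not fit in do
);
    integer i, b;
    reg     active;

    always @* begin
        do       = 0;
        overflow = 1'b0;
        active   = 1'b1; // still inside the number
        for (i = 0; i < NDI; i = i + 1) begin
            if (active) begin
                for (b = 0; b < 7; b = b + 1) begin
                    if (7 * i + b < 8 * NDO)
                        do[7 * i + b] = di[8 * i + b]; // payload bit fits
                    else if (di[8 * i + b])
                        overflow = 1'b1;               // payload bit is lost
                end
                active = di[8 * i + 7]; // MSB set: another byte follows
            end
        end
        gluon = active; // last window byte still had its MSB set
    end
endmodule
```

For the window to always cover a whole number, NDI has to be at least ceil(8 * NDO / 7) bytes: 5 bytes for a 32-bit output and 10 bytes for a 64-bit output.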

piranna commented 7 years ago

What is your use case?

As I told you, I'm implementing WebAssembly on an FPGA, so I know in advance that varint numbers will not be longer than 5 or 10 bytes for 32 and 64 bits, respectively.

How do you get your input stream?

I don't have an input stream... I just read the bytes from a ROM and wire them directly to the decoder.

How do you align the data before unpacking?

My ROM implementation does that for me :-P