There are many occasions where the halo2 columns contain small values even though we're working with big fields. This is the case of:

selector columns (either fixed, or dynamic selectors)
bit / byte columns (for example, there are many of them in keccak and sha256 circuits
small value columns (for example, columns holding tags)
emulated field columns (may be 64 bit columns in halo2wrong, or in the zkEVM we have a lot of 128bit column because we split Ethereum Words into 2 cells)

If we can have a dynamic type that can hold a vector of field elements specialized for a certain size, we can reduce the memory usage. This has the following benefits:

Lower peak memory. I'm not sure how impactful this will be, because when we go to the extended domain we're back to full field values
Speed increase, due to better caching from lower memory footprint
Possibility of specialized MSM (I'm not sure if it also applies to FFT) for scalars of a particular bit size.
For external GPU operations, transferring the vector for an MSM can be an expensive operation, so reducing the size of the vector can be very beneficial.

On the possibility of specialized MSM, I did an experiment here: https://github.com/privacy-scaling-explorations/halo2curves/blob/4aa0c00f716cfe2f67ee4d421df305516f7c6e66/src/msm.rs#L476 And these are the results for columns that hold 8 bit values. The improvement is significant:

cyclone k=22 ..............................................................236.314ms
older k=22 ................................................................194.241ms
older_small k=22 ..........................................................4.898ms

NOTE1: It would be ideal to repeat these tests after we get https://github.com/privacy-scaling-explorations/halo2/pull/202 to see if the performance improvement is still significant. NOTE2: The experiment uses coefficients in normal form. If they were Montgomery, a transformation to normal form would be needed and the MSM would be slightly slower (but still much faster than the current one).

Realizing this requires changes in the backend and possibly in the frontend (and middleware). Here I describe some ideas

Vector of Prime Fields trait

Introduce a trait (for example called VecPF) for a Vector of PrimeFields that is generic on F: PrimeField. This trait would support indexing and mut indexing operations, and each value should be accessible as a little endian slice of bytes representation. The trait would also include a method or associated constant that tells how many bytes each element occupies.

Then the witness and fixed columns would turn from Vec<Vec<F>> to Vec<dyn VecPF<F>>, so that we can mix columns of different bit sizes.

Now the MSM would receive &dyn VecPF<F> or be generic on VecPF<F>.

Frontend

The first idea that came to my mind is that a circuit developer specifies the bit size of a column. David pointed out a possible problem here: the developer may be confused and believe that this is specifying a range constraint on the column. This would be specially problematic if the assignment method rejects values over the bit size, as it would confirm the developer intuition of the range check. One solution I thought was to instead extend the API and introduce hints. So a developer may hint that a column holds 8 bit values. Then internally the column would start as 8 bits, but upon assignment of bigger values it would switch to 256 bits silently; this way it's clear for the developer that this is not a range constraint. Furthermore we could add some warnings via the MockProver that report this situation (where hints contradict the assignments).

Another option is that at prove time, before calculating MSMs we scan each column to determine the bit width and convert them accordingly.

Other challenges

Blinding factors need to be the full size, so on a smaller bit-width column they will not fit. My proposal for this is to split the MSM in two parts, one that commits the small field values, and then one that continues with the blinding factors.

Addendum

Currently the PrimeField trait has the to_repr().as_ref() way to access the bytes of the element. The current implementation of the MSM assumes the encoding is little endian. Is there any situation were we may need big endian? If not, I think it would be great if we specify that all representations are little endian and finish the ambiguity that we have now. Moreover if we know the internal representation is little endian, we can get a little endian as slice of bytes (without copying) from an array of u64 (as long as the machine is little endian, which is the case for current popular architectures) by doing transmute from &[u64; 4] to &[u8; 32]. Of course this needs forking ff, and we would also resolve https://github.com/zkcrypto/ff/issues/106

Edit: I had a misunderstanding of the internal representation of our prime fields. Since the internal representation is Montgomery, there's no way to get a free transmute to get a slice of the little endian integer, it must be first transformed to normal form.

privacy-scaling-explorations / halo2

Proposal: Specialized MSM and Column vectors for small values #315

Vector of Prime Fields trait

Frontend

Other challenges

Addendum