Allow diffusion to gap character in OADM models?

In the generation code for the OADM models, we are explicitly preventing the sampling of gap characters (e.g., https://github.com/microsoft/evodiff/blob/main/evodiff/generate_msa.py#L211).

Is this necessary, or is it an explicit choice to sample only fixed-length sequences given a sub-sampled MSA? Given that MASK and GAP are distinct tokens, and that the ground-truth alignment of the query sequence can contain gap tokens, it seems that allowing diffusion to GAP would be desirable.
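To make the question concrete, here is a minimal sketch of the two behaviours I'm comparing. It is not the evodiff code: the toy alphabet, the `sample_token` helper, and the `allow_gap` flag are all hypothetical, and I'm assuming MASK is always excluded from sampling.

```python
import math
import random

# Hypothetical toy alphabet; the real tokenizer has its own vocabulary and indices.
ALPHABET = ["A", "C", "D", "E", "-", "#"]  # "-" stands for GAP, "#" for MASK
GAP, MASK = "-", "#"


def sample_token(logits, allow_gap, rng):
    """Sample one token from logits; MASK is always banned, GAP only if allow_gap is False."""
    banned = {MASK} if allow_gap else {MASK, GAP}
    # Zero weight for banned tokens, unnormalised softmax weight for the rest.
    weights = [
        0.0 if tok in banned else math.exp(logit)
        for tok, logit in zip(ALPHABET, logits)
    ]
    return rng.choices(ALPHABET, weights=weights, k=1)[0]


rng = random.Random(0)
logits = [0.1, 0.2, 0.0, -0.3, 2.0, 1.0]  # made-up values where the model favours GAP

no_gap = [sample_token(logits, allow_gap=False, rng=rng) for _ in range(200)]
with_gap = [sample_token(logits, allow_gap=True, rng=rng) for _ in range(200)]
```

With `allow_gap=False` the query can only ever be filled with residues, so its ungapped length equals the alignment length. With `allow_gap=True` the model could place gaps where the alignment supports them.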

I noticed that in the conditional MSA generation this check isn't present (https://github.com/microsoft/evodiff/blob/main/evodiff/conditional_generation_msa.py#L605C1-L605C102), despite the comment's claim. I would like to better understand what the difference is here.
