google-deepmind / language_modeling_is_compression

Apache License 2.0

Question about the conversion into byte #12

Closed adnan1306 closed 6 months ago

adnan1306 commented 7 months ago

Dear authors,

I really liked your work and was trying to understand the implementation. I have a quick question: why do we convert the image into a byte string using `.tobytes()` (in https://github.com/google-deepmind/language_modeling_is_compression/blob/a93a5d1679055c7f1101fa1b2db5bbe20d6169bb/data_loaders.py#L112) and then convert it back to integers before encoding (in https://github.com/google-deepmind/language_modeling_is_compression/blob/a93a5d1679055c7f1101fa1b2db5bbe20d6169bb/compressors/language_model.py#L86)?

fazega commented 7 months ago

We convert bytes to integers because that is what we must pass to the transformer model. Each integer is converted into an embedding, which is then passed to the attention+MLP blocks. This is standard decoder-only transformer training. Can you be more specific if I misunderstood your question?
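To make the round trip concrete, here is a minimal sketch (not the repo's exact code; the array values and embedding dimension are illustrative) of how an 8-bit image patch becomes a byte string, then integer token IDs, then embedding vectors:

```python
import numpy as np

# Hypothetical 2x2 grayscale patch with 8-bit pixel values.
patch = np.array([[0, 128], [255, 7]], dtype=np.uint8)

# Step 1: serialize the array to a raw byte string (as in data_loaders.py).
byte_string = patch.tobytes()

# Step 2: convert the bytes back to integers in [0, 255] before feeding
# the model (as in compressors/language_model.py). Iterating over a
# Python bytes object yields ints directly.
token_ids = list(byte_string)  # [0, 128, 255, 7]

# Step 3: each integer indexes a row of an embedding table
# (vocabulary of 256 byte values; embedding dim of 4 is illustrative).
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(256, 4))
embeddings = embedding_table[token_ids]  # one vector per byte

print(token_ids)        # [0, 128, 255, 7]
print(embeddings.shape) # (4, 4)
```

The embedding vectors are then what the attention+MLP blocks operate on.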

adnan1306 commented 6 months ago

Yeah, makes sense! Thanks for the prompt reply!