apple / ml-cvnets

CVNets: A library for training computer vision networks
https://apple.github.io/ml-cvnets

Questions about the file bytes length of ByteFormer #100

Closed DanJun6737 closed 10 months ago

DanJun6737 commented 10 months ago

Hello!

I read the ByteFormer paper, and it's a really amazing piece of work! I also read the code that encodes file bytes: https://github.com/apple/ml-cvnets/blob/main/data/transforms/image_bytes.py We applied the code to our own datasets, but found that with the "png" or "jpeg" encoding mode, the file byte length differs from sample to sample. How can we handle this variation in file byte length?

Thanks a lot!

DanJun6737 commented 10 months ago

@mchorton

mchorton commented 10 months ago

Hi @DanJun6737,

For JPEG: Our collate functions handle variable-length samples. See here: https://github.com/apple/ml-cvnets/blob/main/data/collate_fns/byteformer_collate_functions.py
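As a rough illustration of what handling variable-length samples in a collate step can look like, here is a minimal plain-Python sketch that pads every byte sequence in a batch to the length of the longest one. It is not the code in the linked file: the function name `pad_byte_sequences` and the `pad_value` of -1 are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple


def pad_byte_sequences(
    samples: Sequence[bytes], pad_value: int = -1
) -> Tuple[List[List[int]], List[int]]:
    """Pad each byte sequence to the longest length in the batch.

    Hypothetical helper for illustration. Returns the padded batch and the
    original lengths, so padding can be masked out downstream.
    """
    lengths = [len(sample) for sample in samples]
    max_len = max(lengths, default=0)
    # Iterating over a bytes object yields integers in [0, 255], so a
    # negative pad_value can never collide with a real byte value.
    batch = [
        list(sample) + [pad_value] * (max_len - len(sample))
        for sample in samples
    ]
    return batch, lengths


# Two made-up byte strings of different lengths.
batch, lengths = pad_byte_sequences([b"\xff\xd8\xff", b"\xff\xd8\xff\xe0\x00"])
```

Returning the original lengths alongside the padded batch lets a model ignore the padded positions.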

For the PNG experiments, the inputs should all be the same length. Yours aren't because of a bug: compress_level was not being passed as a kwarg. As mentioned in our paper, we deactivate zlib compression in our experiments, and the bit of code that does this was missing. Note that the experimental results in our paper were not affected (the bug was introduced later, when refactoring the code).
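To see why deactivating zlib compression makes lengths consistent, here is a small sketch that uses only Python's standard `zlib` module rather than the repository's transform. It compresses two payloads of identical size, one constant and one pseudo-random, at the default level and at level 0. The payload size and the helper name `compressed_lengths` are assumptions for illustration. In Pillow, the corresponding option is the `compress_level` argument when saving a PNG.

```python
import random
import zlib


def compressed_lengths(payloads, level):
    """Return the zlib-compressed length of each payload at a given level."""
    return [len(zlib.compress(payload, level)) for payload in payloads]


# Two payloads of the same size, standing in for the raw pixels of two
# same-sized images (hypothetical 32x32 RGB).
num_bytes = 32 * 32 * 3
rng = random.Random(0)
flat = bytes(num_bytes)  # all zeros, highly compressible
noisy = bytes(rng.getrandbits(8) for _ in range(num_bytes))  # incompressible

# With compression on, the output length depends on the content.
default_lengths = compressed_lengths([flat, noisy], level=6)

# At level 0 the data is stored uncompressed, so the output length depends
# only on the input length.
stored_lengths = compressed_lengths([flat, noisy], level=0)

print("level 6:", default_lengths)
print("level 0:", stored_lengths)
```

The level 0 outputs are slightly larger than the raw payload because of the zlib header, block headers, and checksum, but they are identical in length for equally sized inputs.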

The fix is here: https://github.com/apple/ml-cvnets/commit/fc62a84d98731764d9ef2ce284c4cdcab05fea4a . Please make sure to apply it to your experiments. You can just pull the update that I pushed to the main branch.

DanJun6737 commented 10 months ago

Thank you very much for your answer.