Open dbsdmlgus50 opened 5 months ago
Thank you for your question. The codebook is controlled by two parameters, namely 'seg_token_num' and 'image_feature_scale_num'. The product of these two parameters determines the number of tokens to be added to the vocabulary of the LLM. This functionality is implemented around line 169 in the 'train_ds.py' file. The codes in the codebook used for segmentation are all randomly initialized.
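For anyone following along, here is a minimal sketch of what the reply above describes: the number of added tokens is the product of the two parameters, and each added token gets a randomly initialized code. This is not the code from 'train_ds.py'. The helper names (`build_seg_tokens`, `extend_vocab`, `init_codebook`), the `[SEG_s_i]` token format, the embedding dimension, and the 0.02 standard deviation are all assumptions made for illustration, and a plain dict stands in for the real tokenizer.

```python
import random


def build_seg_tokens(seg_token_num, image_feature_scale_num):
    """Return one token string per (scale, index) pair.

    The "[SEG_s_i]" naming is hypothetical, not taken from the repository.
    """
    return [
        f"[SEG_{scale}_{idx}]"
        for scale in range(image_feature_scale_num)
        for idx in range(seg_token_num)
    ]


def extend_vocab(vocab, new_tokens):
    """Add tokens to a token->id dict and return the id of each token.

    Assumes the existing ids are contiguous and start at 0, so the next
    free id is len(vocab). Tokens already present keep their id.
    """
    ids = []
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
        ids.append(vocab[tok])
    return ids


def init_codebook(num_codes, dim, seed=0, std=0.02):
    """Randomly initialize one embedding vector (code) per added token."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_codes)]


# Example values only; the real ones come from the training arguments.
seg_token_num = 3
image_feature_scale_num = 2

tokens = build_seg_tokens(seg_token_num, image_feature_scale_num)
vocab = {"<s>": 0, "</s>": 1}  # toy stand-in for the LLM vocabulary
token_ids = extend_vocab(vocab, tokens)
codebook = init_codebook(len(tokens), dim=4)

print(len(tokens))  # 6, i.e. seg_token_num * image_feature_scale_num
```

In the actual training script the same idea would go through the tokenizer and the model's embedding table rather than a dict and nested lists.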
Is the codebook used as LLM input, or is it only used by the mask decoder?
@MaverickRen Do 'seg_token_num' and 'image_feature_scale_num' correspond to the variables N and L in the paper, respectively? Thank you!
Thank you for conducting and sharing such good research!
I couldn't find anything in the current code that corresponds to the codebook. Is there code for the codebook somewhere? I also have an additional question about the codebook.
Is the codebook a pre-constructed element, or is it learned from the images?
Thank you.