Closed: gkv91 closed this issue 7 months ago
Hi,

Thanks for sharing the work.

Currently, model.get_text_embedding returns one embedding (512-D) per sentence. How can I extract token-level embeddings (i.e., n_tokens x 512-D)?

Thanks,
Goutham

Hi, it is possible! You will need to do a bit of hacking on your own. Specifically, you need to remove the pooler_output here and change it to a key that corresponds to a return shape of B x T x D: https://github.com/LAION-AI/CLAP/blob/main/src/laion_clap/clap_module/model.py#L627
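For concreteness, here is a minimal standalone sketch of what that key swap amounts to. It assumes the text branch at the linked line is a Hugging Face RoBERTa/BERT-style encoder (which is what the pooler_output key implies), and that last_hidden_state is the output key with the B x T x D shape; roberta-base here is just a stand-in model, not CLAP's actual checkpoint:

```python
# Sketch only: demonstrates the two Hugging Face output keys the answer
# refers to, using roberta-base as a stand-in for CLAP's text branch.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
text_branch = AutoModel.from_pretrained("roberta-base")

batch = tokenizer(["a dog barking in the distance"], return_tensors="pt")
with torch.no_grad():
    out = text_branch(**batch)

# What the linked line keeps today: one pooled vector per sentence.
pooled = out["pooler_output"]        # shape: B x D
# The key with the B x T x D return shape: one vector per token.
tokens = out["last_hidden_state"]    # shape: B x T x D

print(pooled.shape, tokens.shape)
```

One caveat to verify in your checkout: the projection head CLAP applies right after this line operates on the last dimension, so it should broadcast over the token axis and give you projected token embeddings of shape B x T x 512, matching the n_tokens x 512 you are after.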