If I want to use an industrial sensor as one of the modalities in this multi-modality model, how do I get started? For example, replacing the audio modality with an industrial sensor signal (analog-to-digital converted), given that I have corresponding images to pair with it.
Hi there, I recommend checking out our project ViT-Lens. We open-sourced the training code, and you can look at the audio part as a starting point for your custom application.
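As a rough starting point, here is a minimal sketch of one way to turn a 1-D digitized sensor trace into a 2-D time-frequency grid, so it has the same kind of shape as a spectrogram. It assumes the audio branch consumes spectrogram-like inputs, which should be checked against the actual preprocessing in the ViT-Lens repo. The function names, frame length, hop size, and the synthetic 50 Hz signal are all hypothetical and not taken from the project.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a 1-D signal into overlapping frames (drops the incomplete tail)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def log_spectrogram(samples, frame_len=64, hop=32):
    """Return a frames x (frame_len // 2 + 1) grid of log-magnitudes."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    spec = []
    for frame in frame_signal(samples, frame_len, hop):
        windowed = [s * w for s, w in zip(frame, window)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for bin k, kept dependency-free for illustration;
            # an FFT routine would be the practical choice for real data.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(windowed))
            # Small offset avoids log(0) on silent bins.
            row.append(math.log(abs(acc) + 1e-6))
        spec.append(row)
    return spec


# Hypothetical sensor trace: a 50 Hz tone sampled at 1 kHz, 512 samples.
sample_rate = 1000
signal = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(512)]
spec = log_spectrogram(signal)
```

Each row of `spec` is one time frame and each column one frequency bin, so the grid can be paired with its corresponding image in the same way an audio clip would be. Whether a spectrogram is the right representation depends on the sensor; slowly varying signals may work better as raw framed windows.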