Is your feature request related to a problem? Please describe.
The request is to add multimodal capability to Ludwig using llama-next.
Describe the use case
The basic use case is adding image and video inputs alongside text in the LLM workflow, for both training and inference.
Describe the solution you'd like
This PR will add the shell (scaffolding) code to embed multimodal functionality into Ludwig, building on the code that is already there.
Describe alternatives you've considered
N/A
Additional context
None