-
-
### Problem
I would like to build a tool that submits text and images to the OpenAI endpoints, so that I can implement some content moderation.
The Vision API is specified [here](https://platform.…
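For reference, a minimal sketch of submitting text and an image together in one request, assuming the openai Python SDK (v1+), a vision-capable model such as gpt-4o, and an illustrative moderation prompt (the prompt wording and file name are placeholders, not part of the original question):

```python
# Sketch only: model name, prompt wording, and image MIME type are assumptions
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def moderate(text: str, image_path: str) -> str:
    # Encode the local image as a base64 data URL, as the vision endpoint expects
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": f"Review the following text and image for policy violations:\n{text}"},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }
        ],
    )
    return response.choices[0].message.content

print(moderate("user submitted caption", "upload.jpg"))
```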
-
![image](https://github.com/user-attachments/assets/73b7531d-c30e-4841-9b86-d8d8e2c97357)
In the open-source code, only multi_modal_get_item contains dynamic_preprocess2.
1. Does minimonkey support multi-image fine-tuning?
2. If I want to modify the code to fine-tune on multiple images, should multi…
-
### Feature request
Implement support for a pipeline that can take both an image and text as inputs and produce a text output. This would be particularly useful for multi-modal tasks …
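A minimal sketch of how such a pipeline might be invoked, assuming a transformers-style `pipeline` interface with an `image-text-to-text` task name and the llava-hf/llava-1.5-7b-hf checkpoint (both the task name and the checkpoint are illustrative assumptions, not the final API):

```python
# Sketch only: task name, checkpoint, and message format are assumptions
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="llava-hf/llava-1.5-7b-hf")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/sample.png"},
            {"type": "text", "text": "Describe what is happening in this image."},
        ],
    }
]

# The pipeline consumes an image plus a text prompt and returns generated text
outputs = pipe(text=messages, max_new_tokens=64)
print(outputs[0]["generated_text"])
```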
-
Submitting Author: Tharsis Souza (@souzatharsis)
Package Name: podcastfy
One-Line Description of Package: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with Gen…
-
I hope you're well. Please share your valuable insights. I really appreciate it.
I have a Python function that sends images to LM Studio, and it works well when it is called from a test script.
But…
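A minimal sketch of that kind of function, assuming LM Studio's OpenAI-compatible server on its default http://localhost:1234 port and a vision-capable model loaded there (the model name and image MIME type below are placeholders):

```python
# Sketch only: endpoint, model name, and image MIME type are assumptions
import base64
import requests

LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio default

def describe_image(image_path: str, prompt: str) -> str:
    # Encode the local image as a base64 data URL
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    payload = {
        "model": "local-model",  # placeholder; use the identifier LM Studio shows
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                ],
            }
        ],
    }
    response = requests.post(LMSTUDIO_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```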
-
How do I train the model? I tried, but it would not converge. Thank you very much if you can share the concrete details.
-
Update the following examples/demos that use the deprecated modal to use the new one:
- Label
- Wizard
- Multi file upload
- Date picker
-
![image](https://github.com/user-attachments/assets/e2af91d7-bd06-409d-8abd-fed0d766c213)
Hi, I was wondering if nnUNet supports 4-channel input. My task is to generate a single-channel output (seg…
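nnU-Net v2 handles multi-channel input by listing each channel in dataset.json and storing each channel as a separate file with the _0000, _0001, … suffix. A minimal sketch of such a dataset.json for four input channels and a single foreground label (channel names and the case count below are placeholders, not values from this post):

```python
# Sketch only: channel names, label name, and numTraining are placeholders
import json

dataset = {
    # One entry per input channel; nnU-Net expects the images on disk as
    # case_0000.nii.gz, case_0001.nii.gz, case_0002.nii.gz, case_0003.nii.gz
    "channel_names": {
        "0": "channel0",
        "1": "channel1",
        "2": "channel2",
        "3": "channel3",
    },
    # Background plus a single foreground class gives a one-channel segmentation
    "labels": {
        "background": 0,
        "target": 1,
    },
    "numTraining": 100,
    "file_ending": ".nii.gz",
}

with open("dataset.json", "w") as f:
    json.dump(dataset, f, indent=2)
```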
-
I tried the multimodal code example with gpt-4o, and the output bears no relation to the provided image, even though the code does not raise any error.