multi-modal Search Results

open-mmlab/OpenPCDet #1681

How to implement multi-modal PointPillar

I would like to use PointPillar and ResNet to fuse image and point cloud features, but I don't know how to implement the code, and I request a code reference.

JunjieWang0528 updated 1 week ago

karthink/gptel #459

Multi-modal context management within gptel.

First up I want to say: *GPTel is fantastic - it accelerates my Emacs workflow no end.* I want to thank you for creating this tool, in the way you have; lightweight and seamless, across the pano…

metachip updated 2 days ago

vllm-project/vllm #10114

[RFC]: Merge input processor and input mapper for multi-moda…

## Motivation ### Background To provide more control over the model inputs, we currently define two methods for multi-modal models in vLLM: - The **input processor** is called inside `LLMEngi…

DarkLight1337 updated 3 days ago

microsoft/semantic-kernel #9806

.Net: New Feature: MistralAI - Multi-Modal Chatcompletion Su…

--- name: MistralAI - Multi-Modal Chatcompletion Support about: Add support for multimodal chat completion for the MistralAI connector. --- Hey there, i played a bit with the .Net Connector fo…

tntwist updated 21 hours ago

xp1632/DFKI_working_log #75

`MLMM: Multi Modal Large Language Model`

- Here's the summary of consulting a LLM specialist: --- - We have an initial thought in #74 as follows: ![image](https://github.com/user-attachments/assets/265a3d7d-0454-4e7b-9c99-a0dd9f9ecf7c…

xp1632 updated 4 days ago

vllm-project/vllm #4194

[RFC]: Multi-modality Support Refactoring

[[Open issues - help wanted!]](https://github.com/vllm-project/vllm/issues/4194#issuecomment-2102487467) **Update [11/18] - In the upcoming months, we will focus on performance optimization for mul…

ywang96 updated 3 days ago

Arize-ai/openinference #495

🗺️ Vision / multi-modal

GPT 4o introduces a new message type that contains images and coded as either URL or base64 encoded. example: ```python from openai import OpenAI client = OpenAI() response = client.chat.…

mikeldking updated 1 month ago

Arize-ai/openinference #1081

[multi-modal] scope out video / audio semantic conventions

axiomofjoy updated 3 weeks ago

rellfy/openai #42

Implement Multi-Modal Completions (Vision)

### Problem I would like to build a tool that submits text and images to the OpenAI endpoints, so that I can implement some content moderation. The Vision API is specified [here](https://platform.…

Twister915 updated 2 months ago

Yuliang-Liu/Monkey #154

关于minimonkey多图片的微调问题

![image](https://github.com/user-attachments/assets/73b7531d-c30e-4841-9b86-d8d8e2c97357) 开源的代码中只有multi_modal_get_item存在dynamic_preprocess2。 1.请问minimonkey支持多图片的微调么？ 2.如果想要改动代码进行多图微调的话，是否将multi…

AIaimuti updated 3 weeks ago

1000+ results for multi-modal

1000+ results
for multi-modal