-
🎉 Fine-tuning (VQA/OCR/Grounding/Video) of the Qwen2-VL-Chat series models is now supported; please check the documentation below for details:
# English
https://github.com/modelscope/ms-swift/blob/m…
-
Is there a plan to incorporate image embeddings along with OCR and metadata-based retrieval? Utilizing the CLIP model from Candle to generate image embeddings could provide clearer context and improve…
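A minimal sketch of what embedding-plus-metadata retrieval could look like, assuming image embeddings have already been computed (e.g., by a CLIP model such as the one in Candle). The `retrieve` function, the document fields, and the embedding vectors are all hypothetical, shown only to illustrate combining a metadata filter with cosine-similarity ranking:

```python
import numpy as np

def retrieve(query_vec, docs, metadata_filter=None, top_k=3):
    """Rank documents by cosine similarity of their image embeddings,
    optionally restricted to documents matching a metadata predicate.
    (Hypothetical sketch: embeddings are assumed precomputed.)"""
    # AND a metadata predicate (e.g., from OCR text or file metadata)
    # with embedding similarity.
    candidates = [d for d in docs
                  if metadata_filter is None or metadata_filter(d["meta"])]
    q = query_vec / np.linalg.norm(query_vec)
    scored = []
    for d in candidates:
        v = d["embedding"] / np.linalg.norm(d["embedding"])
        scored.append((float(q @ v), d["id"]))
    scored.sort(reverse=True)  # highest cosine similarity first
    return [doc_id for _, doc_id in scored[:top_k]]
```

In practice the metadata predicate would run against an index rather than a Python list, but the combination — filter first, then rank by embedding similarity — stays the same.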
-
There are multiple mentions of a multi-modal sequence-parallel system for inference that can be seamlessly integrated with HF transformers. However, I am not able to follow this through the codebase …
-
On the advanced search page, we render facets for the user to refine their search. These facet fields, unlike those on basic search, allow multi-select. When facets are limited (which they …
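The usual semantics for multi-select facets can be sketched as follows: values chosen within one facet field are OR'ed together, while different facet fields are AND'ed. The `apply_facets` helper and the record shape are hypothetical, not part of the search system described above:

```python
def apply_facets(records, selections):
    """Filter records given facet selections.

    `selections` maps a facet field name to the set of values the user
    picked: values within one field are OR'ed (record matches any of
    them), and separate fields are AND'ed (record must match every field).
    (Hypothetical sketch of standard multi-select facet semantics.)"""
    def matches(rec):
        return all(rec.get(field) in values
                   for field, values in selections.items())
    return [r for r in records if matches(r)]
```

A real implementation would push this into the search engine's filter query rather than post-filter in application code, but the OR-within / AND-across rule is the part users notice.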
-
- Extension version: v0.99.2024101604 (pre-release)
- VSCode Version: 1.94.2
- OS: Windows 10
- Repository Clone Configuration (single repository/fork of an upstream repository): Multi-repo, multi…
-
GPT models are now multi-modal, so it would be nice if the CAD file had a spot for a camera that could be connected. The same goes for the microphone.
-
Providing an interface to query the raw audio samples is a very useful feature for multi-modal research.
The topic has been discussed before, with feature support added here: https://github.com/Fa…
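As a minimal sketch of what "querying the raw audio samples" can mean, the Python standard library alone can expose PCM samples from a WAV file. The `read_samples` function is illustrative and unrelated to the (elided) linked project; for multi-channel audio the returned list is still interleaved and would need further de-interleaving:

```python
import struct
import wave

def read_samples(path_or_file):
    """Return the raw PCM samples of a WAV file as a flat list of ints.
    Channels remain interleaved. (Illustrative stdlib-only sketch;
    8-bit WAV PCM is unsigned, wider widths are signed little-endian.)"""
    with wave.open(path_or_file, "rb") as wf:
        width = wf.getsampwidth()
        raw = wf.readframes(wf.getnframes())
    fmt = {1: "B", 2: "h", 4: "i"}[width]   # sample width -> struct code
    count = len(raw) // width
    return list(struct.unpack("<%d%s" % (count, fmt), raw))
```

For research use one would typically return a NumPy array (and resample/normalize), but the point is the same: the API hands back samples, not an opaque decoded stream.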
-
### Discussion
The Llava repository itself already provides excellent fine-tuning scripts.
The ms-swift multi-modal LLM fine-tuning framework integrates Llava inference and fine-tuning, and includes a best-practices guide: https://github.com/modelscope/swift/blob/main/docs/source/Multi-Modal/llava%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.…
-
### What is the feature?
### Description
The current implementation of `BaseModel` in mmengine assumes a single `inputs` parameter of type `torch.Tensor` in the `forward` method:
```python
def…
```
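One way the requested generalization could look, sketched here without torch and outside the real mmengine API: let `forward` accept either a single input or a dict mapping modality names to inputs, and normalize to the dict form internally. The class, method names, and the `"img"` default modality are all hypothetical:

```python
from typing import Any, Dict, Union

class MultiModalModel:
    """Sketch of relaxing a single-tensor `inputs` contract so `forward`
    also accepts a dict of per-modality inputs. (Hypothetical; not the
    actual mmengine `BaseModel` interface.)"""

    def forward(self, inputs: Union[Any, Dict[str, Any]], mode: str = "tensor"):
        # Normalize: wrap a bare (single-modality) input as the default
        # "img" modality so downstream code handles one shape only.
        if not isinstance(inputs, dict):
            inputs = {"img": inputs}
        return {name: self.extract(name, x) for name, x in inputs.items()}

    def extract(self, modality: str, x: Any):
        # Placeholder for a per-modality feature extractor / backbone.
        return ("feat", modality, x)
```

Keeping the bare-input path backwards compatible means existing single-modality callers keep working while multi-modal callers pass `{"img": ..., "text": ...}`.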