-
### Title of the resource
Text to Video Prompt Engineering Intensive
### Resource type
None
### Authors, editors and contributors
Emily Genatowski
### Topics (keywords)
AI, Large Language Model…
-
# Papers
- Sapiens: Foundation for Human Vision Models
- A human foundation model from Meta (wow)
- 2D pose estimation, body-part segmentation, depth prediction, and normal prediction in a single model …
-
Model:
- ModelScope: https://www.modelscope.cn/models/iic/mPLUG-Owl3-7B-240728
- Huggingface: https://huggingface.co/mPLUG/mPLUG-Owl3-7B-240728
Usually, fine-tuning a multimodal large model invol…
-
### Project Name
educAIte
### Description
## Project Overview
EducAIte is a web application designed to simplify text extraction and document interaction, specifically for educational purposes. By…
-
**Is your feature request related to a problem? Please describe.**
In today's interconnected world, language barriers significantly restrict access to global information. Users often miss out on va…
-
# Keywords
RoBERTa, Language model, Domain-adaptive pretraining, Task-adaptive pretraining
# TL;DR
Multiphase adaptive pretraining with domain and task corpus offers large gains in task performance…
-
#WIP
## Benchmark with [faster-whisper-large-v3-turbo-ct2](https://huggingface.co/deepdml/faster-whisper-large-v3-turbo-ct2)
For reference, here's the time and memory usage that are required to tr…
-
### Project Name
PhotoRAG
### Description
PhotoRAG is a full-stack Next.js image search application that leverages Azure AI and infrastructure to implement a Retrieval-Augmented Generation (RAG) sys…
-
There have been many discussions in the community regarding support for multiple models.
- ChatGPTNextWeb#3484
- ChatGPTNextWeb#3923
- ChatGPTNextWeb#960
- ChatGPTNextWeb#3431
- ChatGPTNextWeb#…
-
"In practice, to save GPU memory, we do not load all Encoders directly onto the GPU but instead load the extracted features"
Does this mean we don't need the modality encoder, since we already have the llama inp…