-
### 🚀 Feature
Unless I'm mistaken, stable-baselines3 only supports Box, Discrete, MultiDiscrete, MultiBinary and Dict spaces from gymnasium.
It seems like a new _fundamental space_ has been introd…
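For context, here is a quick sketch of the space types SB3 accepts today, all from `gymnasium.spaces`; the concrete values are illustrative:

```python
from gymnasium import spaces

# The observation/action space types stable-baselines3 currently handles
box = spaces.Box(low=-1.0, high=1.0, shape=(3,))   # continuous vectors
disc = spaces.Discrete(4)                          # single categorical
multi_disc = spaces.MultiDiscrete([3, 2])          # several categoricals
multi_bin = spaces.MultiBinary(5)                  # bit vectors
nested = spaces.Dict({"obs": box, "mode": disc})   # dict of the above
```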
-
Hi Jason,
I followed these steps:
Step 1 - Supervised Fine-tuning: this generates "/checkpoints/supervised_llama/", containing the folders:
```
checkpoint-2000
checkpoint-3000
checkpoint-4000
final_checkp…
```
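For reference, a minimal sketch of loading one of the listed checkpoints for the next stage, assuming the standard Trainer layout shown above (`checkpoint-4000` is just one of the folders listed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a specific Trainer checkpoint from the supervised fine-tuning run
ckpt = "/checkpoints/supervised_llama/checkpoint-4000"
model = AutoModelForCausalLM.from_pretrained(ckpt)
tokenizer = AutoTokenizer.from_pretrained(ckpt)
```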
-
Hi there,
Thanks for the scripts and posts! I am interested in fine-tuning Mixtral 8x7b on SageMaker. My task requires a context length of around 8k tokens.
I have tried running training following th…
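For concreteness, a minimal sketch of the kind of SageMaker estimator setup this involves; the entry point, instance type, role, and hyperparameter names are assumptions, not the exact script from the post:

```python
from sagemaker.huggingface import HuggingFace

# Hypothetical estimator config; adjust versions and script to your setup
huggingface_estimator = HuggingFace(
    entry_point="run_qlora.py",           # hypothetical training script
    source_dir="./scripts",
    instance_type="ml.p4d.24xlarge",      # 8x A100 40GB
    instance_count=1,
    role="arn:aws:iam::<account>:role/<sagemaker-role>",  # placeholder
    transformers_version="4.36",
    pytorch_version="2.1",
    py_version="py310",
    hyperparameters={
        "model_id": "mistralai/Mixtral-8x7B-v0.1",
        "max_seq_length": 8192,           # the ~8k context the task needs
    },
)
```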
-
### Problem Description
Llama3 8B FP8 OOMs at the same batch size as BF16; I need to decrease the batch size to `2` to avoid the OOM. At batch size 2, TE FP8 is **21% slower** than torch compile B…
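For context, a minimal sketch of the usual TransformerEngine FP8 setup; the toy module and recipe values are illustrative, not the exact repro config:

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

# Toy TE module; the real case is Llama3 8B, this only shows the FP8 entry point
model = te.Linear(768, 768, params_dtype=torch.bfloat16).cuda()
inp = torch.randn(16, 768, device="cuda", dtype=torch.bfloat16)

# Illustrative FP8 recipe: hybrid E4M3/E5M2 with delayed scaling
fp8_recipe = DelayedScaling(
    fp8_format=Format.HYBRID,
    amax_history_len=16,
    amax_compute_algo="max",
)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)
```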
-
Hello, I would like to ask: can the full training of FinGPT be done on Colab with an A100? How long would it take?
-
**Describe the bug**
I've really been enjoying simple-parsing in my projects. It looks like you are trying to maintain compatibility with Hugging Face's dataclass arguments (#172). One use case I've been tryi…
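For reference, a minimal sketch of the dataclass-driven CLI pattern in question, analogous to Hugging Face's `HfArgumentParser`; the fields are hypothetical:

```python
from dataclasses import dataclass
import simple_parsing

@dataclass
class TrainArgs:
    learning_rate: float = 3e-4  # illustrative fields only
    batch_size: int = 32

# simple-parsing builds the argparse CLI from the dataclass definition
args = simple_parsing.parse(TrainArgs)
print(args.learning_rate, args.batch_size)
```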
-
This is amazing work. I have been working on something that requires me to evaluate the generated outputs of models like Mistral, using a prompt like:
`"Fill the [MASK] token in the sentence.…
-
We need more compute & storage than is available to us individually via local GPUs to train the MVM outlined in our Roadmap.
To engage with any potential compute & data storage providers who may be i…
-
### Problem
We want to add support for this new model, which, unlike the previous ones, also supports vision. The model's readme is reproduced below:
```yaml
---
language:
- en
- de
- fr
- it
- pt…
```
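A sketch of what loading such a vision-capable model could look like with transformers; the repo id and the Auto classes are assumptions until the actual integration lands:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Hypothetical repo id; a typical loading pattern for vision-language models
model_id = "org/new-vision-model"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

inputs = processor(
    images=Image.open("example.jpg"),
    text="Describe the image.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=32)
print(processor.batch_decode(outputs, skip_special_tokens=True))
```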
-
### Is your feature request related to a problem? Please describe.
I saw that 4-turbo is currently the default model; the issue is that it's outdated and expensive, which may lead to a negative first …
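For illustration, switching the default would just change the model string passed to the API; "gpt-4o" here is an example, not a recommendation from the project:

```python
from openai import OpenAI

client = OpenAI()
# Example of what a newer default could look like
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```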