-
I used 80 tasks from the file task_info_selected.csv in [this repository](https://github.com/ekinakyurek/marc/blob/main/task_info_selected.csv), and fine-tuned Meta-Llama-3-8B-Instruct using the train…
-
prepare_buckets_latents.py does not seem to work with Flux. Is there any way to generate this file for a full Flux fine-tune? Thanks
-
This might be a silly question, but when using the Llama3.1 base model I can effortlessly pass in tools when running it in Ollama.
```
response = ollama.chat(
model='llama3.1'…
-
**Describe**
I found that after fine-tuning with LoRA, token throughput is significantly reduced. I trained a model on unit test generation and then fused the LoRA adapter.
For my test dat…
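For context, "fusing" a LoRA adapter folds the low-rank update into the base weight, so a fused model's forward pass should cost the same as the base model's. A minimal numpy sketch of the merge (shapes and values are illustrative, not taken from this issue):

```
import numpy as np

# LoRA merge: W_merged = W + (alpha / r) * B @ A
d_out, d_in, r, alpha = 6, 4, 2, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # LoRA "down" projection
B = np.zeros((d_out, r))             # LoRA "up" projection (initialized to 0)
B[:, 0] = 1.0                        # stand-in for trained values

W_merged = W + (alpha / r) * (B @ A)

# After fusing, one matmul replaces the base matmul plus the two adapter
# matmuls, so throughput should match the base model; the merged weight
# produces the same outputs as base-plus-adapter.
x = rng.normal(size=(d_in,))
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

If throughput still drops after fusing, the serving path (e.g. the adapter still being applied at runtime) is worth inspecting.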
-
LoRA fine-tuning worked normally, but the following issue arose during full fine-tuning.
I use the following script for full fine-tuning:
```shell
#!/bin/bash
N…
-
Thank you for the great work!
I have a few questions about PEFT; I hope you can answer them. Thank you!
1. Which model is best to fine-tune from? A pre-trained model (llama2-7b) or a supervise…
zsxzs updated 1 month ago
-
Hi,
I am trying to fine-tune a Llama model with a large context size, and I found that to efficiently shard activations across multiple GPUs, I need to use Torchtitan. Here are some questions relat…
-
Hi everyone! First of all, thank you for the amazing work on sktime—it's an incredibly useful library.
I have a question regarding the ForecastGridSearch implementation. Specifically, I'm unsure fr…
-
So when I fine-tuned Llama 3, my configuration file looked like:
```
# Tokenizer
tokenizer:
_component_: torchtune.models.llama3.llama3_tokenizer
path: ~/meta-llama/Meta-Llama-3-8B-In…
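```

The config above is truncated. For context, torchtune configs of this kind typically continue with `model`, `dataset`, and `optimizer` sections; the fragment below is an illustrative sketch modeled on torchtune's stock Llama 3 recipes, not the poster's actual file (component paths and values are assumptions):

```
# Model
model:
  _component_: torchtune.models.llama3.llama3_8b

# Dataset and optimizer (values are placeholders)
dataset:
  _component_: torchtune.datasets.alpaca_dataset
optimizer:
  _component_: torch.optim.AdamW
  lr: 2e-5
```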
-
See section 4.3.1. There could be more than one instantiation of such a model.