-
Arrived here via instructions at
https://www.previousnext.com.au/blog/join-us-drupalgov-2020-code-sprint#set-up-a-development-environment--2
that use a gist that references `previousnext/php-apache:…
-
# Description
The first major difficulty in training an AI assistant is getting a dataset rich enough and big enough to start training at all.
ChatLLaMA needs three different types of da…
-
[issue]
The fine-tuning step doesn't increase the scores (it even decreases them).
Please refer to the green line in the chart below.
![image](https://user-images.githubusercontent.com/39104…
-
I know the repo's README mentions that this model apparently can't code because consecutive spaces get merged, and this has been discussed in #40.
However, I did some fine-tuning on …
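
For anyone who wants to reproduce the space-merging behaviour locally, a minimal round-trip sketch is below; the checkpoint path is a placeholder, and whether the indentation actually collapses depends on the tokenizer of the model discussed here.

```python
from transformers import AutoTokenizer

# Placeholder path; substitute the checkpoint of the model discussed in this issue.
tok = AutoTokenizer.from_pretrained("path/to/model")

code = "def add(a, b):\n    return a + b"
round_tripped = tok.decode(tok.encode(code), skip_special_tokens=True)

# If the tokenizer merges runs of whitespace, the four-space indent will not survive.
print(repr(round_tripped))
```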
-
### 📚 The doc issue
Currently, our model loading and saving involve both PyTorch and HF formats, and instructions for stage123 and inference need to be added to avoid misunderstanding and incorrect usage by users…
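
As a rough illustration of the two formats such instructions would need to cover, here is a minimal sketch using the Hugging Face `transformers` API; the paths are placeholders, and the actual stage123/inference scripts may wrap these calls differently.

```python
import torch
from transformers import AutoModelForCausalLM

# HF format: a directory containing config.json plus the weight shards.
model = AutoModelForCausalLM.from_pretrained("path/to/hf_checkpoint")
model.save_pretrained("out/hf_checkpoint")

# Plain PyTorch format: a single state_dict file.
torch.save(model.state_dict(), "out/pytorch_model.bin")
model.load_state_dict(torch.load("out/pytorch_model.bin", map_location="cpu"))
```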
-
### Contact Details
Here, on GitHub, preferably.
### Version
5.7.4
### Description
The build fails with error messages like the following on aarch64 Linux machines, which should compile wolfssl …
-
### Model description
Here is the model description:
> gte-Qwen1.5-7B-instruct is the latest addition to the gte embedding family. This model has been engineered starting from the [Qwen1.5-7B](https:…
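
If it helps with evaluating an integration, a minimal loading sketch is below; it assumes the Hugging Face repo id is `Alibaba-NLP/gte-Qwen1.5-7B-instruct` and that `trust_remote_code=True` is required, as embedding models shipping custom code usually need.

```python
from sentence_transformers import SentenceTransformer

# Assumed repo id; trust_remote_code is needed if the model ships custom modeling code.
model = SentenceTransformer("Alibaba-NLP/gte-Qwen1.5-7B-instruct", trust_remote_code=True)

embeddings = model.encode(["How do gte embeddings handle long documents?"])
print(embeddings.shape)
```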
-
`llm embed` has the following training script. I don't know how to adjust hyperparameters like `train_batch_size`, learning rate, `warmup_ratio`, ... (a sketch of where these usually live follows the command below).
```bash
torchrun --nproc_per_node=8 run_dense.py \
    --output_…
```
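
Assuming `run_dense.py` builds a standard Hugging Face `TrainingArguments` object (common for dense-embedding training scripts, but not confirmed here), the hyperparameters in question usually map to the fields sketched below; the field names come from the `transformers` API, not from the script itself.

```python
from transformers import TrainingArguments

# Sketch only: check run_dense.py's argument parser for the real flag names.
args = TrainingArguments(
    output_dir="out/dense",
    per_device_train_batch_size=8,   # often surfaced as --per_device_train_batch_size
    learning_rate=2e-5,              # --learning_rate
    warmup_ratio=0.1,                # --warmup_ratio
    num_train_epochs=3,              # --num_train_epochs
)
```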
-
Hi, I've been reading your paper, and I find it fascinating that, through grokking, a transformer model can reach high accuracy even on evaluation data. This is completely different from what I knew, namely that overfitting is bad. I …
-
### Question
I have two questions.
1. I followed the instructions in scripts/v1.5 to pre-train and fine-tune the model. After pre-training I get mm_projector.bin, and after fine-tuning I get adap…