-
While running a llama2 pretraining script with specific configurations, I encountered an illegal memory access error. The detailed error message is as follows:
```
[2023-08-09 07:56:18,503] [INFO]…
```
-
## Abstract
- Adaptive methods such as Adam, Adagrad, and RMSprop perform well in the initial portion of training, but have been found to generalize poorly compared to SGD in the later stages
- Propose SWATS, a sim…
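
The general idea of starting with an adaptive method and finishing with SGD can be sketched as below. This is only an illustration under stated assumptions: the toy objective, the learning rates, and the hard-coded `switch_step` are all hypothetical, and the fixed switch point is a placeholder rather than the switching rule the paper proposes (which is cut off in this excerpt).

```python
import math


def grad(w):
    # Gradient of a toy objective f(w) = (w - 3)^2 (assumed for illustration).
    return 2.0 * (w - 3.0)


def train(steps=200, switch_step=50, adam_lr=0.1, sgd_lr=0.1,
          beta1=0.9, beta2=0.999, eps=1e-8):
    w = 0.0
    m = 0.0  # Adam first-moment estimate
    v = 0.0  # Adam second-moment estimate
    for t in range(1, steps + 1):
        g = grad(w)
        if t <= switch_step:
            # Adam update with bias correction.
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g * g
            m_hat = m / (1 - beta1 ** t)
            v_hat = v / (1 - beta2 ** t)
            w -= adam_lr * m_hat / (math.sqrt(v_hat) + eps)
        else:
            # Plain SGD update after the (hard-coded) switch point.
            w -= sgd_lr * g
    return w


final_w = train()
print(final_w)
```

On this toy problem both phases head toward the minimizer at `w = 3`; the sketch says nothing about generalization, which is the claim the abstract actually makes.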
-
There is (I guess) a copy-paste bug in dnn.py
```
self.targets = tf.get_collection(tf.GraphKeys.TARGETS)
if len(self.inputs) == 0:
```
The second line should be `if len(self.targets) == 0:`
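
A minimal sketch of why the copy-pasted condition matters, using a plain dictionary as a hypothetical stand-in for the graph collections (`collections` and this `get_collection` helper are illustrative only, not the TensorFlow API):

```python
# Hypothetical stand-in for graph collections: inputs are present, targets are not.
collections = {"inputs": ["x"], "targets": []}


def get_collection(key):
    # Illustrative helper; returns a copy of the stored list.
    return list(collections.get(key, []))


inputs = get_collection("inputs")
targets = get_collection("targets")

# Copy-pasted condition: checks the wrong list, so the empty targets go unnoticed.
buggy_detects_empty_targets = len(inputs) == 0
# Corrected condition: checks the list that was just fetched.
fixed_detects_empty_targets = len(targets) == 0

print(buggy_detects_empty_targets, fixed_detects_empty_targets)
```

With non-empty inputs and empty targets, only the corrected check fires.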
**One …
-
Hello, I have a MobileNetV2 that I am trying to use for image classification via transfer learning, but it does not seem to work. Initially, I perform transfer learning on my model a…
-
**Environment:**
1. Framework: PyTorch 1.8.0
3. Horovod version: 0.21.3
4. MPI version:
5. CUDA version:
6. NCCL version:
7. Python version: 3.9
10. OS and version: Linux 5.11.10-arch1-1 #1 SMP…
-
## How does a PyTorch optimizer work?
https://mcneela.github.io/machine_learning/2019/09/03/Writing-Your-Own-Optimizers-In-Pytorch.html
![image](https://user-images.githubusercontent.com/67103130/185775922…
jl749 updated
4 months ago
-
I'm using PyTorch `0.4.0a0+792daeb` and encountered the following attribute error:
```
Traceback (most recent call last):
  File "capsule_network.py", line 267, in <module>
    engine.train(processor, get_iterato…
```
-
**Describe the bug**
I am trying to fine-tune the `llama3-8B` model on multiple nodes, but I get an AttributeError after the mcore-format checkpoint finishes loading and dataset building starts. The error is below:
…
-
Hi, I want to train Mask R-CNN with RLE loss!
1. Could you give me some details about the optimizer and learning rate scheduler used in your experiments? I found that RLE uses Adam, while the original m…
-
Here is my code:
```
from tf_unet import unet, util, image_util

# prepare data loading
search_path = 'data/train/*.tif'
data_provider = image_util.ImageDataProvider(search_path)

# setup & train…
```