-
I ran the example program and got the following error.
```
import torch
from long_net.model import LongNetTransformer

longnet = LongNetTransformer(
    num_tokens=20000,
    dim=512,
    dep…
```
-
We need to log the same values consistently across all model types. For the LongNet architecture, for example, [we only log Validation loss](https://github.com/DRAGNLabs/301r_retnet/blob/c934c…
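One way to enforce this is a small shared helper that every training loop calls, so a missing metric fails loudly instead of silently producing inconsistent logs. This is a minimal sketch; the metric names and the list-based logger here are assumptions for illustration, not the repo's actual logging setup:

```python
# Metrics every model type must report at each logging step.
# These names are illustrative assumptions, not the repo's real metric set.
REQUIRED_METRICS = ("train_loss", "val_loss", "perplexity")

def log_metrics(logger, step, metrics):
    """Record metrics, refusing to log an incomplete set."""
    missing = [m for m in REQUIRED_METRICS if m not in metrics]
    if missing:
        raise ValueError(f"missing metrics at step {step}: {missing}")
    for name, value in metrics.items():
        logger.append((step, name, float(value)))

# Usage: any backend with an append-like interface works in place of a list.
log = []
log_metrics(log, 100, {"train_loss": 2.1, "val_loss": 2.4, "perplexity": 11.0})
```

Centralizing the required-metric check means a new architecture cannot quietly ship with only validation loss logged.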
-
Commit a9dc6c7 from #63 breaks eval_suite.py.
It introduces the following error:
```
ValueError: The config you are passing has a `model_type` attribute that is not consistent with the model type…
```
-
```
(venv) personalinfo@MacBook-Pro-3 LongNet % python3 train.py
2024-03-05 23:56:10,524 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
2024-03-05 23:56:17.908409: I tensorflow/core/platform/…
```
-
```
torchrun --nproc_per_node=8 --nnodes=1 train.py ../../../fairseq/data-bin/wikitext-103/ --num-workers 0 --activation-fn gelu --share-decoder-input-output-embed --validate-interval-updates 1000 --save-…
```
-
Hello Frank!
I love what you have created, and am having a great time parsing through your implementation of the paper. It appears you have nailed the dilated attention calculatio…
-
https://arxiv.org/abs/2307.02486
This paper on scaling to a 1-billion-token context length, taken together with this work, seems like it would advance the pursuit of infinite context length. Also, FoT feels similar to L2P learn to…
-
I ran train.py and got the error below.
```
Traceback (most recent call last):
  File "/public/home/wangycgroup/public/02_Data/Internal/phage/train.py", line 86, in
    loss = model(next(train_loader))…
```
-
Hi, thanks for the great work! When fine-tuning GigaPath on our own dataset, it seems like the run is killed when the model is being set up during the train() call in training.py.
```
# set up the m…
```
-
Hi,
The `ClassificationHead` class in `classification_head.py` tries to freeze the parameters of `self.longnet.named_parameters()`, which does not exist. Instead, it should be changed to `self.slide_enco…
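The underlying pattern is freezing the parameters of whichever encoder attribute the head actually owns. A minimal sketch follows; the `encoder` and `fc` attribute names are illustrative, not GigaPath's real ones:

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Linear head over a frozen encoder (attribute names are illustrative)."""

    def __init__(self, encoder: nn.Module, dim: int, n_classes: int):
        super().__init__()
        self.encoder = encoder
        # Freeze every parameter the encoder exposes via named_parameters();
        # freezing a non-existent attribute would raise AttributeError instead.
        for _, param in self.encoder.named_parameters():
            param.requires_grad = False
        self.fc = nn.Linear(dim, n_classes)

    def forward(self, x):
        return self.fc(self.encoder(x))

# Usage: only the head's own layer stays trainable.
head = ClassificationHead(nn.Linear(8, 8), dim=8, n_classes=2)
frozen = all(not p.requires_grad for p in head.encoder.parameters())
```

The key point is that the frozen module must be referenced through the attribute the class actually defines, which is exactly what the bug above gets wrong.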