-
Hi! Congrats on the clean RetNet code. I'm experimenting with the torchscale codebase and happened to find your repo with a link to a checkpoint on HF.
I noticed that it is now gone; do you have plans t…
-
Curious where to begin researching unlimited context length? Any direction is appreciated.
-
Thank you for sharing the implementation of this attractive work!
When training DiJiang with long inputs (>5000), the outputs were NaN. This was due to an overflow, as D2 was defined as -n powers of…
-
Thank you for the great implementation!
torch.compile can be enabled by adding the "@torch.compile" decorator just before every forward() function in modeling_retnet.py.
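As a minimal sketch of that decorator placement (a toy module, not modeling_retnet.py; `backend="eager"` is an assumption here so the example runs without a compiler toolchain, and would be dropped in real training code):

```python
# Toy sketch of decorating forward() with torch.compile (assumes PyTorch >= 2.0).
# backend="eager" skips code generation so the example runs anywhere torch is
# installed; omit it to use the default inductor backend.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    @torch.compile(backend="eager")
    def forward(self, x):
        return torch.relu(self.linear(x))

net = TinyNet()
out = net(torch.randn(2, 4))
print(tuple(out.shape))  # (2, 4)
```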
-
In retnet-3b/config.json, following the experimental settings of the paper
(https://arxiv.org/pdf/2307.08621.pdf), decoder_ffn_embed_dim and decoder_value_embed_dim should be set to twice the size of decode…
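For illustration only (the numbers below are hypothetical; the field names follow the torchscale-style config used in this repo), the intended relationship would look like:

```json
{
  "decoder_embed_dim": 2560,
  "decoder_ffn_embed_dim": 5120,
  "decoder_value_embed_dim": 5120
}
```

i.e. both the FFN and value dimensions are 2x decoder_embed_dim.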
-
Thank you for your great work!
I've noticed that your decoder_retention_heads is set to 3 by default, and the mask is also expanded to three dimensions to match. Have you experimented with the per…
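A minimal sketch of what a per-head decay mask looks like (pure Python for clarity; the gamma schedule 1 - 2^(-5-h) is the one given in the RetNet paper, the rest is illustrative):

```python
# Build a [num_heads, seq_len, seq_len] causal decay mask, one decay rate
# gamma per head, as described in the RetNet paper.
def decay_mask(num_heads, seq_len):
    gammas = [1 - 2 ** (-5 - h) for h in range(num_heads)]  # paper's schedule
    mask = []
    for g in gammas:
        # Lower-triangular: position i attends to j <= i with weight g^(i-j).
        head = [[g ** (i - j) if i >= j else 0.0 for j in range(seq_len)]
                for i in range(seq_len)]
        mask.append(head)
    return mask

m = decay_mask(3, 4)
print(len(m), len(m[0]), len(m[0][0]))  # 3 4 4
```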
-
In `train_model.py`, checkpoints saved during the same epoch will likely overwrite one another. It looks like line 137 can change this behavior. ` filename="epoch_{epoch}_validatio…
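A sketch of the underlying fix, assuming the filename template is the collision point (the template below is illustrative; with Lightning's ModelCheckpoint the equivalent is including `{step}` in the `filename` argument):

```python
# Including the global step in the checkpoint filename makes two checkpoints
# from the same epoch get distinct names (names here are illustrative).
template = "epoch_{epoch}_step_{step}_validation_{val_loss:.3f}.ckpt"

a = template.format(epoch=3, step=1200, val_loss=0.4812)
b = template.format(epoch=3, step=1400, val_loss=0.4812)
print(a)  # epoch_3_step_1200_validation_0.481.ckpt
assert a != b  # same epoch, different step -> no overwrite
```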
-
- [x] Fails when Vignettes=TRUE
- [x] update roxygen2
- [x] remove warnings
- [x] remove commented code in geom ret
- [x] convert names to snakecase
- [x] In read_beast_retnet, move helper functi…
-
The [primary codepath](https://github.com/JoshVarty/pytorch-retinanet/issues/1) starts a number of threads that load images from disk in minibatches.
The minibatch loader codepath is much smaller,…
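As a toy sketch of that pattern (not the repo's actual loader): a producer thread fills a bounded queue with minibatches while the main thread consumes them.

```python
# Background-thread minibatch loading via a bounded queue.
import queue
import threading

def producer(batches, out_q):
    for batch in batches:
        out_q.put(batch)   # blocks when the queue is full (backpressure)
    out_q.put(None)        # sentinel: no more batches

batches = [[i, i + 1] for i in range(0, 6, 2)]  # fake "minibatches"
q = queue.Queue(maxsize=2)
t = threading.Thread(target=producer, args=(batches, q), daemon=True)
t.start()

consumed = []
while (batch := q.get()) is not None:
    consumed.append(batch)
t.join()
print(consumed)  # [[0, 1], [2, 3], [4, 5]]
```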
-
Hi there. I want to understand how to use RetNet to train a model with a longer context. It is not clear from the available documentation how to train the model for a large context. There is no para…
pkpro updated 11 months ago