-
Sorry for bothering you, and this may be a dumb question:
What is the Complex type here used for?
I'm not very good at math, so it would be great if you could explain why complex numbers are needed.
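For context while waiting for the maintainers: in rotary/retention-style position encodings, a complex number is just a compact way to write a 2-D rotation, so multiplying feature pairs by exp(i·n·θ) ties a rotation angle to the token position n, and query-key products then depend only on relative distance. A minimal PyTorch sketch (illustrative only, not the repo's actual code):

```python
import torch

# Toy rotary-style relative position encoding using complex numbers.
# Each consecutive feature pair is viewed as one complex number and
# rotated by exp(i * n * theta), where n is the token position.

def rotate(x: torch.Tensor, positions: torch.Tensor, theta: float) -> torch.Tensor:
    """x: (seq_len, dim) with even dim; positions: (seq_len,) token indices."""
    # View feature pairs as complex numbers: (seq_len, dim // 2)
    xc = torch.view_as_complex(x.float().reshape(x.shape[0], -1, 2))
    # Unit-magnitude phase exp(i * n * theta) per position n
    phase = torch.polar(torch.ones_like(positions, dtype=torch.float32),
                        positions.float() * theta)
    rotated = xc * phase.unsqueeze(-1)
    return torch.view_as_real(rotated).reshape_as(x)

q, k = torch.randn(8, 16), torch.randn(8, 16)
pos = torch.arange(8)
q_rot, k_rot = rotate(q, pos, theta=0.1), rotate(k, pos, theta=0.1)
# (q_rot @ k_rot.T)[i, j] depends on the relative offset i - j,
# which is why the complex representation shows up in the code.
print((q_rot @ k_rot.T).shape)
```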
-
Hello :smile:
The BEiT-3 paper mentions that vision-language experts are employed in the top three Multiway Transformer layers. However, taking a look at the MultiwayNetwork implementation, I find …
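For anyone else reading along, my understanding of the idea (a conceptual sketch only, not the repo's MultiwayNetwork, which wraps arbitrary modules and adds the vision-language expert in the top layers) is that a Multiway layer keeps separate expert FFNs and routes tokens by modality:

```python
import torch
from torch import nn

class ToyMultiwayFFN(nn.Module):
    """Conceptual sketch of a modality-routed feed-forward block.

    Tokens before `split_position` are treated as vision tokens and go
    through the vision expert; the remaining tokens go through the
    language expert. The real implementation is more general than this.
    """

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        make_ffn = lambda: nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        self.vision_expert = make_ffn()
        self.language_expert = make_ffn()

    def forward(self, x: torch.Tensor, split_position: int) -> torch.Tensor:
        # x: (batch, seq_len, dim); split_position marks the modality boundary.
        vision_out = self.vision_expert(x[:, :split_position])
        language_out = self.language_expert(x[:, split_position:])
        return torch.cat([vision_out, language_out], dim=1)

x = torch.randn(2, 10, 32)
layer = ToyMultiwayFFN(dim=32, hidden=64)
print(layer(x, split_position=6).shape)  # torch.Size([2, 10, 32])
```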
-
I would like to use this repo for my job. I cannot do so until you add a license to the repo. Can you please do so soon?
-
Hi,
Thank you for your great work!
When I use your example code to compare inference latency with a Transformer-based LLM, the result does not match what the paper reports (15.6X). Could you please …
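In case it helps to narrow this down, here is the minimal latency-measurement sketch I use for apples-to-apples comparisons (the model below is a placeholder, not the repo's example code):

```python
import time
import torch

@torch.no_grad()
def time_forward(model, inputs, warmup: int = 5, iters: int = 20) -> float:
    """Average forward-pass latency in milliseconds."""
    for _ in range(warmup):            # warm up kernels and caches
        model(inputs)
    if torch.cuda.is_available():
        torch.cuda.synchronize()       # wait for queued kernels to finish
    start = time.perf_counter()
    for _ in range(iters):
        model(inputs)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1000.0

# Placeholder model; swap in the two models being compared.
model = torch.nn.Linear(1024, 1024)
x = torch.randn(32, 1024)
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
print(f"{time_forward(model, x):.2f} ms / forward")
```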
-
When I open the link https://publicmodel.blob.core.windows.net/torchscale/vocab/dict.txt, I get the following error page:
"This XML file does not appear to have any style information associated with it. The document tree is shown below."
Pu…
-
Hi, I plan to reproduce the results of the WMT-17 translation task as presented in the DeepNet paper. Could you please let me know what the command for running the script shou…
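While waiting for the exact command, the model side can at least be instantiated directly from torchscale; a minimal sketch following the README-style config API (the hyperparameters below are placeholders, not the WMT-17 recipe):

```python
from torchscale.architecture.config import EncoderDecoderConfig
from torchscale.architecture.encoder_decoder import EncoderDecoder

# DeepNet-style encoder-decoder: deepnorm=True enables the DeepNorm
# residual scaling / initialization from the DeepNet paper.
# vocab_size and everything else here are placeholders, not the
# actual WMT-17 training configuration.
config = EncoderDecoderConfig(vocab_size=64000, deepnorm=True)
model = EncoderDecoder(config)
print(sum(p.numel() for p in model.parameters()), "parameters")
```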
-
Hey kyegomez,
I'm interested in trying out the implementation.
Is it already possible to use a base model for this?
-
**Describe**
If I only want to use one of the models in the repo, I have to download the whole repo.
But this is not necessary.
It is difficult to download the whole repo quickly in a short period…
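Until per-model downloads are supported, a workaround sketch that fetches a single checkpoint file over HTTP (the URL below is a placeholder; substitute the direct raw link of the one file you need):

```python
import urllib.request

# Placeholder URL: replace with the direct (raw) link to the single
# checkpoint or config file you need instead of cloning the whole repo.
url = "https://example.com/path/to/single_model_checkpoint.pt"
out_path = "single_model_checkpoint.pt"

with urllib.request.urlopen(url) as resp, open(out_path, "wb") as f:
    f.write(resp.read())
print(f"saved {out_path}")
```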
-
Hi, when I use RetNet's parallel mode to train, it is very slow. I checked the GPU memory usage and it is very small. What's going on?
Thank you!
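In case it helps with triage, a small sketch for confirming that both the model and the batch are actually on the GPU and for timing one training step (generic PyTorch, not tied to the repo's RetNet code):

```python
import time
import torch

def inspect_step(model: torch.nn.Module, batch: torch.Tensor, targets: torch.Tensor) -> None:
    """Print device placement, peak GPU memory, and one training step's wall-clock time."""
    print("param device:", next(model.parameters()).device)
    print("batch device:", batch.device)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    loss = torch.nn.functional.mse_loss(model(batch), targets)
    loss.backward()
    opt.step()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
        print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB")
    print(f"step time: {time.perf_counter() - start:.3f} s")

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(512, 512).to(device)   # placeholder for the RetNet model
x = torch.randn(64, 512, device=device)
y = torch.randn(64, 512, device=device)
inspect_step(model, x, y)
```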
-
I've identified a problem in the `EncoderDecoderConfig` class within the `architecture` module of the `torchscale` package.
The `EncoderDecoderConfig` class currently does not contain the `normali…
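Until the attribute is added upstream, a defensive sketch of how consuming code can guard against the missing config field (the name `missing_flag` below is a placeholder, since the real attribute name is truncated above):

```python
from torchscale.architecture.config import EncoderDecoderConfig

config = EncoderDecoderConfig(vocab_size=64000)

# Placeholder name: the attribute reported missing is truncated above.
# getattr with a default keeps downstream code working until the field
# is added to EncoderDecoderConfig itself.
missing_flag = getattr(config, "missing_flag", False)
print("missing_flag:", missing_flag)
```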