-
Mamba model support has been merged into llama.cpp. Would it be possible to add it to Ollama as well?
Merged PR: https://github.com/ggerganov/llama.cpp/pull/5328
…
-
Search for different architecture choices and evaluate (theoretically) whether they'd be suitable for our task. Collect pro and con arguments. Keep the runtime (real time!) in mind.
example archite…
-
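I haven't seen anyone use RWKV or RetNet for OCR tasks. For longer texts, e.g. 2048 or 4096 tokens, decoding is expensive, but replacing the decoder with RWKV would be a very good solution in terms of length, cost, and speed. I searched for material on this and found no one doing it. I tried it myself but couldn't figure out the usage.
Would you be willing to publish a decoder tutorial or help me refactor my code? I believe RWKV could be a rising star in the OCR field.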
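
To make the decoding-cost argument concrete, here is a minimal pure-Python sketch of single-head retention in its parallel and recurrent forms. It assumes the simplified RetNet formulation with one scalar decay `gamma`, and it leaves out scaling, the rotary position terms, group normalization, and multi-scale heads. The function names are illustrative and do not come from any library. The recurrent form keeps a fixed-size state, so the cost per decoded token does not grow with the sequence length.

```python
def retention_parallel(Q, K, V, gamma):
    """Parallel form: out[n] = sum over m <= n of gamma**(n - m) * (Q[n] . K[m]) * V[m]."""
    d_v = len(V[0])
    out = []
    for n in range(len(Q)):
        o = [0.0] * d_v
        for m in range(n + 1):
            # Decayed similarity between query n and key m
            w = gamma ** (n - m) * sum(q * k for q, k in zip(Q[n], K[m]))
            for j in range(d_v):
                o[j] += w * V[m][j]
        out.append(o)
    return out


def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S <- gamma * S + outer(k, v); out = q @ S."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # state size is independent of sequence length
    out = []
    for q, k, v in zip(Q, K, V):
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] = gamma * S[i][j] + k[i] * v[j]
        out.append([sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)])
    return out


# Toy inputs: 3 time steps, key dimension 2, value dimension 2
Q = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
K = [[1.0, 2.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

parallel = retention_parallel(Q, K, V, gamma=0.9)
recurrent = retention_recurrent(Q, K, V, gamma=0.9)
```

Both functions compute the same outputs up to floating-point rounding. The parallel form suits training, while the recurrent form is the one that matters for decoding long OCR outputs.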
-
https://github.com/Jamie-Stirling/RetNet/blob/2acf026fc8435635051149d9bef793cae7f3d7af/src/retention.py#L104
You should change the device of these tensors in order to match the model device.
When …
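
A minimal sketch of the kind of fix suggested here, assuming PyTorch. `DecayMask` is a hypothetical module written for illustration and is not the code at the linked line. The idea is to create helper tensors on the device of the input (or of the model) instead of relying on the default CPU device.

```python
import torch
import torch.nn as nn


class DecayMask(nn.Module):
    """Hypothetical module: builds D[n, m] = gamma**(n - m) for n >= m, else 0."""

    def __init__(self, gamma: float):
        super().__init__()
        self.gamma = gamma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        seq_len = x.shape[1]
        # Create the index tensors on the same device as the input,
        # so the result never mixes CPU and GPU tensors.
        n = torch.arange(seq_len, device=x.device).unsqueeze(1)  # (seq_len, 1)
        m = torch.arange(seq_len, device=x.device).unsqueeze(0)  # (1, seq_len)
        exponent = (n - m).clamp(min=0).to(x.dtype)
        causal = (n >= m).to(x.dtype)
        return (self.gamma ** exponent) * causal


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.zeros(1, 3, 4, device=device)  # (batch, seq_len, hidden)
D = DecayMask(gamma=0.5)(x)
```

An alternative for tensors that do not depend on the input is `register_buffer`, which makes them follow the module when `.to(device)` is called.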
-
When I try to train a RetinaNet model (X-101-32x8d-FPN) on my own dataset, I get the error below.
AssertionError: Workspace blob retnet_cls_pred_fpn3_w with shape (72, 256, 3, 3) does not …
-
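Issue type: Other
**Problem description**
========================
**When I tried to run a parameter sensitivity analysis on a model trained with ppyolo, I ran into the following problem**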
![image](https://user-images.githubusercontent.com/66171021/125725520-dbc64028-443b-4b47-9ff4-089d06efa857.png)…
-
# AI Forum 2023 | Future of Foundation Models
Link: https://www.youtube.com/watch?v=f6m0MpbNicU&list=WL&index=15&t=9s&ab_channel=MicrosoftResearch
This presentation starts by raising questions a…
-
We need to be consistently logging the same values for all types of models. For the LongNet architecture, for example, [we only log Validation loss](https://github.com/DRAGNLabs/301r_retnet/blob/c934c…
-
In the RetNet model, `embed_tokens` is not provided, so I can't run the code. When I use this model, what should I pass for the `token_embeddings` parameter? Or how do I define `embed_tokens`?
-
Excellent! Could you write a demo example for training? Also, compared with the Microsoft code, have you checked that it gives the same number of parameters for the same settings?
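
One way to run the parameter-count check, sketched under the assumption that both implementations are PyTorch `nn.Module` objects. `count_parameters` is an illustrative helper, and the two `nn.Linear` layers below are stand-ins for the two models being compared, not the actual RetNet implementations.

```python
import torch.nn as nn


def count_parameters(model: nn.Module, trainable_only: bool = True) -> int:
    """Sum the element counts of a model's parameters.

    `model.parameters()` yields shared (tied) parameters only once,
    so tied weights are not double counted.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


# Stand-ins for the two implementations built with the same settings
model_a = nn.Linear(8, 16)              # weight 16x8 plus bias 16
model_b = nn.Linear(8, 16, bias=False)  # weight 16x8 only

count_a = count_parameters(model_a)
count_b = count_parameters(model_b)
same_size = count_a == count_b
```

If the totals differ, comparing `named_parameters()` entry by entry usually shows which layer accounts for the gap.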