-
- layout option doesn't work (remove it?)
- retcol option doesn't work; this should be fixed
- arrowheads obtained with the arrow=T option look odd (this is not so important, though)
- r…
-
Hello! Do you plan to add Mamba 2 to your repo? If so, any estimate on when we can expect it?
-
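There is something I have been unsure about for a while. RetNet states that its MSA has an RNN-like property, so that the output is identical whether it is computed in parallel or serially; this comes from how the D matrix is constructed. After RMT's modification, does MaSA still have a similar property? If not, is the only difference between MaSA and the Transformer's multi-head attention the mask with a different decay for each head?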
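For readers unfamiliar with the property the question refers to, here is a minimal pure-Python sketch of the parallel and recurrent forms of RetNet-style retention, assuming a single head, a scalar decay `gamma`, and a causal mask `D[i][j] = gamma**(i-j)` for `i >= j` (zero otherwise). The function names are illustrative and not taken from either repository; the sketch leaves out the rotation, scaling, and normalization used in the actual model, and it says nothing about whether MaSA keeps the property.

```python
def decay_mask(n, gamma):
    # Causal decay mask: D[i][j] = gamma**(i - j) when i >= j, else 0.
    return [[gamma ** (i - j) if i >= j else 0.0 for j in range(n)]
            for i in range(n)]


def retention_parallel(Q, K, V, gamma):
    # Parallel form: out = (Q K^T * D) V, with * applied elementwise.
    n, d_v = len(Q), len(V[0])
    D = decay_mask(n, gamma)
    out = []
    for i in range(n):
        row = [0.0] * d_v
        for j in range(n):
            score = sum(q * k for q, k in zip(Q[i], K[j])) * D[i][j]
            for c in range(d_v):
                row[c] += score * V[j][c]
        out.append(row)
    return out


def retention_recurrent(Q, K, V, gamma):
    # Recurrent form: S_t = gamma * S_{t-1} + k_t^T v_t, out_t = q_t S_t.
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    out = []
    for q, k, v in zip(Q, K, V):
        S = [[gamma * S[a][b] + k[a] * v[b] for b in range(d_v)]
             for a in range(d_k)]
        out.append([sum(q[a] * S[a][b] for a in range(d_k))
                    for b in range(d_v)])
    return out


def max_abs_diff(A, B):
    # Largest elementwise difference between two equally shaped matrices.
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

Calling both functions on the same `Q`, `K`, `V` and comparing with `max_abs_diff` should give a difference at floating-point noise level, because unrolling the recurrence reproduces exactly the `gamma**(i-j)` weights of the mask.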
-
I have started a training run with ImageNet weights to test the `training-schedule` branch. I will update this issue regularly to report the findings.
The training settings are:
`dataset` : COCO
`batch-si…
-
Hi, I see that RecurrentCache was renamed to Cache for the gla model. However, this raises an error because Cache does not have a `from_legacy_cache` method.
-
Thanks!
-
### What feature or new tool do you think should be added to DevToys?
Tools like Copilot, ChatGPT or BingChat are truly helpful from a developer perspective.
### Why do you think this is needed?
Ha…
-
### 🐛 Describe the bug
I have a Mac M1 GPU and I've been trying to replicate the results in [this google colab notebook](https://colab.research.google.com/drive/1_X7O2BkFLvqyCdZzDZvV2MB0aAvYALLC) on …
-
/runs simply holds TensorBoard data files for the given models. Moving /runs into each model's directory (and probably renaming it to something like /tensorboard) would be better.
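One possible way to carry out such a move is sketched below. It assumes a hypothetical layout in which `runs/` contains one subdirectory per model and each model has (or should get) its own directory under a models root; the function name and the `tensorboard` default are illustrative only and do not come from the repository.

```python
from pathlib import Path
import shutil


def move_runs_into_models(runs_root, models_root, subdir="tensorboard"):
    """Move runs_root/<model> to models_root/<model>/<subdir>.

    Assumes (hypothetically) one subdirectory per model under runs_root.
    Returns the list of destination paths that were created.
    """
    runs_root, models_root = Path(runs_root), Path(models_root)
    moved = []
    for run_dir in sorted(p for p in runs_root.iterdir() if p.is_dir()):
        target = models_root / run_dir.name / subdir
        if target.exists():
            # Leave existing data alone rather than overwrite or merge it.
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(run_dir), str(target))
        moved.append(target)
    return moved
```

Skipping targets that already exist keeps the sketch safe to re-run; merging event files from two locations would need a separate decision.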