-
Here is a piece of code from the file mergekit/mergekit/moe/qwen.py:
`for model_ref in (
[config.base_model]
+ [e.source_model for e in config.experts]
+ [e…
-
Hi,
I use an RTX 3090 and got this warning, which is not supposed to appear. When I use tiny-cuda-nn in another project, I get the warning "tinycudann was built for lower compute capability ({cc}) than the sy…
-
I am trying to deploy and run the demo on a cluster of 4 A6000 GPUs, but the runtime seems to freeze without raising any exceptions. What could be causing this? Sorry for asking a naive question and thanks for…
-
### System Info
Hardware: L20
Version: 0.11.0.dev20240625
Model: Bloom7b1
### Who can help?
@ncomly-nvidia @byshiue
I have obtained the Medusa head for Bloom according to the official M…
-
Our current architecture in sample-factory is just an MLP encoder; I suspect a permutation-invariant or GNN-based architecture would be better.
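To make the suggestion concrete, here is a minimal sketch of one permutation-invariant option: a Deep Sets style encoder that embeds each entity with a shared layer, mean-pools the embeddings, and then projects the pooled vector. Everything here is an assumption for illustration: the class name `DeepSetEncoder`, the layer sizes, and the pure-Python layers are hypothetical and are not part of sample-factory's encoder API.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply a dense layer to a single input vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


class DeepSetEncoder:
    """Hypothetical encoder: shared per-entity layer (phi), mean pool, projection (rho)."""

    def __init__(self, entity_dim, hidden_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.phi = make_linear(entity_dim, hidden_dim, rng)
        self.rho = make_linear(hidden_dim, out_dim, rng)

    def encode(self, entities):
        if not entities:
            raise ValueError("expected at least one entity")
        # The same phi is applied to every entity, so ordering carries no information.
        embedded = [relu(linear(self.phi, e)) for e in entities]
        # Mean pooling is symmetric in its arguments, which gives permutation invariance.
        pooled = [sum(column) / len(embedded) for column in zip(*embedded)]
        return linear(self.rho, pooled)


# Usage: the encoding should not depend on the order of the entities.
encoder = DeepSetEncoder(entity_dim=3, hidden_dim=8, out_dim=4)
entities = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0], [-2.0, 0.0, 1.0]]
forward = encoder.encode(entities)
backward = encoder.encode(list(reversed(entities)))
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(forward, backward))
print("order-independent:", same)
```

Mean pooling also lets the encoder accept a variable number of entities, which a fixed-width MLP over concatenated observations cannot do. A GNN-based variant would replace the pooling step with message passing between entities.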
-
## Description
The KAN (Kolmogorov-Arnold Network) model from the pykan library currently supports only two-dimensional input tensors (batch_size x hid_dim). A `RuntimeError` is raised when att…
-
If you open a GitHub issue, here is our policy:
It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead).
The form below…