-
Thanks for your wonderful work on SimA! I'm trying to replicate your DeiT-S->SimA result, but I can't find any hyperparameter settings. Are all hyperparameters, including drop-path, the same as…
-
The 10-blocks lock is the most problematic aspect of Monero, which heavily impacts its usability as a currency and the user experience of people using services based on Monero. #95 explores the possib…
-
Hi folks! My team and I are looking into having compiler support for block floating point (BFP) in Torch-MLIR. Wondering what you think about extending the Torch-MLIR support for these cases. Below is…
Svoch updated
2 years ago
-
**Describe the bug**
I am getting the following error while attempting to run deepspeed-chat step 3 with the actor model CarperAI/openai_summarize_tldr_sft (gpt-j 6B) and critic model CarperAI/openai…
-
PyMC v4 has a JAX backend and can use samplers like those from numpyro or blackjax, so it should be pretty easy to add an example of how to use SGMCMCJax with a PyMC model.
https://github.com/pym…
-
Hi,
I'm currently trying to solve 2 problems I have:
1.) One of the programs I'm trying to tune has the constraint that the dimension must be divisible by the block size (`dimension % block_size …
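
A divisibility constraint like the one described above can be checked by filtering the candidate values before they reach the tuner. This is only a minimal sketch: the helper name `valid_block_sizes` and the candidate list are hypothetical, and it is not tied to any particular tuner's API.

```python
def valid_block_sizes(dimension, candidates):
    """Keep only the block sizes that evenly divide `dimension`.

    Hypothetical helper for illustration; not part of any tuning library.
    """
    return [b for b in candidates if b > 0 and dimension % b == 0]


# Example: for a dimension of 1024, 24 and 48 are rejected.
sizes = valid_block_sizes(1024, [16, 24, 32, 48, 64])
print(sizes)  # [16, 32, 64]
```

Pre-filtering keeps invalid configurations out of the search space entirely, at the cost of having to enumerate the candidates up front.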
-
# Overview
## Setting
- We generate a network using the stochastic block model with three blocks, featuring inter-intra edge densities of 0.01 and 0.3 for well-separated communities.
- A random w…
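
The network generation described in the first bullet can be sketched in plain Python as follows. The block sizes (50 nodes each), the seed, and the function name `sample_sbm` are assumptions made for illustration; only the three blocks and the densities 0.01 (inter) and 0.3 (intra) come from the description above.

```python
import random
from itertools import combinations


def sample_sbm(block_sizes, p_intra, p_inter, seed=0):
    """Sample an undirected stochastic block model graph.

    Returns (labels, edges): labels[i] is the block of node i, and edges
    is a list of (u, v) pairs with u < v.
    """
    rng = random.Random(seed)
    # Assign consecutive node ids to blocks 0, 1, 2, ...
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        # Same block: intra-block density; different blocks: inter-block density.
        p = p_intra if labels[u] == labels[v] else p_inter
        if rng.random() < p:
            edges.append((u, v))
    return labels, edges


# Three blocks; the size of 50 per block is an assumed value.
labels, edges = sample_sbm([50, 50, 50], p_intra=0.3, p_inter=0.01)
intra = sum(labels[u] == labels[v] for u, v in edges)
inter = len(edges) - intra
print(len(labels), intra, inter)
```

With these densities, most sampled edges fall inside a block, which is what makes the communities well separated.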
-
Hi!
I want to add examples to my functions, but in many cases it is hard or not necessary to test the expected output. However, it may still be useful to test whether the functions/examples run th…
-
Hi,
I am getting the following error when running pretrain_gpt.sh
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
----------------------------------…