-
## Description
The Gluon Trainer `step` method uses enumerations as keys to push and pull gradients/parameters from kvstore. Using two trainers within a single worker script (in a distributed learnin…
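
A minimal sketch of why enumeration-based keys can clash, assuming two trainers share one store and each numbers its parameters from 0. `ToyTrainer` and the dict-based store are hypothetical stand-ins for illustration, not MXNet APIs; real behavior depends on the kvstore in use.

```python
# Toy illustration only: a plain dict stands in for a shared key-value store.
# ToyTrainer and its methods are hypothetical and are not MXNet APIs.

class ToyTrainer:
    def __init__(self, param_names, store):
        self.param_names = list(param_names)
        self.store = store  # shared between trainers, like a single kvstore

    def keys(self):
        # Each trainer numbers its own parameters starting from 0.
        return [i for i, _ in enumerate(self.param_names)]

    def push(self):
        for i, name in enumerate(self.param_names):
            # The key is the enumeration index, not the parameter name.
            self.store[i] = name


store = {}
trainer_a = ToyTrainer(["encoder.weight", "encoder.bias"], store)
trainer_b = ToyTrainer(["decoder.weight", "decoder.bias"], store)

trainer_a.push()
trainer_b.push()

# Keys used by both trainers.
overlap = set(trainer_a.keys()) & set(trainer_b.keys())
print(sorted(overlap))
# Entries written by trainer_a were overwritten by trainer_b.
print(store)
```

In this toy setup, both trainers write to the same integer keys, so the second push silently replaces the first. Keys derived from something unique per trainer would avoid the overlap.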
-
## 🐛 Bug
[Original Post]
I noticed a significant slowdown of my training script this morning after upgrading from 1.6 to 1.7. GPU memory usage has also increased noticeably.
## To Reproduce
…
-
I tried to use the BERT-345M-uncased model for ICT pretraining, but an error occurs; the complete log is given below.
I think this model is not compatible with the task; I wasn't able …
-
Hi @choosehappy @jacksonjacobs1 @nanli-emory
Finally some relief (your input is appreciated): CuCIM support for both the image handle and the QC modules is now implemented.
#### Unified Interface …
-
@gregcaporaso, @audy, and I had a call earlier today. Just putting down some notes from the call:
- targeting marker gene
- primary goal is a scikit-learn implementation of the Naive Bayes classifier use…
-
I am trying to use embeddings to compute cosine similarity. The problem is that there is no way to pass the embedding as a param to invoke the following feature during logging.
```json
{
"nam…
```
-
A very common use case of the LTR plugin is when somebody adds several features that may share common parts. For example, somebody wants to have a feature matching a specific document field (which could b…
-
I hit an error in HuggingFace for which there are, strangely, zero Google search results:
"ValueError: Calculated loss must be on the original device" I can see this error source code in huggingf…
-
* megatron version: v2.6
* pytorch version: 1.10.0+cu111
* cluster: 2 nodes, A100*8 for each node
* script: [pretrain_t5_distributed_with_mp.sh](https://github.com/NVIDIA/Megatron-LM/blob/main/exam…
-
Title: Managing Popularity Bias in Recommender Systems with Personalized Re-ranking
Venue: AAAI Florida Artificial Intelligence Research Society (FLAIRS’19)
Year: 2019
**Main problem:**
Collabo…