-
I used this network on an image defect classification task, but it was very hard to train and achieved low accuracy. Other models, however, such as the VIP model based on an MLP architecture or a plain ResNet-50, those models …
-
-
Hi, thanks for sharing your code. In the paper you implement a baseline called "BERT+MLP" that reaches a **76.2** F1 score, but when I use the same architecture I cannot reproduce that result. Di…
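For comparison, here is a minimal sketch of what a "BERT+MLP" baseline typically looks like; the [CLS] pooling choice, MLP width, and dropout are assumptions, not the paper's exact configuration. The encoder is any Hugging Face model (e.g. `AutoModel.from_pretrained("bert-base-uncased")`) passed in by the caller:

```python
import torch
import torch.nn as nn

class BertMLPClassifier(nn.Module):
    """A BERT encoder followed by a two-layer MLP classification head.

    `encoder` is any Hugging Face model whose output exposes `last_hidden_state`.
    Pooling via the [CLS] token and the MLP width are assumptions here.
    """
    def __init__(self, encoder, num_labels, mlp_hidden=256, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        dim = encoder.config.hidden_size
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_hidden),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(mlp_hidden, num_labels),
        )

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS]-token pooling (an assumption)
        return self.mlp(cls)
```

Small differences in pooling (CLS token vs. mean pooling vs. the `pooler_output`) can move F1 by a point or more, so that detail is worth confirming with the authors.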
-
I am currently attempting to port a Llama-like model architecture from pure PyTorch to TransformerEngine's PyTorch classes.
However, I have been unable to obtain identical results in certain cases.…
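When debugging this kind of mismatch, it usually helps to compare the two implementations module by module under a tolerance rather than expecting bit-identical outputs (fused kernels and different reduction orders change low-order bits). A sketch, where `ref_module` and `te_module` are placeholders for the corresponding PyTorch and TransformerEngine layers:

```python
import torch

@torch.no_grad()
def compare_outputs(ref_module, te_module, x, rtol=1e-3, atol=1e-5):
    """Run both modules on the same input and report the difference.

    Returns (allclose_ok, max_abs_diff, max_rel_diff); tolerances are
    typical starting points, not TransformerEngine-prescribed values.
    """
    ref = ref_module(x).float()
    out = te_module(x).float()
    abs_diff = (ref - out).abs().max().item()
    rel_diff = ((ref - out).abs() / (ref.abs() + 1e-12)).max().item()
    ok = torch.allclose(ref, out, rtol=rtol, atol=atol)
    return ok, abs_diff, rel_diff
```

Walking this check layer by layer (embedding, attention, MLP, norm) usually isolates which ported component diverges first.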
-
### System Info
Hardware: L20
Version: 0.11.0.dev20240625
Model: Bloom7b1
### Who can help?
@ncomly-nvidia @byshiue
I have obtained the Medusa head for Bloom according to the official M…
-
**Describe the bug**
The provided pre-trained Swin-UNETR weights do not load into a newly instantiated SSLHead model object. The naming scheme for the model state_dict keys is different between the p…
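A common workaround for mismatched checkpoint keys is to remap key prefixes before calling `load_state_dict`. A sketch; the prefixes in the usage comment are hypothetical and depend on how the checkpoint was actually saved:

```python
def remap_state_dict(state_dict, prefix_map):
    """Rename state_dict keys by prefix so they match the target model.

    `prefix_map` maps old prefixes to new ones, e.g. {"module.": ""};
    the actual prefixes must be read off the real checkpoint and model.
    """
    remapped = {}
    for key, value in state_dict.items():
        for old, new in prefix_map.items():
            if key.startswith(old):
                remapped[new + key[len(old):]] = value
                break
        else:
            remapped[key] = value
    return remapped

# Hypothetical usage:
# ckpt = torch.load("model_swinvit.pt", map_location="cpu")
# model.load_state_dict(remap_state_dict(ckpt["state_dict"], {"module.": ""}),
#                       strict=False)
```

Printing a few keys from both `ckpt["state_dict"]` and `model.state_dict()` side by side makes the required `prefix_map` obvious.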
-
```python
# shape reconstruction loss
rebuild_points = nbr_groups[0] + center_groups[0].unsqueeze(-2)
idx = pointops.knn(center_groups[0], pred, int(self.nbr_ratio * self.group_size))[0]
…
```
-
I aim to use the following sigmoid-outputting models for team formation:
1. Neural Collaborative Filtering (NCF) with MLP/FNN:
- NCF has demonstrated success in recommendation systems and collaborat…
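For reference, a minimal sketch of the MLP-branch NCF scorer described in point 1: user and item embeddings are concatenated and passed through an MLP to a sigmoid match score. Embedding and hidden sizes here are placeholders:

```python
import torch
import torch.nn as nn

class NCFScorer(nn.Module):
    """MLP-based NCF: concatenated user/item embeddings scored in (0, 1)."""
    def __init__(self, num_users, num_items, emb_dim=32, hidden=(64, 32)):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, emb_dim)
        self.item_emb = nn.Embedding(num_items, emb_dim)
        layers, dim = [], 2 * emb_dim
        for h in hidden:
            layers += [nn.Linear(dim, h), nn.ReLU()]
            dim = h
        layers.append(nn.Linear(dim, 1))
        self.mlp = nn.Sequential(*layers)

    def forward(self, user_ids, item_ids):
        x = torch.cat([self.user_emb(user_ids), self.item_emb(item_ids)], dim=-1)
        return torch.sigmoid(self.mlp(x)).squeeze(-1)  # match score in (0, 1)
```

For team formation, "user" and "item" would map to whatever pair is being matched (e.g. candidate and team), which is an adaptation beyond the original recommendation setting.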
-
How can I use different language models from Hugging Face for knowledge distillation in this setup?
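Assuming the setup pairs a teacher and a student model that share a label or vocabulary space, the usual recipe combines a softened KL term with the hard-label loss. A sketch with typical (not prescribed) temperature and mixing values; distilling across Hugging Face models with *different* tokenizers would additionally require aligning the vocabularies first:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard KD loss: softened KL to the teacher plus hard cross-entropy.

    T and alpha are common defaults; both models must produce logits over
    the same label/vocabulary space for the KL term to be meaningful.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to the hard-loss magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In practice the teacher runs under `torch.no_grad()` and only the student is optimized.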
-
### Question
Hello, I was trying to get a sense of the number of parameters in LLaVA 1.5. I understand that the LLM used is Vicuna 1.5 (either 7B or 13B) and that the vision encoder is CLIP ViT-L/14 336…
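One way to verify such a count is to load the checkpoint and sum parameter sizes directly; the helper below works for any `torch.nn.Module` (the model id in the usage comment is an assumption):

```python
def count_params(model):
    """Return (total, trainable) parameter counts for a torch.nn.Module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# Hypothetical usage (downloads weights):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("llava-hf/llava-1.5-7b-hf")
# print(count_params(model))
```

The total should roughly decompose into the Vicuna LLM, the CLIP vision tower, and the small multimodal projector.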