-
We get an error when we run the model with sharding overrides.
Here, we run a simple MNIST benchmark model through tt-forge-fe with sharding overrides passed in. All configurations/overrides are hard…
-
Hi,
I am a new user of CatBoost and I was wondering whether it is possible to implement a model tree in CatBoost (or in gradient-boosted regression trees in general).
My feeling is that by using linear …
-
Ideas to develop on how to measure the influence of data points in DHARMa: something like Cook's distance, but simpler and more general.
Some references:
- Nieuwenhuis, R., Grotenhuis, M.…
-
Hi, could you give me a tutorial on how to use your proposed models to predict gene expression?
-
Hi, how do I get llama-3.2 to work with ipex_llm?
Here's my code.
```
import requests
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor
imp…
-
Hello, I have some questions about the details of Dire FT in the paper. Does Dire FT use new categories of data (such as the bedrooms mentioned in the article) to continue training on the original…
-
### 🐛 Describe the bug
I'm trying to train a LLaMA model with all linear layers plus the embeddings and head.
Whilst embeddings have no problems with FSDP over Liger, there are always exceptions when [ lm_head…
-
I tried to compare the speed of the same model on CPU and GPU. Here is my code:
```
import numpy as np
import pandas as pd
import time
import cudf
import cupy as cp
from cuml.linear_model i…
-
### pycaret version checks
- [X] I have checked that this issue has not already been reported [here](https://github.com/pycaret/pycaret/issues).
- [X] I have confirmed this bug exists on the [latest…
-
### Feature request
Enable PPOTrainer and DPOTrainer to work with audio-language models like Qwen2Audio. The architecture of this model is identical to that of vision-language models like LLaVA, consisting of…