-
The current experience of resuming from a checkpoint could be improved. A few potential ways:
1. Good defaults: resuming from a checkpoint should default to the last checkpoint saved, so the user …
-
In burn-train, several metrics can be used during training. It would be great to have more!
- [X] Accuracy
- [X] Loss (the one in use)
- [X] CUDA utilization (memory & compute)
- [x] Top-k accurac…
-
Hi,
I noticed that in your implementation of Tangram, you use the following code for training epochs:
` ad_map = tg.map_cells_to_space(adata_sc,adata_st,
mode="…
-
Problem:
Adding the **timeout** parameter to the .fit() method, which should force the library to return the best known solution found so far as soon as the provided number of seconds since the start of tra…
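
To make the expected behavior concrete, here is a minimal sketch of a time-bounded search loop. It is not the library's actual API: `fit_with_timeout`, `candidates`, and `score` are hypothetical names used only for illustration, and the sketch assumes a lower score is better.

```python
import time


def fit_with_timeout(candidates, score, timeout=None):
    """Hypothetical search loop (not the library's API).

    Evaluates candidates one by one and returns the best (candidate, score)
    pair seen before `timeout` seconds have elapsed since the call started.
    Assumes a lower score is better.
    """
    start = time.monotonic()  # monotonic clock is unaffected by system clock changes
    best, best_score = None, float("inf")
    for candidate in candidates:
        # Check the deadline before starting each new evaluation.
        if timeout is not None and time.monotonic() - start >= timeout:
            break
        current = score(candidate)
        if current < best_score:
            best, best_score = candidate, current
    return best, best_score
```

With `timeout=None` the loop runs to completion. With a deadline, it stops before the next evaluation and returns whatever it has found so far, which is `(None, inf)` if no candidate was evaluated in time.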
-
### Describe your issue.
I'm trying to use the shgo function from scipy.optimize to minimize a fairly complicated function of two variables. When computing the cost function, several intermediate res…
-
Because my desktop computer does not have a GPU, I would like to try using the CPU for training.
-
### 🐛 Describe the bug
When trying to use torch.export.export_for_training using a sample model like:
```
class SampleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()…
-
import logging
import os
import json
import torch
from datasets import load_from_disk
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel…
-
### 🐛 Describe the bug
Converting the original GFPGANv1.pth model to ONNX gave me the following errors.
```
Traceback (most recent call last):
File "/home/batman/GFPGAN-Training-Models-To-Onnx/n…
-
### 🐛 Describe the bug
I have integrated our new accelerator into torch using the torch.compile feature, and it works well for inference. I am now working on adding support for training, but it gives m…