-
I just reimplemented GIANT in Spark and tested it on real-world CTR prediction data, which is very high-dimensional (d = 2^26).
To get a better Hessian approximation, each partition of the RDD has…
-
Traceback (most recent call last):
File "/home/zgd21/traffic/DGCN-master/GRCN.py", line 176, in
compute_val_loss(net, val_loader, loss_function, supports, device, epoch=0)
File "/home/zgd2…
-
### 🐛 Describe the bug
**To Reproduce**
1. Run the following script:
```
import torch
import torch.nn as nn
device = "cuda"
N, d = 16384 + 1, 384
X = torch.randn(N, d, requires_grad=True, de…
-
**Describe the bug**
Pinging the bridge IP from the target did not work, and I could not see anything with tcpdump on the proxy. Pinging the target from the proxy did work. While it ping…
-
### Describe the bug
The code below works in ipex version 2.0.0 but fails in 2.0.100:
```python
import os
import numpy as np
from torch import nn
import torch
import torch.nn.functional as F
…
-
I have created the noise2inverse environment as described, and the problem occurs in the 3.train code.
I get this output when I try to run "# Option a) Use MSD network". Error output:
```
…
-
I have the same problem as the one described in https://github.com/tflearn/tflearn/issues/223, but mine is related to multi-class classification.
self.metric = self.model.evaluate(self.case.X_test, s…
-
@VainF I have trained a custom YOLOv8 model. After training, I successfully pruned the model.
```python
for name, param in model.model.named_parameters():
    param.requires_grad = True…
-
### 🐛 Describe the bug
torchrun multi-machine, multi-card training error.
Both Rank1 and Rank2 can be trained normally.
The error occurs after the successful establishment of NCCL communicatio…
-
I have a machine I want to LTO, and I was wondering if anyone would be interested in some before-and-after benchmarks. If so, what would be useful to run?