microsoft / DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.

Check failed: rank <= DML_TENSOR_DIMENSION_COUNT_MAX #445

Open Djdefrag opened 1 year ago

Djdefrag commented 1 year ago

Hi, I just updated to version 0.2.0.dev230426 (the previous version, 0.1.13.1.dev230413, did not have this problem) and now I get this error: `Error: [F D:\a_work\1\s\pytorch-directml\caffe2\dml\dml_tensor_desc.cc:81] Check failed: rank <= DML_TENSOR_DIMENSION_COUNT_MAX`

- Windows 10
- Python 3.10.10
- torch-directml 0.2.0.dev230426
- AMD RX 6600

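    import torch
    from torch.optim import AdamW
    from torch.nn.parallel import DistributedDataParallel as DDP

    # IFNet, EPE and SOBEL come from the project's own modules (not shown here).

    class RIFEv3:
        def __init__(self, backend, local_rank=-1):
            self.flownet = IFNet(backend)
            self.device(backend)
            self.optimG = AdamW(self.flownet.parameters(), lr=1e-6, weight_decay=1e-4)
            self.epe = EPE()
            self.sobel = SOBEL(backend)
            if local_rank != -1:
                self.flownet = DDP(self.flownet, device_ids=[local_rank], output_device=local_rank)

        def eval(self):
            self.flownet.eval()

        def device(self, backend):
            # Move the network to the selected backend (here, the DirectML device).
            self.flownet.to(backend, non_blocking=True)

        def inference(self, img0, img1, scale=1.0):
            # Concatenate both frames along the channel dimension.
            imgs = torch.cat((img0, img1), 1)
            scale_list = [4 / scale, 2 / scale, 1 / scale]
            _, _, merged = self.flownet(imgs, scale_list)
            return merged[2]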


In particular, the error is raised on this line:
`mid_image = AI_model.inference(first_img, last_img)`

`AI_model` is an instance of the class `RIFEv3`.
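Since the failed check compares a tensor's rank against a fixed limit, one way to narrow this down might be to list which tensors exceed that limit before they reach the DirectML device. Below is a minimal sketch of such a check. The helper `find_over_rank` and the constant `ASSUMED_MAX_RANK` are hypothetical names, not part of torch-directml, and the value 5 is an assumption based on how `DML_TENSOR_DIMENSION_COUNT_MAX` is defined in `DirectML.h`; the limit compiled into a given torch-directml build may differ.

```python
from typing import Dict, List, Tuple

# Assumption: 5 mirrors DML_TENSOR_DIMENSION_COUNT_MAX in DirectML.h.
# The limit used by a particular torch-directml build may differ,
# so it is exposed as a parameter below.
ASSUMED_MAX_RANK = 5


def find_over_rank(named_tensors: Dict[str, object],
                   max_rank: int = ASSUMED_MAX_RANK) -> List[Tuple[str, int]]:
    """Return (name, rank) for every object whose rank exceeds max_rank.

    Works with anything that exposes a `.shape` sequence, such as a
    torch.Tensor; no tensor library is imported here.
    """
    offenders = []
    for name, tensor in named_tensors.items():
        rank = len(tensor.shape)  # number of dimensions
        if rank > max_rank:
            offenders.append((name, rank))
    return offenders
```

For the call above, this could be used as `find_over_rank({"img0": first_img, "img1": last_img})`. If the inputs pass, the offending tensor is probably created inside the network, and the same helper could be applied to intermediate outputs.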

Thank you

EDIT: it seems to happen with other AI inference models too :(

Adele101 commented 1 year ago

Hi @Djdefrag, thank you for submitting this issue. While I can't provide a timeline for resolution at the moment, please know that your feedback is valuable to us. We will follow up once we can review this issue.