Critical Error!! , setting a variable does not work

liuliu / s4nnc

Swift for NNC

https://libnnc.org

BSD 3-Clause "New" or "Revised" License

70 stars, 8 forks

Critical Error!! , setting a variable does not work #9

Closed ghost closed 1 year ago

ghost commented 1 year ago

import Foundation
import NNC
import PNG

let graph = DynamicGraph()
let xIn_temp = graph.variable(.GPU(0), .NC(1, 1), of: Float16.self)
xIn_temp.full(1)
print(xIn_temp[0, 0])

I get all sorts of values when I print this. What? Running with MPS.

ghost commented 1 year ago

import Foundation
import NNC
import PNG

let graph = DynamicGraph()

let arr = [Float16](repeating: 1.4, count: 1)
let xIn_temp = graph.variable(Tensor<Float16>(arr, .CPU, .NC(1, 1)))

print(xIn_temp[0, 0])

let x_gpu = xIn_temp.toGPU(0)

print(x_gpu[0, 0])

.toGPU is broken on MPS. This seems like a very critical bug.

liuliu commented 1 year ago

Hi, thanks for reporting. I see what's going on with this and the other ticket on swift-diffusion now.

Basically, once a tensor has been moved to the GPU, you can no longer access individual scalars (such as xIn_temp[0, 0]), because the subscript simply reads from the GPU address, which is not readable by the CPU. If you run under valgrind or address sanitizer, these accesses will most likely result in a segfault.

To access individual scalars, you need to move the tensor back to the CPU (for now) and then access them. I think this might solve all the issues you have: x_gpu.toCPU()[0, 0].
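
A minimal sketch of that round trip, reusing only calls that already appear in this thread (graph.variable, toGPU, toCPU, and the scalar subscript). It assumes s4nnc is built with a GPU backend (MPS in this report) and that device 0 exists; the variable names are illustrative.

```swift
import NNC

let graph = DynamicGraph()

// Build a 1x1 Float16 tensor on the CPU, as in the report above.
let values = [Float16](repeating: 1.4, count: 1)
let xCPU = graph.variable(Tensor<Float16>(values, .CPU, .NC(1, 1)))

// Move it to GPU 0 (assumes such a device is available).
// From here on, xGPU[0, 0] would read a GPU address from the CPU,
// so the subscript should not be used on xGPU directly.
let xGPU = xCPU.toGPU(0)

// Copy the tensor back to the CPU first, then read the scalar.
let scalar = xGPU.toCPU()[0, 0]
print(scalar)
```

Each toCPU() call copies the tensor back from the device, so if several elements are needed it is probably cheaper to keep the CPU copy in a variable and subscript that.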

liuliu commented 1 year ago

Also, using debugPrint won't have this issue, because it moves the tensor back to the CPU before printing the result.
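
Applied to the first snippet in this issue, that might look like the sketch below: the tensor is created on the GPU and filled as in the report, then printed with debugPrint instead of being subscripted. This assumes the same GPU-enabled build and an available device 0.

```swift
import NNC

let graph = DynamicGraph()

// Allocate directly on GPU 0 and fill with ones, as in the original report.
let x = graph.variable(.GPU(0), .NC(1, 1), of: Float16.self)
x.full(1)

// Per the comment above, debugPrint moves the tensor back to the CPU
// before formatting it, so this is safe even though x lives on the GPU.
debugPrint(x)
```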

ghost commented 1 year ago

Yeah, I figured. Maybe by default, if I print a single element, it should send it to the CPU.

ghost commented 1 year ago

Thanks a lot!