huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0

Tensor copy from a noncontiguous tensor still makes a noncontiguous tensor #1930

Open yinqiwen opened 6 months ago

yinqiwen commented 6 months ago

In candle, copy() on a noncontiguous tensor returns a noncontiguous tensor, whereas PyTorch's clone() returns a contiguous tensor.

candle

use candle::{DType, Tensor};

#[test]
fn test_copy1() -> candle::Result<()> {
    let device = candle::Device::Cpu;

    // Narrowing along dim 1 yields a view that keeps the parent's strides.
    let test = Tensor::zeros((2, 10), DType::U32, &device)?;
    let test1 = test.narrow(1, 0, 5)?;
    println!(
        "test1 shape:{:?}, stride:{:?}, is_contiguous:{}",
        test1.shape(),
        test1.stride(),
        test1.is_contiguous()
    );

    // copy() duplicates the data but preserves the noncontiguous layout.
    let test2 = test1.copy()?;
    println!(
        "test2 shape:{:?}, stride:{:?}, is_contiguous:{}",
        test2.shape(),
        test2.stride(),
        test2.is_contiguous()
    );
    Ok(())
}

running 1 test
test1 shape:[2, 5], stride:[10, 1], is_contiguous:false
test2 shape:[2, 5], stride:[10, 1], is_contiguous:false
test tensor::copy::test_copy1 ... ok

pytorch

import torch

a = torch.rand(2, 10, dtype=torch.float32)
# narrow() returns a view that keeps the parent's strides
test1 = a.narrow(1, 0, 5)
print("test1 shape:", test1.shape, ", stride:", test1.stride(), "is_contiguous:", test1.is_contiguous())
# clone() on the view produces a densely packed tensor
test2 = test1.clone()
print("test2 shape:", test2.shape, ", stride:", test2.stride(), "is_contiguous:", test2.is_contiguous())

test1 shape: torch.Size([2, 5]) , stride: (10, 1) is_contiguous: False
test2 shape: torch.Size([2, 5]) , stride: (5, 1) is_contiguous: True
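
For reference, the strides in the two outputs can be checked by hand. The sketch below is a simplified, standalone model of a row-major contiguity test; it is not the code of either library, and the function names `contiguous_strides` and `is_contiguous` are chosen here for illustration. It assumes strides are counted in elements and ignores special cases such as size-1 dimensions.

```rust
/// Row-major (C-order) strides for `shape`, counted in elements.
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last dimension back to the first:
    // each stride is the product of all later dimension sizes.
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Simplified check: a layout counts as contiguous when its strides
/// equal the row-major strides implied by its shape.
fn is_contiguous(shape: &[usize], stride: &[usize]) -> bool {
    contiguous_strides(shape).as_slice() == stride
}
```

Under this model, shape [2, 5] implies strides [5, 1], so `is_contiguous(&[2, 5], &[10, 1])` is false and `is_contiguous(&[2, 5], &[5, 1])` is true, matching the flags printed above.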

LaurentMazare commented 6 months ago

Use .contiguous() to get a contiguous tensor. In Rust, the clone method is a bit special: it comes from a very common standard-library trait that is supposed to be cheap, so the current implementation just bumps a ref count rather than doing a full copy.
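
The "bumping a ref count" behaviour can be illustrated with a toy model built only on the standard library. This is a sketch, not candle's implementation: `ToyTensor`, `make_view`, and `shares_storage` are hypothetical names, and the use of `Arc` for the shared buffer is an assumption made for the example.

```rust
use std::sync::Arc;

/// Toy stand-in for a tensor handle (hypothetical, not candle's type):
/// a reference-counted buffer plus the layout used to read it.
#[derive(Clone)]
struct ToyTensor {
    storage: Arc<Vec<u32>>, // cloning an Arc only increments a counter
    shape: Vec<usize>,
    stride: Vec<usize>,
}

/// A (2, 5) view over a 20-element buffer, like narrowing a (2, 10) tensor.
fn make_view() -> ToyTensor {
    ToyTensor {
        storage: Arc::new(vec![0; 20]),
        shape: vec![2, 5],
        stride: vec![10, 1],
    }
}

/// True when both handles point at the same underlying buffer.
fn shares_storage(a: &ToyTensor, b: &ToyTensor) -> bool {
    Arc::ptr_eq(&a.storage, &b.storage)
}
```

In this model, `let b = a.clone();` leaves `shares_storage(&a, &b)` true and `b.stride` equal to `[10, 1]`: the derived `Clone` copies the small shape and stride vectors but never touches the 20-element buffer, so the clone is exactly as noncontiguous as the original.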

yinqiwen commented 6 months ago

Using .contiguous() would use much more device memory than needed. IMO, it would be better to use kernels like ucopy to do that copy for noncontiguous tensors.
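
To make the memory question concrete, here is a minimal CPU sketch of a strided copy: it gathers only the elements a view addresses into a new dense buffer. It is written in the spirit of a strided copy kernel but is not candle's `ucopy` code, and whether `.contiguous()` allocates the same way is not confirmed in this thread. The function name `copy_strided` and its signature are assumptions made for the example.

```rust
/// Gather the elements addressed by (shape, stride, offset) from `src`
/// into a new densely packed, row-major buffer.
fn copy_strided(src: &[u32], shape: &[usize], stride: &[usize], offset: usize) -> Vec<u32> {
    assert_eq!(shape.len(), stride.len());
    let n: usize = shape.iter().product();
    let mut dst = Vec::with_capacity(n);
    let mut index = vec![0usize; shape.len()];
    for _ in 0..n {
        // Map the current multi-index to a position in the strided source.
        let pos = offset + index.iter().zip(stride).map(|(i, s)| i * s).sum::<usize>();
        dst.push(src[pos]);
        // Advance the multi-index in row-major order (last dimension fastest).
        for d in (0..shape.len()).rev() {
            index[d] += 1;
            if index[d] < shape[d] {
                break;
            }
            index[d] = 0;
        }
    }
    dst
}
```

For a 20-element source holding 0 through 19, `copy_strided(&src, &[2, 5], &[10, 1], 0)` returns the 10 elements 0 to 4 and 10 to 14. The destination holds as many elements as the product of the shape, regardless of how large the source buffer is.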