Closed chavinlo closed 4 months ago
At first glance, {1, 1, 297328, 1} should probably be {297328, 1, 1, 1}.
There are some examples of ggml_conv_1d in tortoise.cpp: https://github.com/balisujohn/tortoise.cpp/blob/b9bb8771c3e3fa8ccb615d24b59b42edf15ca2fa/main.cpp#L3624
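A small sketch of the dimension-order point: ggml lists dimensions starting from the innermost (contiguous) one, so a PyTorch shape is reversed and padded with 1s to four entries. The helper name torch_shape_to_ggml_ne is hypothetical and belongs to neither library.

```python
def torch_shape_to_ggml_ne(shape, max_dims=4):
    """Hypothetical helper: reverse a PyTorch shape and pad it with 1s to max_dims entries."""
    if len(shape) > max_dims:
        raise ValueError("shape has more dimensions than max_dims")
    ne = list(reversed(shape))
    return ne + [1] * (max_dims - len(ne))


# Input of shape [batch, channels, length] as used in this thread.
print(torch_shape_to_ggml_ne([1, 1, 297328]))
# Conv1d weight of shape [out_channels, in_channels, kernel_size].
print(torch_shape_to_ggml_ne([512, 1, 10]))
```

With this ordering the input length ends up in the first entry, which is where the convolution reads it from.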
Thanks, now the im2col tensor has a shape of {10, 59464, 1, 1}, which is closer to PyTorch's output. But the ggml_mul_mat call (L6472) then returns an empty tensor (full of -431602080.). What exactly is the first tensor of mul_mat supposed to be? im2col initializes a tensor with the shape mentioned above but puts no values in it. And the a and b tensors are only stored in result->src[0] = a; and result->src[1] = b; respectively, right? (L6607)
TL;DR: tensor a (what's returned from im2col) in mul_mat's arguments inside ggml_conv_1d is empty, hence the output of conv1d will fail.
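For intuition about what the im2col tensor is meant to hold: a 1D convolution can be written as an im2col step, which copies each kernel-sized window of the input into its own row, followed by a matrix multiply with the kernels. The sketch below is a pure-Python illustration with a single input channel and made-up data; it is not ggml's implementation.

```python
def im2col_1d(x, kernel_size, stride):
    """Return one row per output position, each holding kernel_size input samples."""
    n_out = (len(x) - kernel_size) // stride + 1
    return [x[i * stride : i * stride + kernel_size] for i in range(n_out)]


def conv1d_via_im2col(x, kernels, stride):
    """kernels holds one filter per output channel; result is indexed [channel][position]."""
    cols = im2col_1d(x, len(kernels[0]), stride)
    # Matrix multiply: every filter is dotted with every window.
    return [[sum(w * v for w, v in zip(k, col)) for col in cols] for k in kernels]


x = [float(i) for i in range(12)]
kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
cols = im2col_1d(x, kernel_size=3, stride=2)
out = conv1d_via_im2col(x, kernels, stride=2)
print(len(cols), len(cols[0]))  # number of windows, samples per window
print(out)
```

For the tensors in this thread, the same layout gives 59464 windows of 10 samples each, which matches the {10, 59464, 1, 1} shape reported for the im2col tensor.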
What's the output shape of ggml_conv_1d with these arguments?
The output shape of ggml_conv_1d is {59464, 512, 1, 1}. However, the data of the tensor is empty.
It would be helpful if you could produce a reproducible example of the error; you should be able to fork ggml and modify https://github.com/ggerganov/ggml/blob/master/examples/simple/simple-backend.cpp so it does ggml_conv_1d on your inputs instead of the current operation.
Sure, here's the tensors.pt file needed for the scripts below (it only contains the weight of a conv1d and a test input): https://huggingface.co/chavinlo/ggmltest/resolve/main/tensors.pt?download=true
PyTorch script that runs the convolution and saves the tensors to GGUF:
import torch
import torch.nn as nn
import gguf
tensors = torch.load("tensors.pt")
conv = nn.Conv1d(1, 512, 10, 5, bias=False)
input_tensor = tensors["input"]
conv.weight = tensors["weight"]
x = conv(input_tensor)
print(x)
print(x.shape)
print(input_tensor.shape)
print(conv.weight.shape)
"""
Output should be:
tensor([[[-3.6160e-02, -2.8281e-02, 1.1107e-02, ..., 3.0684e-02,
-2.2186e-02, -6.3259e-03],
[ 7.9663e-02, 3.0687e-02, -5.0905e-02, ..., -1.9161e-02,
3.1029e-02, 1.5162e-02],
[ 2.3205e-01, 1.8175e-01, -1.0812e-01, ..., -2.3548e-02,
3.9255e-02, 1.1151e-01],
...,
[ 8.6802e-04, 8.3316e-04, 3.2947e-04, ..., -2.9786e-03,
6.5938e-03, 1.1510e-02],
[ 1.6648e-02, 2.3425e-02, -7.5188e-03, ..., 8.7883e-03,
4.2063e-03, 1.8971e-02],
[-2.4058e-01, 3.3975e-01, 2.8910e-01, ..., -1.3100e-01,
-1.3514e-01, 1.4614e-01]]])
torch.Size([1, 512, 59464])
torch.Size([1, 1, 297328])
torch.Size([512, 1, 10])
"""
gguf_writer = gguf.GGUFWriter("tensors.gguf", "test0")
gguf_writer.add_tensor("weight", conv.weight.numpy(), conv.weight.numpy().shape)
gguf_writer.add_tensor("input", input_tensor.numpy(), input_tensor.numpy().shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()
GGML C++:
#include <cstdlib>
#include <iostream>
#include <string>
#include "ggml.h"

int main()
{
    std::string fname = "tensors.gguf";

    // ##### GGML Context #####
    static size_t buf_size = 1024 * 1024 * 128;
    static void* buf = malloc(buf_size);
    struct ggml_init_params ggml_params = {
        /*.mem_size =*/ buf_size,
        /*.mem_buffer =*/ buf,
        /*.no_alloc =*/ false,
    };
    struct ggml_context* ggml_ctx = ggml_init(ggml_params);
    // %%%%%%%%%%

    // ##### GGUF Model Loading Context #####
    struct ggml_context* ggml_loader_ctx;
    struct gguf_init_params gguf_params = {
        /*.no_alloc =*/ false,
        /*.ctx =*/ & ggml_loader_ctx,
    };
    gguf_context* gguf_ctx = gguf_init_from_file(fname.c_str(), gguf_params);
    // %%%%%%%%%%

    ggml_tensor* weight_tensor = ggml_get_tensor(ggml_loader_ctx, "weight");
    ggml_tensor* input_tensor = ggml_get_tensor(ggml_loader_ctx, "input");
    ggml_tensor* output = ggml_conv_1d(ggml_ctx, weight_tensor, input_tensor, 5, 0, 1);
    std::cout << &output;
}
I had to create two contexts because when calling gguf_init_from_file the buf_size would shrink to 1 MB.
The C++ code returns the same results as the script I was using before: same output shape {59464, 512, 1, 1}, yet no data.
Also, about forking ggml: do you mean implementing ggml_conv_1d myself, or...?
Here's an example of what I'm talking about: https://github.com/balisujohn/ggml-get-rows-error/commit/acc02592ac953a41ea0f7e979e06b87f725f231d, from when I created a reproducible example of an error with ggml_get_rows. You can also upload the .pt and .gguf files into the repository to make reproducing the error easier.
I didn't see you had included the tensors.pt (I'll see if I can reproduce the error).
I am not an expert in ggml runtime variants, but I find it extremely suspicious that you are not calling either ggml_build_forward_expand or any variant of ggml_graph_compute anywhere. It looks like you are just declaring the computational graph but not actually activating the computation.
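To illustrate the difference between declaring a graph and computing it, here is a toy deferred-evaluation sketch in plain Python. The Node class and the mul and compute functions are hypothetical and only mimic the idea; they are not ggml's API.

```python
class Node:
    """Toy graph node: records an operation and its inputs; data stays None until computed."""

    def __init__(self, op=None, srcs=(), data=None):
        self.op = op
        self.srcs = srcs
        self.data = data


def mul(a, b):
    # Declaring the operation only records it; nothing is calculated here.
    return Node(op=lambda x, y: x * y, srcs=(a, b))


def compute(node):
    """Evaluate the node's inputs recursively, then the node itself."""
    if node.data is None:
        node.data = node.op(*(compute(s) for s in node.srcs))
    return node.data


a = Node(data=3.0)
b = Node(data=4.0)
out = mul(a, b)
print(out.data)      # None: the graph is declared but not computed
print(compute(out))  # 12.0
```

In this toy version, reading out.data before calling compute gives no result, which is analogous to inspecting a result tensor whose graph was never run.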
Added ggml_build_forward_expand and ggml_graph_compute_with_ctx and got an error in the ggml_compute_forward_im2col_f16 function (L14383), at the GGML_ASSERT(src0->type == GGML_TYPE_F16); check (L14390).
So I changed the weight type from float32 to float16 before saving to GGUF, reloaded it, and it finally returned a non-empty tensor. Does this mean that the weights always have to be in float16 precision? Even the ggml_compute_forward_im2col_f32 function (L14306) makes the same check for the weights to be in float16.
Not a big problem, but the output tensor is slightly off: the first value of the GGML output is -0.0361655578, while torch gives -0.0361604653. The shape is correct though, {59464, 512, 1, 1}.
Here's the updated C++/PyTorch code if necessary:
#include <cstdlib>
#include <iostream>
#include <string>
#include "ggml.h"

int main()
{
    std::string fname = "tensors.gguf";

    // ##### GGML Context #####
    static size_t buf_size = 1024 * 1024 * 128;
    static void* buf = malloc(buf_size);
    struct ggml_init_params ggml_params = {
        /*.mem_size =*/ buf_size,
        /*.mem_buffer =*/ buf,
        /*.no_alloc =*/ false,
    };
    struct ggml_context* ggml_ctx = ggml_init(ggml_params);
    // %%%%% End of Inference GGML Context %%%%%

    // ##### GGUF Model Loading Context #####
    struct ggml_context* ggml_loader_ctx;
    struct gguf_init_params gguf_params = {
        /*.no_alloc =*/ false,
        /*.ctx =*/ & ggml_loader_ctx,
    };
    gguf_context* gguf_ctx = gguf_init_from_file(fname.c_str(), gguf_params);
    // %%%%% End of GGUF Model Loading Context %%%%%

    struct ggml_cgraph* gf = ggml_new_graph(ggml_ctx);
    ggml_tensor* weight_tensor = ggml_get_tensor(ggml_loader_ctx, "weight");
    ggml_tensor* input_tensor = ggml_get_tensor(ggml_loader_ctx, "input");
    ggml_tensor* output = ggml_conv_1d(ggml_ctx, weight_tensor, input_tensor, 5, 0, 1);

    // Build the graph and run it; without these calls the output holds no data.
    ggml_build_forward_expand(gf, output);
    ggml_graph_compute_with_ctx(ggml_ctx, gf, 1);

    // Print the first output value rather than the address of the data pointer.
    std::cout << static_cast<float*>(output->data)[0] << std::endl;
}
import torch
import torch.nn as nn
import gguf
tensors = torch.load("tensors.pt")
conv = nn.Conv1d(1, 512, 10, 5, bias=False)
input_tensor = tensors["input"]
conv.weight = tensors["weight"]
x = conv(input_tensor)
# to save
conv = conv.to(torch.float16)
print(x)
print(input_tensor)
print(conv.weight)
print("output shape:", x.shape)
print("input shape:", input_tensor.shape)
print("weight shape:", conv.weight.shape)
print("input dtype:", input_tensor.dtype)
print("weight dtype:", conv.weight.dtype)
"""
Output should be:
tensor([[[-3.6160e-02, -2.8281e-02, 1.1107e-02, ..., 3.0684e-02,
-2.2186e-02, -6.3259e-03],
[ 7.9663e-02, 3.0687e-02, -5.0905e-02, ..., -1.9161e-02,
3.1029e-02, 1.5162e-02],
[ 2.3205e-01, 1.8175e-01, -1.0812e-01, ..., -2.3548e-02,
3.9255e-02, 1.1151e-01],
...,
[ 8.6802e-04, 8.3316e-04, 3.2947e-04, ..., -2.9786e-03,
6.5938e-03, 1.1510e-02],
[ 1.6648e-02, 2.3425e-02, -7.5188e-03, ..., 8.7883e-03,
4.2063e-03, 1.8971e-02],
[-2.4058e-01, 3.3975e-01, 2.8910e-01, ..., -1.3100e-01,
-1.3514e-01, 1.4614e-01]]])
tensor([[[ 0.3720, 0.3385, 0.2953, ..., -0.0872, -0.1079, -0.1487]]])
Parameter containing:
tensor([[[-0.0186, 0.2178, -0.1289, ..., -0.0457, 0.1654, 0.1256]],
[[-0.1410, 0.2072, 0.1740, ..., -0.0887, -0.0700, -0.0103]],
[[ 0.2450, 0.2239, 0.1588, ..., -0.1426, -0.1188, 0.1205]],
...,
[[ 0.0122, -0.0559, 0.1382, ..., -0.2484, 0.1337, -0.0367]],
[[ 0.1155, 0.0986, -0.0650, ..., -0.1870, -0.0693, 0.0563]],
[[-0.1095, -0.1289, -0.2644, ..., -0.2622, -0.0640, -0.0243]]],
dtype=torch.float16)
output shape: torch.Size([1, 512, 59464])
input shape: torch.Size([1, 1, 297328])
weight shape: torch.Size([512, 1, 10])
input dtype: torch.float32
weight dtype: torch.float16
"""
gguf_writer = gguf.GGUFWriter("tensors.gguf", "test0")
gguf_writer.add_tensor("weight", conv.weight.numpy(), conv.weight.numpy().shape)
gguf_writer.add_tensor("input", input_tensor.numpy(), input_tensor.numpy().shape)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()
Also, thanks for your help, I really appreciate it.
The weights do always have to be float16. The slight numerical difference from torch is unsurprising. And no problem, happy to help.
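A rough sanity check of why a difference of this size is plausible, assuming it comes mainly from float16 rounding: half precision keeps 11 significant bits, so a single rounding has a relative error of at most 2^-11. The sketch below uses only the standard library; the weight 0.2178 is just an example magnitude taken from the printed weights, and the two output values are the ones reported above.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


ggml_value = -0.0361655578   # first output value reported for ggml
torch_value = -0.0361604653  # first output value reported for torch
observed_rel_diff = abs(ggml_value - torch_value) / abs(torch_value)

w = 0.2178  # example weight magnitude, chosen for illustration only
rounding_rel_err = abs(to_float16(w) - w) / abs(w)

print(f"observed relative difference: {observed_rel_diff:.2e}")
print(f"float16 rounding error for {w}: {rounding_rel_err:.2e}")
print(f"float16 worst-case relative rounding error: {2 ** -11:.2e}")
```

This is not a strict error bound for a sum of ten products, but it shows that the reported mismatch is of the same order as float16 rounding.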
Hello, I am trying to implement a model that makes use of nn.Conv1d in PyTorch. I don't have much experience with C++, but I've read the MNIST examples and part of stable-diffusion.cpp. However, I can't seem to find many examples of ggml_conv_1d.
I have tried this (the following is just simplified):
And this would be the pytorch equivalent:
I got the weight shape order from this code: https://github.com/PABannier/encodec.cpp/blob/main/encodec.cpp#L755 However, the output is just a bunch of NaNs (I think, because it's all "Ï" in the Visual Studio memory debugger). I tried diving deeper and I think the issue is in ggml_im2col, where const int64_t OW = ggml_calc_conv_output_size(b->ne[0], a->ne[0], s0, p0, d0); calculates a value of 0 (which is wrong?). While debugging, I think this is what it was calculating: ((1 + 2 × 1 − 1 × (10 − 1) − 1) / 5 + 1), which equals -0.4?
I've tried finding a solution, but there's barely any documentation, and I have already stared at the source code for hours and don't know what to do.
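The arithmetic above can be reproduced with a short sketch, assuming the function evaluates the quoted expression with C-style integer division (which truncates toward zero). The names c_div and conv_output_size are hypothetical stand-ins, not ggml functions.

```python
def c_div(a, b):
    """Integer division that truncates toward zero, as C and C++ do."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def conv_output_size(ins, ks, s, p, d):
    """Expression from the post: (ins + 2*p - d*(ks - 1) - 1) / s + 1 with integer division."""
    return c_div(ins + 2 * p - d * (ks - 1) - 1, s) + 1


# Input length read as 1 (the values seen while debugging): -7 / 5 truncates to -1, plus 1 is 0.
print(conv_output_size(ins=1, ks=10, s=5, p=1, d=1))       # 0
# Input length 297328 with no padding, as in the PyTorch Conv1d(1, 512, 10, 5).
print(conv_output_size(ins=297328, ks=10, s=5, p=0, d=1))  # 59464
```

Under that assumption, the 0 comes from the input length being read as 1 rather than 297328, not from the formula itself.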
This is the code I use to save the weights into GGUF format if necessary: