PaddlePaddle / Paddle

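PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)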
http://www.paddlepaddle.org/
Apache License 2.0

capi: input data format #8601

Closed hyo009 closed 6 years ago

hyo009 commented 6 years ago

Inference works fine in Python, but after switching to the C-API it fails with this error:

I0227 05:59:35.251722   445 Util.cpp:166] commandline:  --use_gpu=false 
I0227 05:59:35.256510   445 GradientMachine.cpp:83] Loading parameters from ./model/dnn_params_30min/
F0227 05:59:35.282200   445 TableProjection.cpp:39] Check failed: in_->ids
*** Check failure stack trace: ***
    @     0x7ff487fe876d  google::LogMessage::Fail()
    @     0x7ff487fec21c  google::LogMessage::SendToLog()
    @     0x7ff487fe8293  google::LogMessage::Flush()
    @     0x7ff487fed72e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7ff48801c6d5  paddle::TableProjection::forward()
    @     0x7ff488081db9  paddle::MixedLayer::forward()
    @     0x7ff48814c4bd  paddle::NeuralNetwork::forward()
    @     0x7ff487fe4676  paddle_gradient_machine_forward
    @           0x402cc1  main
    @     0x7ff487398bd5  __libc_start_main
    @           0x401859  (unknown)
    @              (nil)  (unknown)

Data layer definitions in the model:

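import paddle.v2 as paddle  # assumption: v2 API, inferred from the paddle.layer / paddle.data_type calls

data_layer_dict = {}
# Three integer-id sequence inputs, declared with a dictionary size of 2966.
for iter in ['cur_stream', 'up_stream_1', 'down_stream_1']:
    data_stream = paddle.layer.data(
        name=iter,
        type=paddle.data_type.integer_value_sequence(2966))
    data_layer_dict[iter] = data_stream

# One dense float input of width 490.
data_stream = paddle.layer.data(
    name='vector',
    type=paddle.data_type.dense_vector(490))
data_layer_dict['vector'] = data_stream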

For inference in Python, cur_stream, up_stream_1 and down_stream_1 are each a list of length 43, and vector is a list of length 490.
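
Since integer_value_sequence(2966) declares ids in the range [0, 2966), it can help to validate the integer inputs before handing them to the inference engine. A minimal sketch in plain C with no Paddle calls; ids_in_range is a hypothetical helper, not a Paddle function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Paddle C-API).
 * Returns 1 if every id lies in [0, dict_size), 0 otherwise. */
int ids_in_range(const int *ids, size_t n, int dict_size) {
  assert(ids != NULL);
  for (size_t i = 0; i < n; ++i) {
    if (ids[i] < 0 || ids[i] >= dict_size) {
      return 0;
    }
  }
  return 1;
}
```

Here it would be called once per stream with n = 43 and dict_size = 2966.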

How the test data is read in with the C-API:

paddle_arguments in_args = paddle_arguments_create_none();
CHECK(paddle_arguments_resize(in_args, 4));
int array_cur[43];
int array_up[43];
int array_down[43];
int array_vec[490];

...

paddle_ivector cur_stream = paddle_ivector_create(array_cur, sizeof(array_cur) / sizeof(int), false, false); 
paddle_ivector up_stream_1 = paddle_ivector_create(array_up, sizeof(array_up) / sizeof(int), false, false); 
paddle_ivector down_stream_1 = paddle_ivector_create(array_down, sizeof(array_down) / sizeof(int), false, false); 
paddle_ivector vector = paddle_ivector_create(array_vec, sizeof(array_vec) / sizeof(int), false, false); 

CHECK(paddle_arguments_set_value(in_args, 0, cur_stream));
CHECK(paddle_arguments_set_value(in_args, 1, up_stream_1));
CHECK(paddle_arguments_set_value(in_args, 2, down_stream_1));
CHECK(paddle_arguments_set_value(in_args, 3, vector));

paddle_arguments out_args = paddle_arguments_create_none();

CHECK(paddle_gradient_machine_forward(machine,
                                        in_args,
                                        out_args,
                                        false));
paddle_matrix prob = paddle_matrix_create_none();
CHECK(paddle_arguments_get_value(out_args, 0, prob));

lcy-seso commented 6 years ago

Please read this document: http://www.paddlepaddle.org/docs/develop/documentation/zh/howto/capi/organization_of_the_inputs_cn.html . Of the four data layers, the first three are missing sequence information, and the input type of the fourth is wrong.

hyo009 commented 6 years ago

The first three data layers correspond to:

[Screenshot: 2018-02-27 6 55 59]

Following the example, can it be written like this?

int seq_pos_array[] = {0};
paddle_ivector cur_stream = paddle_ivector_create(seq_pos_array, sizeof(seq_pos_array) / sizeof(int), false, false);
paddle_ivector up_stream = paddle_ivector_create(seq_pos_array, sizeof(seq_pos_array) / sizeof(int), false, false);
paddle_ivector down_stream = paddle_ivector_create(seq_pos_array, sizeof(seq_pos_array) / sizeof(int), false, false);

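How do I feed the test data into the data layers? In the original Python model, I put three integer lists of length 43 into three data layers defined like this: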

data_stream = paddle.layer.data(
    name=iter,
    type=paddle.data_type.integer_value_sequence(2966))
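
lcy-seso commented 6 years ago

  1. The sequence start positions are wrong. Please read the section on organizing sequence information in the document http://www.paddlepaddle.org/docs/develop/documentation/zh/howto/capi/organization_of_the_inputs_cn.html
  2. When predicting with the C-API, one data layer corresponds to one argument. If the model has three data layers, create three arguments; if it has one, create one.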
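
The sequence information mentioned in point 1 is a list of offsets into the flattened id array: sequence i occupies positions start[i] up to but not including start[i + 1], so n sequences need n + 1 entries, beginning with 0 and ending with the total number of ids. A minimal sketch in plain C with no Paddle calls; the helper name build_seq_start_pos is hypothetical and not part of the C-API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Paddle C-API): fills `start_pos`
 * with cumulative offsets for `n` sequences of the given lengths.
 * `start_pos` must have room for n + 1 ints.
 * Returns the number of entries written (n + 1). */
size_t build_seq_start_pos(const int *lengths, size_t n, int *start_pos) {
  assert(lengths != NULL && start_pos != NULL);
  start_pos[0] = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(lengths[i] >= 0);
    start_pos[i + 1] = start_pos[i] + lengths[i];
  }
  return n + 1;
}
```

For the single 43-id sequence in this thread, lengths = {43} yields {0, 43}. That array is what would be wrapped in a paddle_ivector and passed to paddle_arguments_set_sequence_start_pos.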

hyo009 commented 6 years ago

I rewrote it following the document:

int STREAM_LEN=43;
int VECTOR_LEN=490;

CHECK(paddle_arguments_resize(in_args, 4));  // 4 arguments, one per data layer

// Read the test data to be fed into the data layers
int array_cur[STREAM_LEN];
int array_up[STREAM_LEN];
int array_down[STREAM_LEN];
float array_vec[VECTOR_LEN];

...

// Prepare the inputs for the first three data layers
int seq_pos_array[] = {0, sizeof(array_cur) / sizeof(int)};
paddle_ivector seq_pos = paddle_ivector_create( seq_pos_array, sizeof(seq_pos_array) / sizeof(int), false, false);

paddle_ivector cur_stream = paddle_ivector_create(array_cur, sizeof(array_cur) / sizeof(int), false, false); 
paddle_ivector up_stream_1 = paddle_ivector_create(array_up, sizeof(array_up) / sizeof(int), false, false); 
paddle_ivector down_stream_1 = paddle_ivector_create(array_down, sizeof(array_down) / sizeof(int), false, false);

CHECK(paddle_arguments_set_sequence_start_pos(in_args, 0, 0, seq_pos));
CHECK(paddle_arguments_set_sequence_start_pos(in_args, 1, 0, seq_pos));
CHECK(paddle_arguments_set_sequence_start_pos(in_args, 2, 0, seq_pos));

CHECK(paddle_arguments_set_value(in_args, 0, cur_stream));
CHECK(paddle_arguments_set_value(in_args, 1, up_stream_1));
CHECK(paddle_arguments_set_value(in_args, 2, down_stream_1));

// Prepare the input for the last data layer
paddle_matrix vector = paddle_matrix_create(1, VECTOR_LEN, false);
paddle_real* arr_vec;
CHECK(paddle_matrix_get_row(vector, 0, &arr_vec));

for (int i = 0; i < VECTOR_LEN; ++i) {
      arr_vec[i] = array_vec[i];
}
CHECK(paddle_arguments_set_value(in_args, 3, vector));

Still the same error:

I0227 11:44:51.334859   603 Util.cpp:166] commandline:  --use_gpu=false 
I0227 11:44:51.338533   603 GradientMachine.cpp:83] Loading parameters from ./model/dnn_params_30min/
F0227 11:44:51.369438   603 TableProjection.cpp:39] Check failed: in_->ids 
*** Check failure stack trace: ***
    @     0x7f163001176d  google::LogMessage::Fail()
    @     0x7f163001521c  google::LogMessage::SendToLog()
    @     0x7f1630011293  google::LogMessage::Flush()
    @     0x7f163001672e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f16300456d5  paddle::TableProjection::forward()
    @     0x7f16300aadb9  paddle::MixedLayer::forward()
    @     0x7f16301754bd  paddle::NeuralNetwork::forward()
    @     0x7f163000d676  paddle_gradient_machine_forward
    @           0x40331f  main
    @     0x7f162f3c1bd5  __libc_start_main
    @           0x401a89  (unknown)
    @              (nil)  (unknown)

luotao1 commented 6 years ago

The first three inputs are one-dimensional integer arrays. According to the document, they must be set with paddle_arguments_set_ids, not paddle_arguments_set_value.
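
The fatal check in the log, `Check failed: in_->ids` in TableProjection.cpp, points the same way: the table projection that consumes an integer sequence looks for ids on its input, and paddle_arguments_set_value does not appear to provide them. The following is a toy model in plain C, not Paddle's implementation; the struct and every function name in it are hypothetical:

```c
#include <stddef.h>

/* Toy stand-in for one input argument, with separate slots for integer
 * ids and for dense float values. Hypothetical, for illustration only. */
typedef struct {
  const int *ids;      /* integer ids, e.g. row indices into a lookup table */
  size_t num_ids;
  const float *value;  /* dense values, e.g. one row of 490 floats */
  size_t num_values;
} toy_argument;

/* Fills only the ids slot. */
void toy_set_ids(toy_argument *arg, const int *ids, size_t n) {
  arg->ids = ids;
  arg->num_ids = n;
}

/* Fills only the value slot; the ids slot is left untouched. */
void toy_set_value(toy_argument *arg, const float *value, size_t n) {
  arg->value = value;
  arg->num_values = n;
}

/* Mimics the failing check: a table lookup needs ids to be present.
 * Returns 1 if the check would pass, 0 if it would fail. */
int toy_table_lookup_ok(const toy_argument *arg) {
  return arg->ids != NULL;
}
```

In this model, an argument filled only through toy_set_value fails the lookup check, which mirrors the behavior reported above when the integer streams were passed with paddle_arguments_set_value.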

hyo009 commented 6 years ago

It works now, thanks.