Closed axiqia closed 5 years ago
int param[] = {1,1,0,1,0,1,1,1,3686400,3686400,3686400,3686400,1,0,0,1,1,1,3686400,3686400,3686400,3686400,1};
std::vector<RealVector> onetest(param, param+23);
This creates a set of 23 vectors, and param[] is interpreted as the sizes of those vectors. Just create a single RealVector instead.
RealVector single(23);
std::copy(param, param+23, single.begin());
unsigned int prediction = model(single);
It does work, thank you so much. It should be pointed out that the model return type is shark::blas::vector&lt;double&gt; in this context.
But I have another question. After many tests, I found that the prediction value is never bigger than 20, so I printed the labels of dataTest: all of them are between 10 and 20, which is really different from the original labels.
What happened?
The following code is written in reference to this.
RegressionDataset data;
importCSV(data, argv[1], FIRST_COLUMN, 1, ',');
cout << "data labels" << endl;
cout << data.labels() << endl;
RegressionDataset dataTest = splitAtElement(data, static_cast<std::size_t>(0.8 * data.numberOfElements()));
cout << "test label" << endl;
cout << dataTest.labels() << endl;
cout << "data_after" << endl;
cout << data.labels() << endl;
// labels of the test data split from the imported dataset
// the first column is the line number, the third column is the label
14235 [1](16.064)
14236 [1](10.816)
14237 [1](11.552)
14238 [1](11.52)
14239 [1](11.392)
14240 [1](13.216)
14241 [1](11.264)
14242 [1](12.288)
14243 [1](12.896)
14244 [1](10.944)
14245 [1](11.488)
14246 [1](12.032)
14247 [1](23.872)
14248 [1](15.2)
14249 [1](16.736)
14250 [1](10.592)
14251 [1](10.912)
14252 [1](12.448)
14253 [1](14.848)
14254 [1](15.936)
14255 [1](16.192)
14256 [1](10.368)
14257 [1](10.592)
14258 [1](12.608)
14259 [1](14.912)
14260 [1](15.168)
14261 [1](15.584)
14262 [1](11.072)
14263 [1](10.752)
14264 [1](13.216)
// labels of the data imported from the csv
17099 [1](49.824)
17100 [1](173.472)
17101 [1](54.784)
17102 [1](49.44)
17103 [1](173.632)
17104 [1](48)
17105 [1](38.368)
17106 [1](93.536)
17107 [1](36.64)
17108 [1](28.512)
17109 [1](89.12)
17110 [1](34.4)
17111 [1](39.328)
17112 [1](89.472)
17113 [1](38.624)
17114 [1](30.496)
17115 [1](90.112)
17116 [1](36.416)
17117 [1](28.64)
17118 [1](89.024)
17119 [1](33.536)
17120 [1](32.768)
17121 [1](89.536)
17122 [1](28.064)
17123 [1](30.752)
17124 [1](58.048)
17125 [1](23.104)
I am really sorry to bother you so many times.
"It needs to be pointed out that the model return type is shark::blas::vector in this context." Yeah, I thought you were doing classification and did not realize you were not using shark 4.0.
Is this now a different issue, just about splitAtElement? I have trouble understanding your printout. Are those supposed to be the same values? You can test this yourself by checking whether the last elements of data before splitting are the same as the elements in test after splitting.
If this is the case, there is no bug in Shark. I would give you the hint to use data.shuffle() before splitting, because your training dataset might have some type of order.
I got it, and it is my fault. Thank you so much :)
I want to use Random Forest Regression to run inference only once, so I tried to construct the data as below, but I get an exception.
Details of the training data: number of data points: 11383, input dimension: 23. Is this the right way to construct data to predict?