I think the `DataSet` class might create a very bad memory layout for the samples. Armadillo uses Fortran-style column-major ordering, but we store the samples in rows of features. These lines in `Dataset.h` indicate that we store the features in columns and the samples in rows, so the matrix looks like this:
| column 1 | column 2 | column 3 | ... | column n |
|---|---|---|---|---|
| feature 1 for sample 1 | feature 2 for sample 1 | feature 3 for sample 1 | ... | feature n for sample 1 |
| feature 1 for sample 2 | feature 2 for sample 2 | feature 3 for sample 2 | ... | feature n for sample 2 |
But Armadillo uses column-major ordering, so the data is stored like this:
feature 1 for sample 1, feature 1 for sample 2, ..., feature 1 for sample N,
feature 2 for sample 1, feature 2 for sample 2, ...
So consecutive features of one sample are not adjacent in memory: they are N elements apart, where N is the number of samples.
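To make the gap concrete, here is a minimal sketch in plain C++ (no Armadillo needed) of the usual column-major offset formula, `row + col * n_rows`. The helper names `column_major_offset` and `feature_stride_samples_as_rows` are made up for illustration and are not part of Armadillo or of this project.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an Armadillo function): linear offset of element
// (row, col) in a column-major matrix with n_rows rows and n_cols columns.
std::size_t column_major_offset(std::size_t row, std::size_t col,
                                std::size_t n_rows, std::size_t n_cols) {
    assert(row < n_rows && col < n_cols);
    return row + col * n_rows;
}

// Distance, in elements, between feature 0 and feature 1 of the same sample
// when samples are rows and features are columns. Requires n_features >= 2.
std::size_t feature_stride_samples_as_rows(std::size_t n_samples,
                                           std::size_t n_features) {
    return column_major_offset(0, 1, n_samples, n_features)
         - column_major_offset(0, 0, n_samples, n_features);
}
```

With 1000 samples, `feature_stride_samples_as_rows(1000, 3)` gives 1000, so reading all features of one sample jumps 1000 elements at every step.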
At this line the Armadillo matrix is transposed, but according to the documentation `.t()` returns a transposed copy, which would then also have the wrong row/column ordering?
Or am I missing something here? Can someone explain?
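For reference, this is a small plain C++ sketch of what a transposed copy looks like in memory, assuming the copy is itself stored column-major (as Armadillo matrices are). `transposed_copy` is a hypothetical helper written for this post, not Armadillo's `.t()`, and I have not checked whether the code at the linked line actually relies on this layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: takes a column-major (n_rows x n_cols) buffer and
// returns a new column-major buffer holding the (n_cols x n_rows) transpose.
std::vector<double> transposed_copy(const std::vector<double>& src,
                                    std::size_t n_rows, std::size_t n_cols) {
    assert(src.size() == n_rows * n_cols);
    std::vector<double> dst(src.size());
    for (std::size_t c = 0; c < n_cols; ++c) {
        for (std::size_t r = 0; r < n_rows; ++r) {
            // Element (r, c) of src becomes element (c, r) of dst.
            // dst has n_cols rows, so its column-major offset is c + r * n_cols.
            dst[c + r * n_cols] = src[r + c * n_rows];
        }
    }
    return dst;
}
```

For a 2 samples x 3 features input stored as `{0, 10, 1, 11, 2, 12}` (value = 10 * sample + feature), the transposed buffer is `{0, 1, 2, 10, 11, 12}`, so in the copy the features of each sample sit next to each other.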