Open behrica opened 1 month ago
These don't work either:
(r.base/matrix (r/clj->r tensor))
(r.base/matrix tensor)
For reference, it does work like this:
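(require '[tech.v3.tensor :as tens]
         '[tech.v3.datatype :as dtt])

;; tensor->buffer yields the elements in row-major order, while R fills a
;; matrix column by column. So build the transposed matrix first
;; (nrow = tensor columns, ncol = tensor rows), then transpose it back.
;; With nrow/ncol the other way round this is only correct for square tensors.
(-> tensor
    tens/tensor->buffer
    (r.base/matrix
     :nrow (second (dtt/shape tensor))
     :ncol (first (dtt/shape tensor)))
    r.base/t)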
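The final transpose in this workaround is needed because of an ordering mismatch: the tensor buffer is row-major, while R's matrix() fills column by column. The following is a minimal sketch of that ordering in plain Clojure, with no R involved. `fill-column-major` and `transpose` are hypothetical helpers written for illustration only; they are not part of clojisr or dtype-next.

```clojure
;; Hypothetical helper: mimics how R's matrix(data, nrow, ncol) lays out a
;; flat vector, i.e. column by column. Returns a vector of rows.
(defn fill-column-major [data nrow ncol]
  (vec (for [i (range nrow)]
         (vec (for [j (range ncol)]
                (nth data (+ i (* j nrow))))))))

;; Hypothetical helper: plain transpose of a vector of rows.
(defn transpose [rows]
  (apply mapv vector rows))

;; A 2x3 tensor [[0 1 2] [3 4 5]] flattened in row-major order:
(def flat [0 1 2 3 4 5])

;; Filling a 3x2 matrix column by column and transposing restores the rows.
(transpose (fill-column-major flat 3 2))
;; => [[0 1 2] [3 4 5]]

;; Filling a 2x3 matrix instead scrambles a non-square tensor.
(transpose (fill-column-major flat 2 3))
;; => [[0 1] [2 3] [4 5]]
```

Note that the intermediate matrix takes the tensor's column count as its number of rows. For a square tensor the two counts coincide, so the distinction is easy to miss.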
Some remarks:
Transfer from Clojure to R should go through the Java Rserve library structures, which I believe is the optimal route. Here is how it's done for TMD: https://github.com/scicloj/clojisr/blob/master/src/clojisr/v1/impl/clj_to_java.clj#L32-L48
Possibly it can be done similarly for tensors as well.
Tensors in R can be represented as multidimensional arrays, not matrices. Here is something done in the past (a transfer of flat data into a 5D array): https://scicloj.github.io/clojisr/clojisr.v1.tutorials.dataset.html#matrices-arrays-multidimensional-arrays
Multidimensional arrays / tables in R are represented as a flattened dataset on the Clojure side, like this 3D table: https://scicloj.github.io/clojisr/clojisr.v1.tutorials.dataset.html#table
OK. I learned that dtype tensors can indeed be represented in R as either a matrix or an array.
Calling `class` on a 3D array in R gives:
> class(array(1:(3 * 4 * 5),dim=(c(3,4,5))))
[1] "array"
>
while on "matrix" it gives:
> class(matrix(c(1,2,3,4)))
[1] "matrix" "array"
Using "array" on 2D data gives a matrix as well:
> class(array(1:(3 * 4),dim=(c(3,4))))
[1] "matrix" "array"
Take a look at this line and the code below it, which converts a multidimensional structure to a flattened dataset. We can add another path to create tensors out of arrays. https://github.com/scicloj/clojisr/blob/master/src/clojisr/v1/impl/java_to_clj.clj#L94
Yes, will do. To me this is especially unexpected and could be improved by returning a proper tensor:
(-> (r.base/array (range (* 3 4 5)) :dim [3 4 5])
    (r/r->clj))
;; => _unnamed [15 5]:
;;
;; | :$col-0 | 1 | 2 | 3 | 4 |
;; |--------:|---:|---:|---:|---:|
;; | 1 | 0 | 3 | 6 | 9 |
;; | 1 | 1 | 4 | 7 | 10 |
;; | 1 | 2 | 5 | 8 | 11 |
;; | 2 | 12 | 15 | 18 | 21 |
;; | 2 | 13 | 16 | 19 | 22 |
;; | 2 | 14 | 17 | 20 | 23 |
;; | 3 | 24 | 27 | 30 | 33 |
;; | 3 | 25 | 28 | 31 | 34 |
;; | 3 | 26 | 29 | 32 | 35 |
;; | 4 | 36 | 39 | 42 | 45 |
;; | 4 | 37 | 40 | 43 | 46 |
;; | 4 | 38 | 41 | 44 | 47 |
;; | 5 | 48 | 51 | 54 | 57 |
;; | 5 | 49 | 52 | 55 | 58 |
;; | 5 | 50 | 53 | 56 | 59 |
It represents an R 3D array as a 2D data frame (with an extra column per dimension).
Yes, that was the idea: to turn any nd-array into a 2D dataset. I know this is not a perfect solution. At that time tensors weren't available (or I was not aware of them).
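To illustrate what returning an nd structure instead of a flattened dataset could look like, here is a minimal sketch in plain Clojure. It assumes the R array arrives as a flat sequence in R's column-major order plus a `dims` vector. `column-major->nested` is a hypothetical helper, not clojisr code, and it builds nested vectors only to stay dependency-free; a real conversion path would presumably produce a tech.v3.tensor instead.

```clojure
(defn column-major->nested
  "Hypothetical helper: rebuilds nested vectors from `flat` data laid out in
  R's column-major order, so that (get-in result [i j k]) is the element at
  0-based position [i j k]."
  [flat dims]
  (let [flat    (vec flat)
        ;; Column-major strides: the first index varies fastest.
        strides (reductions * 1 (butlast dims))]
    (letfn [(build [prefix remaining]
              (if (empty? remaining)
                ;; All indices fixed: look up the flat offset.
                (nth flat (reduce + (map * prefix strides)))
                (mapv #(build (conj prefix %) (rest remaining))
                      (range (first remaining)))))]
      (build [] dims))))

;; Same data as the 3x4x5 array shown in the dataset printout.
(def arr (column-major->nested (range (* 3 4 5)) [3 4 5]))

(get-in arr [0 1 0]) ;; => 3
(get-in arr [1 0 1]) ;; => 13
(get-in arr [2 3 4]) ;; => 59
```

These values match the flattened dataset: for example, 13 sits in the second row of the block where :$col-0 is 2, under column 1.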
I see. I started a discussion on Zulip; let's continue there.
It would be nice if this worked:
and then
(It does something, but not the right thing)
or, even better, if this did the right thing:
I think we have similar special handling for tech.v3.datasets. Maybe we should do the same for tech.v3.tensor.