mikera / core.matrix

core.matrix : Multi-dimensional array programming API for Clojure

pm, get-shape are confused by cells containing sequences in NDArray #361

Open mars0i opened 1 year ago

mars0i commented 1 year ago

I don't know whether this is a problem worth solving--I think it's probably not--but I thought it couldn't hurt to report it.

Use case: I want to use matrices to store non-numeric data representing elements in a spatial field. There will be hundreds of thousands of locations represented--maybe more--and I need to be able to index into the locations very quickly. I've been using core.matrix NDArrays for this purpose. In particular, I'm storing sequences of pairs of numbers in the matrix cells. Currently, I initialize each cell to nil, and then conj new data (a Clojure vector pair) onto whatever's in the cell. So each cell contains either nil or a clojure.lang.PersistentList.

To examine the effects of my code, I'm printing the matrices using pm. The problem is that when there is a PersistentList in a cell in the first column of the matrix, I get an error:

  (require '[clojure.core.matrix :as mx])
  (def m (mx/matrix :ndarray [[nil nil nil] [nil nil nil] [nil nil nil]]))
  (mx/mset! m 0 1 (list :a))
  (mx/pm m) ; succeeds
  (mx/mset! m 0 0 (list :a))
  (mx/shape m) ; => [3 3]
  (mx/pm m)
; eval (effective-root-form): (mx/pm m)
; (err) Execution error (ExceptionInfo) at clojure.core.matrix.impl.persistent-vector/eval22141$fn (persistent_vector.cljc:571).
; (err) Can't convert to persistent vector array: inconsistent shape.

  (mx/mset! m 0 0 nil) ; This fixes the problem

  ;; These also generate the error on a subsequent (mx/pm m):
  (mx/mset! m 1 0 (range 1))
  (mx/mset! m 2 0 [:a])

  ;; These don't cause problems:
  (mx/mset! m 0 0 #{:a})
  (mx/mset! m 0 0 {:a 1})

I'm not sure I understand why this is happening. I think it occurs because get-shape recurses into the matrix and uses nth to figure out its dimensionality: https://github.com/mikera/core.matrix/blob/develop/src/main/clojure/clojure/core/matrix/impl/persistent_vector.cljc#L540 . nth doesn't work on sets and maps, which would explain why cells containing those don't trigger the error.
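To make that guess concrete, here is a toy model written with clojure.core only. It is not the code in persistent_vector.cljc, and the names shape-by-first and consistent-rows? are hypothetical. It assumes that a row's shape is guessed by descending through first elements with nth, and that conversion fails when the rows disagree. Under those assumptions it reproduces the pattern above: a sequence in the first column breaks things, while a sequence elsewhere, or a set or map anywhere, does not.

```clojure
;; Toy model only: plain clojure.core, no core.matrix calls.
;; The real get-shape may work differently.

(defn shape-by-first
  "Guess a shape by descending through first elements via nth.
  Anything that is not sequential (nil, keywords, sets, maps)
  counts as a scalar cell with shape []."
  [x]
  (cond
    (not (sequential? x)) []
    (empty? x)            [0]
    :else (into [(count x)] (shape-by-first (nth x 0)))))

(defn consistent-rows?
  "True when every top-level row reports the same guessed shape."
  [rows]
  (let [shapes (map shape-by-first rows)]
    (or (empty? shapes) (apply = shapes))))

;; A list in column 1 is never inspected, so the rows still agree.
(consistent-rows? [[nil (list :a) nil] [nil nil nil] [nil nil nil]])

;; A list in column 0 adds a guessed dimension to that row only.
(consistent-rows? [[(list :a) nil nil] [nil nil nil] [nil nil nil]])

;; A set in column 0 is not sequential, so it stays a scalar cell.
(consistent-rows? [[#{:a} nil nil] [nil nil nil] [nil nil nil]])
```

If the real code behaves roughly like this, the fix would presumably involve treating the declared shape of the NDArray as authoritative instead of re-deriving it from cell contents, but I haven't checked.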

I know that my use case is unusual, and I have a workaround: I can simply initialize the cells with empty sets, and my problem is solved. (pm's output is pretty ugly that way, but I can convert the empty set cells to something else before printing with pm.)

pm is only important in this case for debugging and development, anyway. At a later stage the contents of the matrix will be displayed using plotting functions.