Closed behrica closed 3 years ago
I have a fix and could do a PR.,
Now a some tests fail, as they assert the "wrong" behaviour. Like: https://github.com/zero-one-group/geni/blob/3ea740d3c863530439b520a67c5dddea39108a6f/test/zero_one/geni/ml_test.clj#L721
I needed to fix them as well.
Hi @behrica, yes, I remember implementing this and I did cut some corners. A PR to fix it would be great! Thank you!!
I have a problem with one test, which fails: In this test https://github.com/zero-one-group/geni/blob/d0bc4e734d88cda0d288eb25df92683c6313d8de/test/zero_one/geni/data_sources_test.clj#L235
The expected and found sparse vectors differ in the "size" (780 vs 692) The indices and values seem to be identical, so technical the sparse vector are the same, correct ?
But maybe there is a bug hidden somewhere.
Diffs: in [:features :size] expected 692, was 780
So this definitely passes:
(map #(get-in % [:features :indices]) (g/collect (libsvm-df))) => (map #(get-in % [:features :indices]) (g/collect read-df))
(map #(get-in % [:features :values]) (g/collect (libsvm-df))) => (map #(get-in % [:features :values]) (g/collect read-df))
But why are the sizes different, I don't know.
Can be closed.
a SparseVector has three pieces of information:
Current code here: https://github.com/zero-one-group/geni/blob/5b570b5db4413ea8152232b66e9aba7905cf4273/src/clojure/zero_one/geni/interop.clj#L142
only collects values, but we need all three:
Maybe a map with all three should be returned instead