SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License
1.03k stars 139 forks source link

Vector to vector binary operations cannot be done if the indexes contain both strings and symbols #501

Open thomasnaude opened 5 years ago

thomasnaude commented 5 years ago
/Users/tom/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/maths/arithmetic/vector.rb:95:in `sort': comparison of Symbol with String failed (ArgumentError)
from /Users/tom/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/maths/arithmetic/vector.rb:95:in `v2v_binary'
from /Users/tom/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/maths/arithmetic/vector.rb:82:in `binary_op'
from /Users/tom/.rbenv/versions/2.4.3/lib/ruby/gems/2.4.0/gems/daru-0.2.1/lib/daru/maths/arithmetic/vector.rb:18:in `/'

to reproduce :

first_vector = Daru::Vector.new([1, 2, 3], index: [:one, 'two', 'three'])
second_vector = Daru::Vector.new([1, 2, 3], index: [:one, 'two', 'three'])

first_vector / second_vector
Shekharrajak commented 5 years ago

Thanks @thomasnaude for pointing it out this issue. I think, to resolve this issue one has to look into, why sorting is done in binary operation of vectors here .

thomasnaude commented 5 years ago

You are right @Shekharrajak, that is definitely the problem. I looked into it and could not figure out why the sorting is done. But that certainly does not mean it is not needed.....

cyrillefr commented 4 years ago

Hello, @Shekharrajak , I looked at the code. In order to keep the sorting, we could add a simple:sort_by(&:to_s). By doing that, we ensure a sorting by string, and so the call succeed. I did a little test ans it works fine.

Shekharrajak commented 4 years ago

Sorry for late response. That's great @cyrillefr ,PR is welcome.

cyrillefr commented 3 years ago

Hello @Shekharrajak, @thomasnaude

Sorry for the delay.

I did not test the whole suite of specs and the simple trick I mentioned leads to fail another test. That one : it_should_behave_like correct macd in spec/maths/statistics/vector_spec.rb @ line 735

The macd method (in lib/daru/maths/statistics/vector) that in turns tries to: add ema(fast) - ema(slow) So method v2v_binary operation, other, opts={} is called and index is sorted. BUT, if sorted like that(sort_by(&:to_s)), it is not equal to the numeric sort, and (I don't know why) in turns, there are problems in the ema method at this line: start = @data.index { |i| !i.nil? }

Why ? Because with a sort(numeric in the test), all the nils are padded to right, and start is well defined. BUT, with a to_s sort, nils are spread everywhere in the array and a nil error is raised. There is a #FIXME line in the v2v_binary method about "why the sort?" I think I maybe have found out why the sort.

Nevertheless, I needed a sort_by(&:to_s), but ONLY when needed, so I had the idea to add a utility method, that would first try to sort, and only upon fail would string-sort. And, all the tests pass(including one previously pending).