Codewars: Soundex - Githubissues

jransome commented 7 years ago

Hi all, Antoine and I are having an issue trying to replace the first character of a word in an array of words. After saving the first character in first_letter, we are now trying to reassign it using word[0] = first_letter (after substituting certain characters with numbers and removing adjacent duplicate numbers), but this is instead replacing the entire word with just the first_letter character. Any ideas on what's going wrong? Thanks!

def soundex(names)
  words = names.split(' ')
  words.map! do |word|
    first_letter = word[0]
    word.downcase!
    word.tr!('hw','')
    word.tr!('bfpv','1')
    word.tr!('cgjkqsxz','2')
    word.tr!('dt','3')
    word.tr!('l','4')
    word.tr!('mn','5')
    word.tr!('r','6')
    word.tr!('aeiouy','')
    word = word.chars.select.with_index{|c, i| c != word[i+1] }.join
    word[0] = first_letter
  end
end

paulmillen commented 7 years ago

It looks like you're replacing word with a single character when you're using select. After this, word will only be that one character which will be overwritten by the first_letter.

oleglukyanov commented 7 years ago

In my experience .map can behave a bit oddly when you treat it like .each, e.g. if you start doing something inside it that isn't mapping.

oleglukyanov commented 7 years ago

To illustrate the point:

arr = ['Alex','Nina','Oleg']

arr.map! do |name|
  name[0] = 'A'
end

p arr # ["A", "A", "A"]

arr = ['Alex','Nina','Oleg']

arr.each_index do |i|
  arr[i][0] = 'A'
end

p arr # ["Alex", "Aina", "Aleg"]

oleglukyanov commented 7 years ago

... it would also work with .map if we modify the first example this way:

arr = ['Alex','Nina','Oleg']

arr.map! do |name|
  name[0] = 'A'
  name
end

p arr # ["Alex", "Aina", "Aleg"]

So in your code you can do:

def soundex(names)
  words = names.split(' ')
  words.map! do |word|
    first_letter = word[0]
    word.downcase!
    word.tr!('hw','')
    word.tr!('bfpv','1')
    word.tr!('cgjkqsxz','2')
    word.tr!('dt','3')
    word.tr!('l','4')
    word.tr!('mn','5')
    word.tr!('r','6')
    word.tr!('aeiouy','')
    word = word.chars.select.with_index{|c, i| c != word[i+1] }.join
    word[0] = first_letter
    word
  end
end

And it should work unless there are other bugs ;)
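For what it's worth, here is a minimal sketch of the same steps written without in-place mutation, so the value of the block is explicit. soundex_sketch is a made-up name, and delete and squeeze are my stand-ins for the empty tr! calls and the select.with_index dedupe. It keeps the substitution logic from above as-is, so it makes no attempt to fix any other Soundex bugs or to pad the codes:

```ruby
# Hypothetical helper: same substitution steps as the code above,
# but using non-bang methods so nothing is mutated in place.
def soundex_sketch(names)
  names.split(' ').map do |word|
    first_letter = word[0]
    digits = word.downcase
                 .delete('hw')
                 .tr('bfpv', '1')
                 .tr('cgjkqsxz', '2')
                 .tr('dt', '3')
                 .tr('l', '4')
                 .tr('mn', '5')
                 .tr('r', '6')
                 .delete('aeiouy')
                 .squeeze
    # The last expression is the block's value, so map collects this string.
    # Like word[0] = first_letter above, it swaps out the first remaining character.
    first_letter + digits[1..-1].to_s
  end
end

p soundex_sketch('Sarah Connor') # ["S6", "C56"]
```

Because the last line of the block builds the string that should end up in the array, there is no need for a trailing word line.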

jransome commented 7 years ago

Thanks both! I was half suspecting it had something to do with the use of map! vs each. I tried adding word to the last line of the block and it works, but I'm not sure why - why would doing that alter the value of word? Is it treated as a return value where the last expression is returned?

oleglukyanov commented 7 years ago

That's a very good question @jransome. It seems map and collect are the same. But map and map! are not.

Honestly, your question opened up several unexpected and puzzling things for me. Here are some more experiments, no conclusions:

arr = ['Alex','Nina','Oleg']
arr.map! do |name|
  'Ann'
end
p arr # => ["Ann", "Ann", "Ann"] – Expected

arr = ['Alex','Nina','Oleg']
arr.map! do |name|
  name[0] = 'a'
end
p arr # => ["a", "a", "a"] – Unexpected

arr = ['Alex','Nina','Oleg']
arr.map do |name|
  name = 'Ann'
end
p arr # => ["Alex", "Nina", "Oleg"] – Expected

arr = ['Alex','Nina','Oleg']
arr.map do |name|
  name[0] = 'a'
end
p arr # => ["alex", "aina", "aleg"] – Very much unexpected

aballal commented 7 years ago

Each returns the array it works on while map returns the results array.

In your scenario (2) the result of map is ["a","a","a"] and since it is being stored back into arr by use of map!, that's what you get as arr.

In scenario (4) each array element gets modified because of name[0] = 'a' and the value becomes ["alex","aina","aleg"]. This is regular string operation and has nothing to do with map. Map returns ["a","a","a"] but that isn't used anywhere and you aren't using ! so arr doesn't get overwritten either.

I find it easier to not use map when I don't need the resulting array, and I use each instead in those scenarios.

Please run the code below and notice the difference in output:

arr = ['Alex','Nina','Oleg']
p arr
new_arr = arr.map do |name|
  name[0] = 'a'
end
p arr
p new_arr

arr = ['Alex','Nina','Oleg']
p arr
new_arr = arr.each do |name|
  name[0] = 'a'
end
p arr
p new_arr

Output:

["Alex", "Nina", "Oleg"]
["alex", "aina", "aleg"]
["a", "a", "a"]
["Alex", "Nina", "Oleg"]
["alex", "aina", "aleg"]
["alex", "aina", "aleg"]

oleglukyanov commented 7 years ago

Hi @aballal, thanks for bringing in some clarity! I totally admit it's much more likely about me being stupid rather than about Ruby issues, but I still can't get an understanding of fundamentals behind that.

Another experiment, now just map alone:

arr = ['Foo','Bar','Tango']
arr.map do |name|
  name[0] = 'A'
end
p arr # => ["Aoo", "Aar", "Aango"]

arr = ['Foo','Bar','Tango']
arr.map do |name|
  name = 'A'
end
p arr # => ["Foo", "Bar", "Tango"]

Can you tell why that is? How is name[0] = 'A' inside the block fundamentally different from name = 'A'? Aren't they both just regular string operations?

aballal commented 7 years ago

Wow, that's an excellent catch! I have always been under the impression that the piped in element and the ith element of the array are the same objects. I printed the object ids and that seems to agree with my assumption.

arr = ['Foo','Bar','Tango']
i = 0
arr.map do |name|
  puts "Arr[i]  : #{arr[i].object_id} #{arr[i]}"
  puts "Name    : #{name.object_id} #{name}"
  name = 'A'
  #arr[i] = 'A'
  i= i+1
end
p arr

Output:

Arr[i]  : 70228042449120 Foo
Name    : 70228042449120 Foo
Arr[i]  : 70228042449100 Bar
Name    : 70228042449100 Bar
Arr[i]  : 70228042449080 Tango
Name    : 70228042449080 Tango
["Foo", "Bar", "Tango"]

Now comment out name = 'A' and uncomment arr[i] = 'A' and the result is ["A", "A", "A"]!

I can't understand why that is happening, because to me ["A", "A", "A"] seems like the expected output in both scenarios. Maybe it has something to do with the fact that []= is an exception and modifies the receiver (similar to !), but I still can't get over the fact that the object ids are the same and yet the results are different.

jransome commented 7 years ago

Perhaps it's because the name variable simply points to the object at Arr[i], and assigning it a different value therefore doesn't affect the value of Arr[i] ?

ie.

arr = ['Foo','Bar','Tango']
i = 0
new_arr = arr.map do |name|
  puts "Arr[i]  : #{arr[i].object_id} #{arr[i]}"
  puts "Name    : #{name.object_id} #{name}"
  name = 'A'
  #arr[i] = 'A'
  puts "Arr[i]  : #{arr[i].object_id} #{arr[i]}"
  puts "Name    : #{name.object_id} #{name}"
  puts
  i= i+1
end
p arr
p new_arr

results in:

Arr[i]  : 25670568 Foo
Name    : 25670568 Foo
Arr[i]  : 25670568 Foo
Name    : 25670424 A

Arr[i]  : 25670556 Bar
Name    : 25670556 Bar
Arr[i]  : 25670556 Bar
Name    : 25670292 A

Arr[i]  : 25670544 Tango
Name    : 25670544 Tango
Arr[i]  : 25670544 Tango
Name    : 25670160 A

["Foo", "Bar", "Tango"]
[1, 2, 3]

Also isn't it odd that name is set to equal the result of the last expression evaluated without an explicit assignment?

dearshrewdwit commented 7 years ago

Really interesting conversation! See what you make about this: http://mattcampbell.nyc/2013/10/11/what-exactly-is-between-those-pipes/

oleglukyanov commented 7 years ago

@dearshrewdwit thanks for joining our chat :-) And for the link. I've got it now! Previously I ignored the fact that everything, including assignment, has its own return value. And this was the key reason for the confusion in my case.
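A small sketch of that point, using made-up variable names: an element assignment like name[0] = 'A' is an expression whose value is the assigned string, and a block hands back whatever its last expression evaluates to.

```ruby
name = 'Nina'
result = (name[0] = 'A') # the assignment expression evaluates to 'A'
p result # "A"
p name   # "Aina"

# map collects the value of the last expression in the block.
with_assignment = ['Alex', 'Nina'].map { |n| n[0] = 'A' }
with_explicit   = ['Alex', 'Nina'].map { |n| n[0] = 'A'; n }

p with_assignment # ["A", "A"]
p with_explicit   # ["Alex", "Aina"]
```

That would be why adding word as the last line of the block changes what map! stores.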

oleglukyanov commented 7 years ago

... as for the difference between name[0] = 'A' and name = 'A' in the block, it seems that name[0] = 'A' modifies the original object while name = 'A' creates a new one:

name = 'Mail'
puts "Hi, I'm #{name}, my id: #{name.object_id}"
name[0] = 'F'
puts "Hi, I'm #{name}, my id: #{name.object_id}"
name = 'Bail'
puts "Hi, I'm #{name}, my id: #{name.object_id}"

# => Hi, I'm Mail, my id: 70281926577500
# => Hi, I'm Fail, my id: 70281926577500
# => Hi, I'm Bail, my id: 70281926577340

So it's just good to know. Also the .replace method now makes much more sense to me.
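To make the .replace remark concrete, here is a short sketch (arr, other and ids_before are just illustrative names): String#replace swaps the contents of the existing string object, so the array sees the change, while plain assignment only rebinds the block variable.

```ruby
arr = ['Foo', 'Bar', 'Tango']
ids_before = arr.map(&:object_id)

# replace changes each string in place; the objects stay the same.
arr.each { |name| name.replace('A') }
p arr                                 # ["A", "A", "A"]
p arr.map(&:object_id) == ids_before  # true

# Assignment only points the local variable at a new string.
other = ['Foo', 'Bar', 'Tango']
other.each { |name| name = 'A' }
p other                               # ["Foo", "Bar", "Tango"]
```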

jransome commented 7 years ago

Hmm that's what I've found as well...

names1 = ['Foo','Bar','Tango']
names2 = names1
names3 = names2

puts "names1 = #{names1}, #{names1.object_id}"
puts "names2 = #{names2}, #{names2.object_id}"
puts "names3 = #{names3}, #{names3.object_id}"

names3[0] = "Woo"

puts "names1 = #{names1}, #{names1.object_id}"
puts "names2 = #{names2}, #{names2.object_id}"
puts "names3 = #{names3}, #{names3.object_id}"

# => names1 = ["Foo", "Bar", "Tango"], 25702872
# => names2 = ["Foo", "Bar", "Tango"], 25702872
# => names3 = ["Foo", "Bar", "Tango"], 25702872
# => names1 = ["Woo", "Bar", "Tango"], 25702872
# => names2 = ["Woo", "Bar", "Tango"], 25702872
# => names3 = ["Woo", "Bar", "Tango"], 25702872

So names3[0] = "Woo" changes not only names3 but names2 and names1 as well, whereas names3 = ['Woo','Bar','Tango'] only changes names3, which seems counter-intuitive. However, according to this (under "Classes, Objects, and Variables") this doesn't create as many problems as I thought it would. Calling .dup makes Ruby duplicate an object for a new variable, e.g. names2 = names1.dup, and would be a way to get around this. It's something which seems to be called automatically behind the scenes when simply assigning a variable a new value, as with names3 = ['Woo','Bar','Tango']

oleglukyanov commented 7 years ago

@jransome that's actually quite predictable because enumerables in Ruby are all reference types, which means names1, names2 and names3 in your code are basically three pointers all pointing to the same area in memory where the actual data is.

names2 = names1 # Creates another reference to the existing data.
names2 = Array.new(names1) # Creates a new array holding the same elements (a shallow copy).
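One caveat worth sketching, as far as I understand it: both .dup and Array.new(names1) copy the array but not the strings inside it, so mutating an element still shows up in both arrays. The names below (names2, deep) are only for illustration.

```ruby
names1 = ['Foo', 'Bar', 'Tango']
names2 = names1.dup # new array object, same string objects inside

names2[0] = 'Woo'   # replaces a slot in names2 only
p names1 # ["Foo", "Bar", "Tango"]
p names2 # ["Woo", "Bar", "Tango"]

names2[1][0] = 'C'  # mutates the string that both arrays share
p names1 # ["Foo", "Car", "Tango"]
p names2 # ["Woo", "Car", "Tango"]

deep = names1.map(&:dup) # copy each string as well
deep[2][0] = 'M'
p names1 # ["Foo", "Car", "Tango"]
p deep   # ["Foo", "Car", "Mango"]
```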

aballal commented 7 years ago

@jransome - My cause for confusion and your cause for confusion are possibly the same. I was treating a variable to be something that points to a memory location and hence expected assignment operation to change the value of that memory location. However, in Ruby a variable points to an object. A new assignment simply makes it point to a new object.

names1 = ['Foo','Bar','Tango']
# names1 points to the object ['Foo','Bar','Tango']
names2 = names1
# We try pointing names2 to names1, but that in turn points it to the object ['Foo','Bar','Tango']
names3 = names2
# We try pointing names3 to names2, but that in turn points it to the object ['Foo','Bar','Tango']
names3[0] = "Woo"
# This changes the object ['Foo','Bar','Tango'] itself, as '[]=' modifies the receiver, which in this case is the object ['Foo','Bar','Tango']
names3 = ['Woo','Bar','Tango']
# This, on the other hand, creates a new object ['Woo','Bar','Tango'] and names3 points to it
# names1 and names2 still point to the first object (now ['Woo','Bar','Tango'] after the change above)

I think my confusion came from a C programming kind of mindset where each variable has its own memory location (so an assignment changes the data in that location). But after this issue, and after trying something similar in C and seeing the difference, that chapter from Chris Pine's book (Variables and assignment) seems to make complete sense.

jransome commented 7 years ago

I think you're right about us having the same preconception, @aballal . Anyway, thank you all for your contributions, I think it's helped to clarify a few things!

makersacademy / problem-solving

Codewars: Soundex #93