ruby / csv

CSV Reading and Writing
https://ruby.github.io/csv/
BSD 2-Clause "Simplified" License
178 stars 113 forks source link

`CSV::Row` pattern matching `Symbol` assumption #246

Open baweaver opened 2 years ago

baweaver commented 2 years ago

There's an assumption made in the pattern matching code which will make it very hard to use, and that's the presumption of Symbol keys on CSV::Row.

If you were to make a PM interface and look at the keys:

class Testing
  def deconstruct_keys(keys)
    p(keys:)
    {}
  end
end
# => :deconstruct_keys

Testing.new in a: 1, b: 2, c: 3
# {:keys=>[:a, :b, :c]}
# => false

You will receive Symbol keys. Why is this a problem? Because if you are loading CSV data it will most likely be from a file or String which we can take directly from the documentation:

# Headers are part of data
data = CSV.parse(<<~ROWS, headers: true)
  Name,Department,Salary
  Bob,Engineering,1000
  Jane,Sales,2000
  John,Management,5000
ROWS

data.class
#=> CSV::Table

data.first
#=> #<CSV::Row "Name":"Bob" "Department":"Engineering" "Salary":"1000">

If we were to test this against pattern matching (and adding a keys debugging statement):

data.select { _1 in Name: /^J/ }
{:keys=>[:Name]}
{:keys=>[:Name]}
{:keys=>[:Name]}
 => []

As it exists today this feature cannot be used except for manually created data like in the tests which will not be the experience of end users who either parse from a String or File.

I would like to propose that we modify the method to treat the keys as keyword arguments, rather than as Symbols, which will allow this feature to be used for common usecases:

class CSV::Row
  def deconstruct_keys(keys)
    if keys.nil?
      to_h.transform_keys(&:to_sym)
    else
      keys.to_h do |key|
        value = if self.key?(key)
          self[key]
        elsif self.key?(key.to_s)
          self[key.to_s]
        else
          nil
        end

        [key, value]
      end
    end
  end
end

This does, in a sense, conflate Symbol and String keys, but String keys are impossible in pattern matching:

CSV::Row.new(%w(A B), [1, 2], true) in "A" => 1
/Users/lemur/.rvm/rubies/ruby-3.1.2/lib/ruby/3.1.0/irb/workspace.rb:119:in `eval': (irb):14: syntax error, unexpected integer literal, expecting local variable or method (SyntaxError)
...A B), [1, 2], true) in "A" => 1

So the options are either have an interface which will only work with test data and users who manually create Symbol keys (which is a minority case), or consider a keyword-like approach that can handle the ambiguity of keys types.

I believe this is a reasonable assumption of coercion to make, as Ruby has done this in other cases with converting between Symbol and String whenever it is considered reasonable input from the user. I believe this would qualify as reasonable user input.

If this is of interest I can submit a PR, but wanted to run the idea by people first.

kou commented 2 years ago

Users can use header_converters: :symbol:

require "csv"

data = CSV.parse(<<~ROWS, headers: true, header_converters: :symbol)
  Name,Department,Salary
  Bob,Engineering,1000
  Jane,Sales,2000
  John,Management,5000
ROWS
pp data.select { _1 in name: /^J/ }
[#<CSV::Row name:"Jane" department:"Sales" salary:"2000">,
 #<CSV::Row name:"John" department:"Management" salary:"5000">]