Roo implements read access for all common spreadsheet types. It can handle:
Install as a gem
$ gem install roo
Or add it to your Gemfile
gem "roo", "~> 2.10.0"
You can use the Roo::Spreadsheet
class so roo
automatically detects which parser class to use for you.
require 'roo'
file_name = './new_prices.xlsx'
xlsx = Roo::Spreadsheet.open(file_name)
xlsx.info
# => Returns basic info about the spreadsheet file
Roo::Spreadsheet.open
can accept both string paths and File
instances. Also, you can provide the extension of the file as an option:
require 'roo'
file_name = './rails_temp_upload'
xlsx = Roo::Spreadsheet.open(file_name, extension: :xlsx)
xlsx.info
# => Returns basic info about the spreadsheet file
On the other hand, if you know what the file extension is, you can use the specific parser class instead:
require 'roo'
xlsx = Roo::Excelx.new("./new_prices.xlsx")
xlsx.info
# => Returns basic info about the spreadsheet file
ods.sheets
# => ['Info', 'Sheet 2', 'Sheet 3'] # an Array of sheet names in the workbook
ods.sheet('Info').row(1)
ods.sheet(0).row(1)
# Set the last sheet as the default sheet.
ods.default_sheet = ods.sheets.last
ods.default_sheet = ods.sheets[2]
ods.default_sheet = 'Sheet 3'
# Iterate through each sheet
ods.each_with_pagename do |name, sheet|
p sheet.row(1)
end
Roo uses Excel's numbering for rows, columns and cells, so 1
is the first index, not 0
as it is in an Array
sheet.row(1)
# returns the first row of the spreadsheet.
sheet.column(1)
# returns the first column of the spreadsheet.
Almost all methods have an optional argument sheet
. If this parameter is omitted, the default_sheet will be used.
sheet.first_row(sheet.sheets[0])
# => 1 # the number of the first row
sheet.last_row
# => 42 # the number of the last row
sheet.first_column
# => 1 # the number of the first column
sheet.last_column
# => 10 # the number of the last column
You can access the top-left cell in the following ways
sheet.cell(1,1)
sheet.cell('A',1)
sheet.cell(1,'A')
sheet.a1
# Access the second sheet's top-left cell.
sheet.cell(1,'A',sheet.sheets[1])
Use each
to iterate over each row.
If each is given a hash with the names of some columns, then each will generate a hash with the columns supplied for each row.
sheet.each(id: 'ID', name: 'FULL_NAME') do |hash|
puts hash.inspect
# => { id: 1, name: 'John Smith' }
end
Use sheet.parse
to return an array of rows. Column names can be a String
or a Regexp
.
sheet.parse(id: /UPC|SKU/, qty: /ATS*\sATP\s*QTY\z/)
# => [{:id => 727880013358, :qty => 12}, ...]
Use the :headers
option to include the header row in the parsed content.
sheet.parse(headers: true)
Use the :header_search
option to locate the header row and assign the header names.
sheet.parse(header_search: [/UPC*SKU/,/ATS*\sATP\s*QTY\z/])
Use the :clean
option to strip out control characters and surrounding white space.
sheet.parse(clean: true)
When opening the file you can add a hash of options.
If you open a document with merged cells and do not want to end up with nil values for the rows after the first one.
xlsx = Roo::Excelx.new('./roo_error.xlsx', {:expand_merged_ranges => true})
Roo has the ability to export sheets using the following formats. It
will only export the default_sheet
.
sheet.to_csv
sheet.to_matrix
sheet.to_xml
sheet.to_yaml
Specify the file as default argument for #to_csv
:
sheet.to_csv(File.new("/dev/null"))
specify the custom separator:
sheet.to_csv(separator: ":") # "," using by default
Stream rows from an Excelx spreadsheet.
xlsx = Roo::Excelx.new("./test_data/test_small.xlsx")
xlsx.each_row_streaming do |row|
puts row.inspect # Array of Excelx::Cell objects
end
By default blank cells will be excluded from the array. To keep them, use the option pad_cells = true. (They will be set to nil in the array)
xlsx.each_row_streaming(pad_cells: true) do |row|
puts row.inspect # Array of Excelx::Cell objects
end
To stream only some of the rows, you can use the max_rows
and offset
options.
xlsx.each_row_streaming(offset: 1) do |row| # Will exclude first (inevitably header) row
puts row.inspect # Array of Excelx::Cell objects
end
xlsx.each_row_streaming(max_rows: 3) do |row| # Will yield 4 rows (it's automatically incremented by 1) after the supplied offset.
puts row.inspect # Array of Excelx::Cell objects
end
Iterate over each row
xlsx.each_row do |row|
...
end
Roo::Excelx
also provides these helpful methods.
xlsx.excelx_type(3, 'C')
# => :numeric_or_formula
xlsx.cell(3, 'C')
# => 600000383.0
xlsx.excelx_value(row,col)
# => '600000383'
xlsx.formatted_value(row,col)
# => '0600000383'
Roo::Excelx
can access celltype, comments, font information, formulas, hyperlinks and labels.
xlsx.comment(1,1, ods.sheets[-1])
xlsx.font(1,1).bold?
xlsx.formula('A', 2)
Roo::OpenOffice has support for encrypted OpenOffice spreadsheets.
# Load an encrypted OpenOffice Spreadsheet
ods = Roo::OpenOffice.new("myspreadsheet.ods", password: "password")
Roo::OpenOffice
can access celltype, comments, font information, formulas and labels.
ods.celltype
# => :percentage
ods.comment(1,1, ods.sheets[-1])
ods.font(1,1).italic?
# => false
ods.formula('A', 2)
# Load a CSV file
csv = Roo::CSV.new("mycsv.csv")
Because Roo uses the standard CSV library, you can use options available to that library to parse csv files. You can pass options using the csv_options
key.
For instance, you can load tab-delimited files (.tsv
), and you can use a particular encoding when opening the file.
# Load a tab-delimited csv
csv = Roo::CSV.new("mytsv.tsv", csv_options: {col_sep: "\t"})
# Load a csv with an explicit encoding
csv = Roo::CSV.new("mycsv.csv", csv_options: {encoding: Encoding::ISO_8859_1})
You can also open csv files through the Roo::Spreadsheet class (useful if you accept both CSV and Excel types from a user file upload, for example).
# Load a spreadsheet from a file path
# Roo figures out the right parser based on file extension
spreadsheet = Roo::Spreadsheet.open(csv_or_xlsx_file)
# Load a csv and auto-strip the BOM (byte order mark)
# csv files saved from MS Excel typically have the BOM marker at the beginning of the file
spreadsheet = Roo::Spreadsheet.open("mycsv.csv", { csv_options: { encoding: 'bom|utf-8' } })
If you use .xls
or Google spreadsheets, you will need to install roo-xls
or roo-google
to continue using that functionality.
Roo's public methods have stayed relatively consistent between 1.13.x and 2.0.0, but please check the Changelog to better understand the changes made since 1.13.x.
bundle install --with local_development
)git checkout -b my-new-feature
)git commit -am 'My new feature'
)git push origin my-new-feature
)Roo uses Minitest and RSpec. The best of both worlds! Run bundle exec rake
to
run the tests/examples.
You can run the tests/examples with Rspec like reporters by running
USE_REPORTERS=true bundle exec rake
Roo also has a few tests that take a long time (5+ seconds). To run these, use
LONG_RUN=true bundle exec rake
If you find an issue, please create a gist and refer to it in an issue (sample gist). Here are some instructions for creating such a gist.