ruby-docx / docx

a ruby library/gem for interacting with .docx files
MIT License

I'm trying to test a docx download and I hoped I could use this for parsing but I don't understand how #107

Closed crystal-wb closed 3 years ago

crystal-wb commented 3 years ago

In my app we have a download of a docx file that contains filtered and sorted database objects. I want to write tests for the download to make sure it's filtered and sorted properly. So I installed the docx gem to read the file that is downloaded during feature tests. But I'm not really sure how to test its content. Is there a place where I can find more information on reading the document? Or maybe there is a better gem for testing?

I only need a docx reader for tests. I don't want to spend too much time figuring out how to use it if there is something better suited for turning the docx into something readable. I appreciate any guidance.

crystal-wb commented 3 years ago

I used Docx::Document.open on the file and saved the output to a variable. Then I returned the variable and it seemed to be readable. I had thought I needed to do something like File.read, so I was confused. My apologies.

crystal-wb commented 3 years ago

Actually I tried testing it and it does not work. I still need help regarding testing the content of the docx. I'm not sure what I'm supposed to do with the Docx::Document object.

WaKeMaTTa commented 3 years ago

Example using plain ruby:

require 'docx'

# Create a Docx::Document object for our existing docx file
doc = Docx::Document.open('example.docx')

# Checks
if doc.paragraphs.count != 2
  raise "Document doesn't have 2 paragraphs." 
end

unless doc.paragraphs[0].text.match?(/^Hello/)
  raise "The 1st paragraph doesn't start with the word 'Hello'."
end

if doc.paragraphs[1].text != "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
  raise "The 2nd paragraph doesn't contain the expected text."
end
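
For the original question (checking that the download is filtered and sorted), one possible approach is to reduce the document to a list of paragraph texts and compare that list with the records you expect, in order. The sketch below is only an illustration: `content_lines` is a hypothetical helper, not part of the docx gem, and the names in `texts` are stand-in data so the snippet runs without a real file.

```ruby
# Hypothetical helper (not part of the docx gem): trims whitespace and drops
# blank paragraphs so the comparison only sees real content.
def content_lines(texts)
  texts.map(&:strip).reject(&:empty?)
end

# With the gem installed, the texts would come from the downloaded file:
#   require 'docx'
#   texts = Docx::Document.open('example.docx').paragraphs.map(&:text)
texts = ['Alice', '', ' Bob ', 'Carol'] # stand-in data for this sketch

# The records the filter should keep, in the order the sort should produce.
expected = ['Alice', 'Bob', 'Carol']

lines = content_lines(texts)
raise "Unexpected content: #{lines.inspect}" unless lines == expected
```

Comparing against a single expected array checks filtering and ordering at once: a record that should have been filtered out, or two records in the wrong order, both make the arrays differ.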
crystal-wb commented 3 years ago

Thank you so much for your guidance. I will keep this in mind if I end up using ruby-docx in the future. Since I only needed plain text for the tests, I decided to switch to the gem doc_ripper. It works very nicely for checking contents.