pzaich / doc_ripper

Parse text contents from common file formats
MIT License
82 stars, 18 forks

Docx encode problems #8

Open: Rogerio opened 7 years ago

Rogerio commented 7 years ago

I'm ripping a file, but the output has encoding problems. How should I pass the encoding to the ripper?

My output is something like:

Um cronograma que você terá em mãos para acompanhá-lo

My code is:

    text = DocRipper::rip('file_name.docx')
    puts text

pzaich commented 7 years ago

@Rogerio Right now the library doesn't really handle encoding issues. I'd like to fix this though and provide encoding as an option. Could you provide a small .docx example fixture that I could use for testing purposes?
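Until the library supports an encoding option, a caller-side workaround may help. The sample output above looks like UTF-8 bytes that were decoded as ISO-8859-1 (for example, "ê" showing up as "Ãª"). The following is a minimal sketch under that assumption; `repair_mojibake` is a hypothetical helper, not part of doc_ripper, and it uses only Ruby's built-in `String#encode` and `String#force_encoding`.

```ruby
# Hypothetical helper (not part of doc_ripper): undo one round of
# "UTF-8 bytes decoded as ISO-8859-1" mojibake.
# Assumption: the garbled text came from exactly that mis-decoding.
def repair_mojibake(text)
  # Map each character back to the single byte it was decoded from,
  # then relabel the resulting bytes as UTF-8.
  repaired = text.encode('ISO-8859-1').force_encoding('UTF-8')
  # If the bytes are not valid UTF-8, the text was not mojibake of this kind.
  repaired.valid_encoding? ? repaired : text
rescue EncodingError
  # Characters outside ISO-8859-1 cannot be mapped back; leave the text as-is.
  text
end

garbled = 'Um cronograma que você terá em mãos para acompanhá-lo'
puts repair_mojibake(garbled)
# prints: Um cronograma que você terá em mãos para acompanhá-lo
```

In the original snippet this would be applied as `puts repair_mojibake(DocRipper::rip('file_name.docx'))`. The check is a heuristic: text that is already correct is returned unchanged in the common case, but a string that legitimately contains a sequence such as "Ã©" would be altered. If the text was decoded as Windows-1252 rather than ISO-8859-1, the encoding name passed to `encode` would need to change accordingly.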