igumnoff / shiva

Shiva library: Implementation in Rust of a parser and generator for documents of any type
https://docs.rs/shiva
GNU General Public License v3.0

Docx parser #20

Closed evgenyigumnov closed 4 months ago

evgenyigumnov commented 4 months ago
  1. Implement a docx parser (the parse function of TransformerTrait):
       pub trait TransformerTrait {
           fn parse(document: &Bytes, images: &HashMap<String, Bytes>) -> anyhow::Result<Document>;
           fn generate(document: &Document) -> anyhow::Result<(Bytes, HashMap<String, Bytes>)>;
       }
  2. Unit tests are required for this new feature.
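
To make the task concrete, here is a rough skeleton of what the new transformer could look like. It is only a sketch: `DocxTransformer` is a hypothetical name, and `Bytes`, `Result`, and `Document` are local stand-ins (a `Vec<u8>`, a `Result<_, String>`, and a list of paragraph strings) so that the snippet compiles without the project's real types or `anyhow`. The zip signature check is a simple heuristic, not a full validation.

```rust
use std::collections::HashMap;

// Stand-ins so the sketch compiles without external crates (assumptions):
// the issue's `Bytes` is modelled as Vec<u8>, `anyhow::Result` as
// Result<_, String>, and `Document` as a list of paragraph strings.
type Bytes = Vec<u8>;
type Result<T> = std::result::Result<T, String>;

#[derive(Debug, PartialEq)]
pub struct Document {
    pub paragraphs: Vec<String>,
}

pub trait TransformerTrait {
    fn parse(document: &Bytes, images: &HashMap<String, Bytes>) -> Result<Document>;
    fn generate(document: &Document) -> Result<(Bytes, HashMap<String, Bytes>)>;
}

// Hypothetical name for the new docx transformer.
pub struct DocxTransformer;

impl TransformerTrait for DocxTransformer {
    fn parse(document: &Bytes, _images: &HashMap<String, Bytes>) -> Result<Document> {
        // A .docx file is a zip archive, so reject input that does not start
        // with the zip local-file-header signature "PK\x03\x04".
        if !document.starts_with(b"PK\x03\x04") {
            return Err("not a zip archive".to_string());
        }
        // Unzipping and reading document.xml (for example with the recommended
        // crates) would go here; this sketch returns an empty document.
        Ok(Document { paragraphs: Vec::new() })
    }

    fn generate(_document: &Document) -> Result<(Bytes, HashMap<String, Bytes>)> {
        // Generation is outside the scope of this issue.
        Err("docx generation is not implemented".to_string())
    }
}
```

A unit test for item 2 could then call `DocxTransformer::parse` on the bytes of a sample file and assert on the returned `Document`.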

Recommended crates:

zip = "0.6"
quick-xml = "0.31.0"

A .docx file is a zip archive with this file structure:

[image: file structure of the archive]

All content is concentrated in the document.xml file:

[image: contents of document.xml]
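
For orientation, the text in document.xml sits in `<w:t>` runs inside `<w:p>` paragraph elements. The sketch below pulls paragraph text out of such XML using only the standard library. It is a naive illustration, not the intended implementation: the function names are hypothetical, it does no entity decoding, it ignores nested paragraphs, and it assumes attribute values contain no `>`. A real parser would use a proper XML reader such as the recommended crate.

```rust
/// Finds the next opening tag `<name ...>` in `s`.
/// Returns the byte offset just past the tag's closing `>` and whether
/// the tag was self-closing (`<name/>`).
fn find_open_tag(s: &str, name: &str) -> Option<(usize, bool)> {
    let pattern = format!("<{}", name);
    let mut from = 0;
    while let Some(pos) = s[from..].find(&pattern) {
        let after_name = from + pos + pattern.len();
        // Require a delimiter after the name so `<w:p` does not match `<w:pPr>`.
        match s[after_name..].chars().next() {
            Some(c) if c == '>' || c == '/' || c.is_whitespace() => {
                let end = after_name + s[after_name..].find('>')?;
                let self_closing = s[..end].ends_with('/');
                return Some((end + 1, self_closing));
            }
            _ => from = after_name,
        }
    }
    None
}

/// Concatenates the contents of every `<w:t>` run in one paragraph body.
fn collect_text(paragraph: &str) -> String {
    let mut text = String::new();
    let mut rest = paragraph;
    while let Some((start, self_closing)) = find_open_tag(rest, "w:t") {
        rest = &rest[start..];
        if self_closing {
            continue; // `<w:t/>` carries no text
        }
        let end = rest.find("</w:t>").unwrap_or(rest.len());
        text.push_str(&rest[..end]);
        rest = &rest[end..];
    }
    text
}

/// Returns one string per `<w:p>` paragraph, in document order.
pub fn extract_paragraphs(xml: &str) -> Vec<String> {
    let mut paragraphs = Vec::new();
    let mut rest = xml;
    while let Some((body_start, self_closing)) = find_open_tag(rest, "w:p") {
        rest = &rest[body_start..];
        if self_closing {
            paragraphs.push(String::new()); // `<w:p/>` is an empty paragraph
            continue;
        }
        // The paragraph body runs up to the next `</w:p>`.
        let body_end = rest.find("</w:p>").unwrap_or(rest.len());
        paragraphs.push(collect_text(&rest[..body_end]));
        rest = &rest[body_end..];
    }
    paragraphs
}
```

The delimiter check in `find_open_tag` matters because many WordprocessingML element names share a prefix, for example `<w:t>`, `<w:tab/>`, and `<w:tbl>`.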

Example docx file: shiva\lib\test\data\document.docx

[image]