metanorma / coradoc

Coradoc is the Core AsciiDoc Parser used by Metanorma
MIT License

Update implementation to be able to transform the ISO Simple Template docx #87

Open ronaldtse opened 5 months ago

ronaldtse commented 5 months ago

We need to be able to convert this Word document into AsciiDoc.

reverse_adoc has the command w2a, which converts docx into AsciiDoc by first converting it into HTML and then into AsciiDoc.

2023-iso-simple-template-rice.docx

hmdne commented 3 weeks ago

@ronaldtse This issue is marked for 1.0.0 milestone. There is also #115 which is not marked for 1.0.0.

Should we attempt the conversion using the current w2a pipeline? If so, and if we then create a new pipeline for #115, that work will be wasted, since, as I understand it, the current pipeline is meant to be replaced.

I haven't evaluated #115 thoroughly yet, but I will follow up with a comment shortly.

hmdne commented 2 weeks ago

@ronaldtse I need your opinion on this matter before I start the implementation.

ReesePlews commented 1 week ago

hello @hmdne i am ready to test the conversion with the .docx file provided. since distribution of internal formats by some SDOs is no longer possible, conversion of their official ms-word version is presently the only way to support input to metanorma. i discussed this with @ronaldtse a while back. i suspect that in the future the documents may end up "living" in a different system, and the ms-word output from that system could be rather different from what we presently have. when you are ready, let me know if there are any questions about the sample word file. thank you.