Closed ronaldtse closed 8 months ago
extract all the text stuff (text, tables) into AsciiDoc using reverse_adoc.
reverse_adoc takes like forever to finish converting this document. I couldn't get an output.
So strange though... maybe @HassanAkbar can have a look at reverse_adoc?
Probably due to the length of the document (more than 800 pages). @anermina, could you try converting this document using reverse_adoc to confirm this behavior?
@manuelfuenmayor I’ve started trying it, so @anermina there’s no need to try. Thanks!!
Some challenges with this document:
@manuelfuenmayor reverse_adoc worked on my computer, but I don't know how long it took. I've pushed the output, but it's a 20MB file because the images are all inlined. The first thing we have to do is extract the images into separate files, and there are a lot of them. Maybe some 'grep' command would be able to extract all the images... unless we update the reverse_adoc gem to export images individually.
Thanks @ronaldtse. After removing the images, I was able to get an output from reverse_adoc.
I've extracted the images using grep (along with base64), as you suggested.
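For anyone repeating this step, here is a minimal sketch of the same idea in Python rather than grep and base64. It assumes the inlined images appear in the converted text as base64 data URIs (for example data:image/png;base64,...); the function name extract_inline_images, the output directory, and the file naming scheme are all hypothetical and are not part of reverse_adoc.

```python
import base64
import re
from pathlib import Path

# Matches a base64 data URI such as "data:image/png;base64,iVBORw0..."
# (assumption: this is how the inlined images show up in the converted file).
DATA_URI = re.compile(r"data:image/([A-Za-z0-9.+-]+);base64,([A-Za-z0-9+/]+={0,2})")

# Map MIME subtypes to file extensions where the two differ.
EXTENSIONS = {"jpeg": "jpg", "svg+xml": "svg"}


def extract_inline_images(text, out_dir, prefix="image"):
    """Write each inlined image to out_dir and replace its data URI with the file path.

    Returns the rewritten text and the list of files written, in document order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []

    def save(match):
        subtype, payload = match.group(1), match.group(2)
        ext = EXTENSIONS.get(subtype, subtype)
        # Number the files sequentially: image-0001.png, image-0002.jpg, ...
        path = out / f"{prefix}-{len(written) + 1:04d}.{ext}"
        path.write_bytes(base64.b64decode(payload))
        written.append(path)
        return path.as_posix()

    return DATA_URI.sub(save, text), written
```

Calling extract_inline_images(Path("document.adoc").read_text(), "images") would return the text with each data URI swapped for a file path, plus the list of written files; document.adoc here is only a placeholder name. Sequential numbering keeps the files in document order, which makes it easier to match them back to the tables they came from.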
There is this case of text format:
It seems like a case of sub-section more than a numbered list.
I agree. Let’s make it into a subsection.
@manuelfuenmayor is this work completed?
@ronaldtse this work is really far from being completed.
This document has more than 800 not-so-simple tables (with images embedded) that I have to correct manually because reverse_adoc doesn't encode them 100% correctly (it is not its fault; this document's HTML is complex).
I've been tweaking the reverse_adoc code to see if I can ease the workload a little. I've done a couple of things already.
I estimate a delivery time of one or two weeks.
@manuelfuenmayor then maybe we should really fix up reverse_adoc to make it work…
Document encoded in https://github.com/metanorma/mn-samples-mlit/pull/2
This work is done under the MLIT Plateau project.
The "Handbook of 3D City Models: Standard Data Product Specification for 3D City Model" appears to be published in the Metanorma HTML format. However, a closer look reveals that it was created using Nuxt and merely looks like Metanorma!
A new flavor will be developed for MLIT / Plateau, so the encoding syntax will be subject to change.
We will need to do the following:
reverse_adoc
Font: It also uses the "Tokyo CityFont Cond StdN M" ("Tokyo CityFont Condensed M"): TokyoCityFontCondStdN-R.1c4f41e.otf.zip
The font page is here: https://typeproject.com/en/fonts/tokyocityfont . This is clearly a paid font, and the document only comes with the "Regular" style. So we need to create a private Fontist repository for this font.