ceos-org / ceos-ard

Repository for CEOS Analysis Ready Data (CEOS-ARD), including Product Family Specifications (PFSs)

Word rendering for Markdown #32

Closed m-mohr closed 1 week ago

m-mohr commented 2 weeks ago

First experiments. Generates the Aquatic Reflectance (AR) PFS as a Word document and publishes it to the gh-pages branch.

Generated example document in Word format: https://ceos-org.github.io/ceos-ard/CEOS-ARD_PFS_Aquatic-Reflectance_latest.docx

Couldn't generate the PDF automatically because the CI runners don't have MS Office, so it may need to be generated manually.

gamedaygeorge commented 1 week ago

This looks really interesting. Just for my understanding: it seems this Word doc is generated by the scripting in this PR and 'linted' using the config file from PR #33? I think this deterministic approach would be key to making this a working repo.

libbyrose commented 1 week ago

@m-mohr it appears there is a very small conflict in the Specifications/Aquatic-Reflectance/README.md file. Could you take a look? I don't want to delete anything in case it ruins it all!

m-mohr commented 1 week ago

@gamedaygeorge Yes, your understanding is correct.

@libbyrose Fixed the merge conflicts.

gamedaygeorge commented 1 week ago

I gave the process a try (with the help of GPT - I'd never done this before), and generated the attached example Word version of the AR PFS.

Using the AR readme.md file, when I tried to run the command directly to Word, it gave me an error.

pandoc README.md -o README.docx
[WARNING] Could not convert TeX math NO_2, rendering as TeX:

I had to go via an HTML intermediary:

pandoc README.md -s --mathjax -o README.html
pandoc README.html -o README.docx

This method seemed to work, and the attached example seems intact.
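The two-step route above could be wrapped in a small script. This is only a sketch of the commands quoted in this comment, not the command the CI uses; the function names `build_pandoc_commands` and `convert` are made up for illustration, and it assumes `pandoc` is on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_commands(source: str) -> list[list[str]]:
    """Return the two pandoc calls for the Markdown -> HTML -> DOCX route."""
    src = Path(source)
    html = src.with_suffix(".html")  # intermediate standalone HTML file
    docx = src.with_suffix(".docx")  # final Word document
    return [
        # Step 1: standalone HTML, with math left for MathJax to render.
        ["pandoc", str(src), "-s", "--mathjax", "-o", str(html)],
        # Step 2: convert the intermediate HTML to Word.
        ["pandoc", str(html), "-o", str(docx)],
    ]


def convert(source: str) -> bool:
    """Run both steps; return False without running if pandoc is missing."""
    if shutil.which("pandoc") is None:
        return False
    for command in build_pandoc_commands(source):
        subprocess.run(command, check=True)  # raises if pandoc fails
    return True


if __name__ == "__main__":
    # Print the commands instead of running them, as a dry run.
    for command in build_pandoc_commands("README.md"):
        print(" ".join(command))
```

Running the script as-is only prints the two commands; calling `convert("README.md")` would execute them in order.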

Given this, I suppose the work flow to update a PFS could be:

Looking down the track, another question: could this approach be used to automate the process in other useful ways? e.g. does this make the PFS more machine readable, opening up the possibility of scripting or tools reading the PFS directly?

CARD4L_Product_Family_Specification_Aquatic_Reflectance-v1.0 - GD TEST VERSION.docx

m-mohr commented 1 week ago

@gamedaygeorge

> Using the AR readme.md file, when I tried to run the command directly to Word, it gave me an error.

Yeah, that command was not complete. The CI passes a couple of additional parameters, so you don't need to go through HTML. I've started to document the whole process better in #36. I won't say more here about what you need to change, so that you can be my first tester of whether the instructions in that document are complete and understandable. I hope you don't mind ;-)

> Given this, I suppose the work flow to update a PFS could be:

I also started to document this a bit more in #36. It probably needs more fine-tuning and discussion, but I thought having something in place as a basis for discussion is good. It pretty much reflects what you've written as well. Further feedback is welcome.

> Submit a PR with changes on a given specification's readme.md, e.g. /Specifications/Aquatic-Reflectance/README.md

Yes, although with #35 it's PFS.md instead of README.md

> Looking down the track, another question: could this approach be used to automate the process in other useful ways? e.g. does this make the PFS more machine readable, opening up the possibility of scripting or tools reading the PFS directly?

Good question. In principle we could even move away from Markdown and generate the documents more automatically from more structured files, e.g. YAML. We could also generate them from individual building blocks, so that each block is written only once and re-used across the PFSs. But maybe that's a step for the future...?
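To make the building-block idea a bit more concrete, here is a rough sketch, assuming the blocks are held as plain dictionaries (they could equally be loaded from YAML files). Everything in it is hypothetical: the block IDs, the titles, the placeholder text, and the `render_pfs` function are illustrations only and do not exist in this repository.

```python
# Hypothetical shared blocks: each one is written once and re-used by any PFS.
BUILDING_BLOCKS = {
    "block-a": {"title": "Example block A", "text": "Placeholder text for block A."},
    "block-b": {"title": "Example block B", "text": "Placeholder text for block B."},
}


def render_pfs(title: str, block_ids: list[str], blocks: dict) -> str:
    """Assemble a Markdown document from the selected building blocks."""
    lines = [f"# {title}", ""]
    for block_id in block_ids:
        block = blocks[block_id]  # a KeyError flags an unknown block ID
        lines += [f"## {block['title']}", "", block["text"], ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Two hypothetical PFSs that share "block-a" without duplicating its text.
    print(render_pfs("Example PFS 1", ["block-a", "block-b"], BUILDING_BLOCKS))
    print(render_pfs("Example PFS 2", ["block-a"], BUILDING_BLOCKS))
```

The generated Markdown could then go through the same Word rendering as a hand-written PFS.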

PS: Maybe we should move this discussion to #28

gamedaygeorge commented 1 week ago

@m-mohr I have tested the steps in #36 and followed up there.

I also agree that the broader discussion on repo structure is better suited to #28.