mle-infrastructure / mle-toolbox

Lightweight Tool to Manage Distributed ML Experiments 🛠
https://mle-infrastructure.github.io/mle_toolbox/toolbox/
MIT License

Markdown report generation #6

Closed: RobertTLange closed this issue 3 years ago

RobertTLange commented 3 years ago

Add a first version of a Markdown report generator whose output is then converted to .pdf. For example:

import pdfkit
import markdown2
from markdowngenerator.markdowngenerator import MarkdownGenerator

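# NOTE: load_experiment_db is assumed to be provided by the toolbox; its import is not shown here.
db, all_experiment_ids, last_experiment_id = load_experiment_db()


def generate_experiment_report(db, e_id):
    """Generate the Markdown report <e_id>.md for the experiment with id e_id."""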
    report_data = db.get(e_id)
    with MarkdownGenerator(
        filename=e_id + ".md", enable_write=False
    ) as doc:
        doc.addHeader(1, "Experiment Protocol: " + report_data["project_name"] + " - " + e_id)

        # Meta-Data of the Experiment
        doc.addHeader(2, "Experiment Meta-Data.") 
        doc.writeTextLine(f'{doc.addBoldedText("Purpose:")} ' + report_data["purpose"])

        # Hyperparameters used in the Experiment
        doc.addHeader(2, "Hyperparameters.")
        table = [
            {"Parameter": "col1row1", "Value": "col2row1"},
            {"Parameter": "col1row2", "Value": "col2row2"}
        ]
        doc.addTable(dictionary_list=table)

        # Generated header for figures of the Experiment
        doc.addHeader(2, "Generated Figures.")

# Generate the .md report file
e_id = "e-id-1"
generate_experiment_report(db, e_id)

# Convert the .md into a renderable HTML file
with open(e_id + ".md") as input_file:
    text = input_file.read()
html_text = markdown2.markdown(text, extras=["tables"])

# Append the figures (here the same placeholder image three times)
img_tag = '<img src="/Users/rtl/Desktop/django_db_overview.png" width="30%" style="margin-right:20px">'
html_text += 3 * img_tag

with open(e_id + ".html", "w") as output_file:
    output_file.write(html_text)

# Generate a pdf file for the report
output_filename = 'test_report.pdf'
pdfkit.from_string(html_text, output_filename,
                   options={"enable-local-file-access": None,
                            'page-size': 'A4',
                            'dpi': 400,
                            'print-media-type': '',
                            'disable-smart-shrinking': ''})
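
The hyperparameter table in the snippet above uses placeholder rows. One way to fill it from a real configuration might look like the sketch below. The helper name `hyperparams_to_table` and the example dictionary are hypothetical and not part of the toolbox; the only thing taken from the snippet is the list-of-dicts format with "Parameter" and "Value" keys that is passed to doc.addTable.

```python
def hyperparams_to_table(params, prefix=""):
    """Flatten a (possibly nested) dict into table rows.

    Returns a list of {"Parameter": ..., "Value": ...} dicts, the same
    shape as the placeholder table in the snippet above.
    Hypothetical helper, not part of the toolbox.
    """
    rows = []
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested configs, joining key names with a dot
            rows.extend(hyperparams_to_table(value, prefix=name + "."))
        else:
            rows.append({"Parameter": name, "Value": str(value)})
    return rows


# Made-up example configuration, for illustration only
example_params = {"lrate": 0.001, "network": {"num_layers": 2}}
table = hyperparams_to_table(example_params)
```

The resulting `table` could then be handed to doc.addTable(dictionary_list=table) in place of the hard-coded rows.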

Key question: Is there an easier, more stable way to do this with fewer dependencies?

Edit: It may also make sense to replace pdfkit with pyfpdf. Check out this blog post to get started with pyfpdf: https://towardsdatascience.com/how-to-create-pdf-reports-with-python-the-essential-guide-c08dd3ebf2ee

The only problem: you don't get a nice, editable .md file, although it appears possible to work around this via .html. Maybe I just need to write my own Markdown generator?!
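
To get a feel for how small a hand-written Markdown generator could be, here is a minimal sketch that uses only the standard library. The class name `SimpleMarkdownReport` and its methods are hypothetical and are not what the toolbox implements; the sketch covers only headers, text lines and tables, which is what the report above needs.

```python
class SimpleMarkdownReport:
    """Collect Markdown lines in memory and write them out on demand."""

    def __init__(self):
        self.lines = []

    def add_header(self, level, text):
        # "#" repeated `level` times, followed by a blank line
        self.lines.append("#" * level + " " + text)
        self.lines.append("")

    def add_text(self, text):
        self.lines.append(text)
        self.lines.append("")

    def add_table(self, rows):
        """Render a list of dicts (all sharing the same keys) as a pipe table."""
        if not rows:
            return
        columns = list(rows[0].keys())
        self.lines.append("| " + " | ".join(columns) + " |")
        self.lines.append("| " + " | ".join("---" for _ in columns) + " |")
        for row in rows:
            cells = [str(row[col]) for col in columns]
            self.lines.append("| " + " | ".join(cells) + " |")
        self.lines.append("")

    def render(self):
        return "\n".join(self.lines)

    def save(self, filename):
        with open(filename, "w") as output_file:
            output_file.write(self.render())


# Made-up example values, for illustration only
report = SimpleMarkdownReport()
report.add_header(1, "Experiment Protocol: demo-project - e-id-1")
report.add_header(2, "Hyperparameters.")
report.add_table([{"Parameter": "lrate", "Value": 0.001}])
markdown_text = report.render()
```

This keeps the editable .md file as the primary artifact and drops one dependency; converting it to .html or .pdf would still need a separate tool.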

RobertTLange commented 3 years ago

Addressed in PR #23