transparentdemocracy / voting-data

Voting behavior data extracted from plenary reports of the Belgian federal government.

Save extracted plenaries and votes intermediately #31

Open sandervh14 opened 1 month ago

We currently have these functions in serialization.py:

def write_markdown():
    plenaries, votes = extract_from_html_plenary_reports()
    MarkdownSerializer().serialize_plenaries(plenaries, votes)

def write_plenaries_json():
    plenaries, votes = extract_from_html_plenary_reports()
    JsonSerializer().serialize_plenaries(plenaries)

def write_votes_json():
    plenaries, votes = extract_from_html_plenary_reports()
    JsonSerializer().serialize_votes(votes)

They mix extraction and serialization, even though they live in the serialization module. That is, of course, so that they can be called as commands. It would be nice if extract_from_html_plenary_reports() saved the plenaries and votes to a folder with intermediate output when it finishes. That way, serialization.py would only need to read that intermediate output and would not have to extract anything itself.
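
A minimal sketch of what that save/load step could look like, assuming the extracted objects can be represented as dataclasses. The `Plenary` and `Vote` classes below are simplified stand-ins, not the project's real models, and `save_intermediate`, `load_intermediate` and the file names are hypothetical:

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


# Hypothetical, simplified stand-ins for the project's real model classes.
@dataclass
class Plenary:
    id: str
    date: str


@dataclass
class Vote:
    plenary_id: str
    politician: str
    vote_type: str


def save_intermediate(plenaries, votes, output_dir):
    """Write extracted plenaries and votes as JSON files in output_dir."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    (output_dir / "plenaries.json").write_text(
        json.dumps([asdict(p) for p in plenaries], indent=2), encoding="utf-8"
    )
    (output_dir / "votes.json").write_text(
        json.dumps([asdict(v) for v in votes], indent=2), encoding="utf-8"
    )


def load_intermediate(output_dir):
    """Read the intermediate JSON files back into model objects."""
    output_dir = Path(output_dir)
    plenary_dicts = json.loads(
        (output_dir / "plenaries.json").read_text(encoding="utf-8")
    )
    vote_dicts = json.loads((output_dir / "votes.json").read_text(encoding="utf-8"))
    plenaries = [Plenary(**d) for d in plenary_dicts]
    votes = [Vote(**d) for d in vote_dicts]
    return plenaries, votes
```

With something like this, extract_from_html_plenary_reports() could call `save_intermediate(...)` as its last step, and each write function in serialization.py could start from `load_intermediate(...)` instead of extracting.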

Also, if the serialization methods are called consecutively (from main.py), extraction runs three times. With the intermediate output saved, it would run only once.