Issues found in alpha test of the website

sandervh14 commented 1 month ago

I just tested our website locally. I think these are the important changes we need to make to improve the website experience:

don't remove newlines between the lines of a motion title
description below the motion in the front-end: newlines are gone. For example, we know that 15.04, 15.05, etc. always start on a new line. Keeping the original text formatting would be great: it makes the text more readable instead of one big lump of text. I don't know whether this means the frontend is using description_nl instead of the description_tags_nl that were made at some point, or whether there's another problem.
improve extraction of the Dutch vs French description texts that are displayed below the motions on the website. The original text behind these descriptions contains indications of which tag is French and which is Dutch, although we didn't find it to be consistent all the time. Either we continue on that path and just do best-effort, or we switch to separate French / Dutch texts using the summarization that Karel is working on.
the bottom of the website says "data freely available with the MIT license", but we don't have a license set yet on our data repo. I'm worried about the "modification" permission on the data that the MIT license brings. We don't want to open a gigantic door to a convenient source for creating a misinformation website that copies ours, with modified data.
reduce errors thrown in the backend on start-up, like the following (and many more):

This is why so few plenaries are currently listed on the website, and therefore also fewer motions than we have already extracted to plenaries.json. It still needs to be investigated why the backend is so unhappy about the plenaries.json we deliver.
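
One way to narrow this down before the backend starts could be a small pre-flight check on the JSON we deliver. The sketch below is only an illustration: the function name `find_invalid_entries` and the keys used in the example ("id", "date") are hypothetical, since the actual schema the backend expects is not described here.

```python
import json


def find_invalid_entries(entries, required_keys):
    """Return (index, reason) pairs for entries that look incomplete.

    `required_keys` is supplied by the caller; this sketch does not know
    which fields the backend really requires.
    """
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append((index, "not an object"))
            continue
        # Treat absent, null and empty-string values as missing.
        missing = [key for key in required_keys if entry.get(key) in (None, "")]
        if missing:
            problems.append((index, "missing: " + ", ".join(missing)))
    return problems


# Hypothetical sample data, not taken from the real plenaries.json.
raw = '[{"id": 1, "date": "2024-01-01"}, {"id": 2}, "oops"]'
for index, reason in find_invalid_entries(json.loads(raw), ["id", "date"]):
    print(f"entry {index}: {reason}")
```

Running something like this against the real file, with the keys the backend actually reads, might show whether the start-up errors come from a handful of malformed entries or from a systematic mismatch.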

karel1980 commented 1 month ago

I've changed title parsing so newlines aren't lost
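
For illustration, a minimal sketch of whitespace normalisation that keeps line breaks. This is not the actual parsing code; it assumes the newlines were being lost because whitespace was collapsed over the whole text at once, and the function name `normalize_keep_newlines` is hypothetical.

```python
import re


def normalize_keep_newlines(text: str) -> str:
    """Collapse runs of spaces and tabs within each line, but keep line breaks."""
    cleaned = []
    for line in text.splitlines():
        # Only horizontal whitespace (space, tab, non-breaking space) is collapsed.
        line = re.sub(r"[ \t\u00a0]+", " ", line).strip()
        if line:  # drop lines that are empty after cleaning
            cleaned.append(line)
    return "\n".join(cleaned)


print(normalize_keep_newlines("15.04  First item \n\n 15.05\tSecond item"))
```

By contrast, something like `" ".join(text.split())` treats newlines as ordinary whitespace and would merge both items into a single line.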

karel1980 commented 1 month ago

also updated frontend to preserve newlines in titles (white-space: pre-wrap)

karel1980 commented 1 month ago

@sandervh14 can you provide examples for bullets 2 & 3 so we know what to focus on exactly?

karel1980 commented 1 month ago

Screenshot for newlines in titles (this is a newline I added myself for testing)

sandervh14 commented 1 month ago

Nice. 🔥

I updated my issue description above (https://github.com/transparentdemocracy/voting-data/issues/50#issue-2321798181).

sandervh14 commented 3 weeks ago

New ones:

[ ] quite a few recent plenaries with "Er waren geen stemmingen", is this correct?
[ ] motion on 28/3/324 with number 479 (unsupported number of motions so far) has no document reference and therefore also no summary.

karel1980 commented 3 weeks ago

I've checked plenaries 308, 307, 306, 305, 303 and 301 which didn't have any votes according to the website. Of these, only 301 actually had votes. I'll add a test case for this.

karel1980 commented 3 weeks ago

Extraction for 301 fails because the plenary doesn't open with a level 1 header (h1 or css class) and the extraction algorithm expects this.
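
A possible direction, sketched under assumptions: instead of requiring a level 1 header at the start, content that appears before the first one could be collected into an untitled section. The `(kind, text)` input format, the `Section` class and `split_sections` are hypothetical and only stand in for whatever the real extraction works with.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Section:
    title: Optional[str]  # None when the document did not open with a header
    items: List[str] = field(default_factory=list)


def split_sections(elements: List[Tuple[str, str]]) -> List[Section]:
    """Group body elements under the most recent level 1 header.

    `elements` is a list of (kind, text) pairs where kind is "h1" or "body".
    Body elements that come before any header go into an untitled section
    instead of being dropped or causing a failure.
    """
    sections: List[Section] = []
    current: Optional[Section] = None
    for kind, text in elements:
        if kind == "h1":
            current = Section(title=text)
            sections.append(current)
        else:
            if current is None:
                current = Section(title=None)
                sections.append(current)
            current.items.append(text)
    return sections


# Hypothetical input: a plenary whose first element is not a level 1 header.
for section in split_sections([("body", "vote 1"), ("h1", "Topic A"), ("body", "vote 2")]):
    print(section.title, section.items)
```

A regression test for plenary 301 could then assert that the votes before the first header still end up in the output.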
