Open michaelgira23 opened 7 years ago
If we continue with parsing the bulletin, an anomaly to consider is the bulletin from September 6th, 2018 because of the wacky schedule https://mymicds.net/daily-bulletin/2018-09-06
So, here's the thing. When Ms. O'Brien took over as assistant for Mr. Calise this year, the Daily Bulletin format was changed completely. Now, some things are inserted as images and not everything is displayed as normal text anymore. We might still be able to do some sort of parsing, but it'll certainly be significantly more difficult. Might be something to consider tackling during a long break, but certainly not a priority.
The Daily Bulletin is an email that the entire Upper School receives every day before school. The bulletin is a PDF that contains the day's schedule, news from around the school, the lunch, birthdays, and more. On MyMICDS, we query my email and automatically download the Daily Bulletin to put on MyMICDS. While this is useful for people who don't organize their email or don't want to log into their email, it is possible for us to leverage this even more.
It is possible to parse and extract information from the PDF, which opens up so many possibilities. Here are a few:
Possibilities
Special Days
You know why there are so many people who wear blazers after formal dress day? Because that's their punishment for forgetting. The header/title of the bulletin will usually say whether or not it's formal dress. We can then have an email notification system remind people when it's formal dress the next day. The bulletin also lists other special days besides formal dress.
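A rough sketch of the title check, assuming we already have the bulletin's header as a plain string. The function name and the pattern list are made up for illustration; the real phrases would have to come from looking at the archived bulletins.

```typescript
// Hypothetical pattern table. "formal dress" comes from the bulletin header;
// "dress down" is only an assumed example of another special day.
const SPECIAL_DAY_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "formal dress", pattern: /formal\s+dress/i },
  { label: "dress down", pattern: /dress[\s-]+down/i },
];

// Returns the label of every special day mentioned in the bulletin title.
function detectSpecialDays(title: string): string[] {
  return SPECIAL_DAY_PATTERNS
    .filter(({ pattern }) => pattern.test(title))
    .map(({ label }) => label);
}
```

The notification job could then run this against tomorrow's bulletin title and send the reminder only when the result includes "formal dress".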
Parse Announcements
We can get the announcements from the bulletin and add an announcement module in our upcoming modules system. By extracting the text ourselves, we can style the text and integrate it smoothly into our interface and add our own announcements.
Field Trips / Early Dismissal
We could also separate the field trips and early dismissals from the regular announcements. Bonus points if we can highlight which ones are relevant to the user.
Parse Birthdays
Wish our fellow students (and teachers) a happy birthday. Preferably also change their background to this gif I made a long time ago in v1 for Alexander's birthday. However, this works as well.
Lunch
Lunch isn't too terribly important because we already get data from the school lunch website, but it wouldn't hurt to have a point of redundancy to fall back to in case the lunch website is down or something.
Schedule
This is probably the most ambitious out of all of the possibilities, but if we're successful, it could be one of the most useful. If we're able to parse the schedule in the Daily Bulletin, then we can have more redundancy and rely less on the Portal. Currently, if the Portal goes down, then MyMICDS is screwed when it comes to displaying the schedule, which is one of the main features of the site. Also, the bulletin sometimes has a more detailed schedule (usually special assemblies or activities are just labelled "Advisory").
What makes the schedule so complex is when different grades/classes have different things. For example, on Day 1, Science/Art/Math have first lunch and class second and vice versa for other classes. Special schedules are hard in general, and we'd have to parse keywords to determine which demographic each entry belongs to in the schedule.
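To give an idea of the keyword approach, here is a minimal sketch that tags a single schedule entry with the groups it mentions. The keyword list and the "everyone" fallback are assumptions for illustration, not something taken from real bulletins.

```typescript
// Hypothetical keyword list; the real one would be built from archived bulletins.
const AUDIENCE_KEYWORDS: string[] = ["science", "art", "math", "english", "history", "language"];

// Returns every keyword found in the entry, or ["everyone"] if none match.
// Word boundaries keep "art" from matching inside words like "starts".
function audiencesFor(entry: string): string[] {
  const matches = AUDIENCE_KEYWORDS.filter((keyword) =>
    new RegExp(`\\b${keyword}\\b`, "i").test(entry)
  );
  return matches.length > 0 ? matches : ["everyone"];
}
```

This only answers "who is mentioned"; figuring out the "vice versa for other classes" part would still need extra logic on top.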
Clubs
We can find out which clubs are meeting in which room with which teacher. We could compile a list of all the clubs and have users select any they are in. We can add notifications if their club is meeting, and even insert it into their schedule automatically.
Challenges
While all of these things sound awesome, it ain't easy.
As far as I know, the Daily Bulletin is made manually, by humans. The bulletins look visually similar, but when attempting this task last year in v1, I noticed several inconsistencies (1 line break separating the announcements instead of 2, etc.). Text is messy, so it's going to be hard to parse it. We'd have to expect that anything could change, and compensate for it. It will be important to look at all the previous archived bulletins (since my freshman year!) and make sure every single one of them parses correctly.
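Checking the whole archive could be as simple as the sketch below. It assumes the archived bulletins are already loaded as text and that the parser throws when it hits something it doesn't understand; the `ArchivedBulletin` shape and the function name are hypothetical.

```typescript
// Hypothetical shape for an archived bulletin that has already been converted to text.
interface ArchivedBulletin {
  date: string; // e.g. "2018-09-06"
  text: string;
}

// Runs the given parser over every archived bulletin and returns the dates
// of the bulletins where the parser threw an error.
function findFailingBulletins(
  archive: ArchivedBulletin[],
  parse: (text: string) => unknown
): string[] {
  const failures: string[] = [];
  for (const bulletin of archive) {
    try {
      parse(bulletin.text);
    } catch {
      failures.push(bulletin.date);
    }
  }
  return failures;
}
```

Running this after every parser change would show right away which old bulletins a "fix" broke.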
Solutions
I've attempted to parse the Daily Bulletin back in MyMICDS-v1 (php/parse_bulletin.php) but didn't have much success. I used a PDF -> Text converter, which meant I could only work with a string. However, the Daily Bulletin uses different styles like bold/underlined, center alignment, etc. that can't be represented in a simple string. Headers and titles were very hard to distinguish without stylings. That's why I'd recommend using a library like pdf2json, which gives a lot more data to work with.
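Here's a minimal sketch of how style data could help pick out headers. The interfaces below are my assumption of what pdf2json's text blocks look like (runs in `R`, URI-encoded text in `T`, and a style array `TS` of font face, font size, bold flag, italic flag); that shape should be double-checked against the version we actually install. The sketch doesn't import pdf2json, it only works on already-parsed data.

```typescript
// Assumed shape of a pdf2json text run. Verify against the library's output.
interface TextRun {
  T: string; // URI-encoded text (assumption)
  TS: [number, number, number, number]; // [fontFace, fontSize, bold, italic] (assumption)
}

// Assumed shape of a pdf2json text block positioned on the page.
interface TextBlock {
  x: number;
  y: number;
  R: TextRun[];
}

// Returns the decoded text of every block whose runs are all bold,
// which is one possible signal that the block is a header.
function findBoldLines(texts: TextBlock[]): string[] {
  return texts
    .filter((block) => block.R.length > 0 && block.R.every((run) => run.TS[2] === 1))
    .map((block) => block.R.map((run) => decodeURIComponent(run.T)).join(""));
}
```

Bold alone probably isn't enough to identify a header, but combined with font size and position it should be a lot more reliable than guessing from a plain string.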