pittcsc / PittAPI

An API to easily get data from the University of Pittsburgh
https://pittapi.pittcsc.org
GNU General Public License v2.0

Scrape categories and topics for `news.py` #204

Open tianyizheng02 opened 2 weeks ago

tianyizheng02 commented 2 weeks ago

We should scrape [the categories and topics]. We have no idea what the maintenance of this repo will look like over time. It's certainly had its lulls over time, so let's make it withstand the lack of us

_Originally posted by @RitwikGupta in https://github.com/pittcsc/PittAPI/pull/203#discussion_r1730415533_

In #203, I rewrote `news.py` to scrape Pitt news articles from the Pittwire website, but I hard-coded the list of news categories and topics. We should scrape these values instead so that we don't have to keep them updated ourselves. Ideally, `news.py` should scrape these values only once, the first time the user calls a function from the module, so that they are available for all subsequent function calls.

tianyizheng02 commented 2 weeks ago

Opened an issue for this task in case anyone else wanted to work on it

timparenti commented 2 weeks ago

Should this really be on import, or should it be on use?

tianyizheng02 commented 2 weeks ago

Both would technically work, but yeah, it'd probably be better to scrape them on first use, if for no other reason than to make the code easier to test.
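
To illustrate the testing point, here is a rough sketch assuming a hypothetical `NewsFilters` holder that takes the scraping function as an argument (none of these names come from the repo). Because nothing is scraped at construction time, a test can swap in a fake scraper and check exactly when it gets called, with no network access.

```python
from unittest import mock


class NewsFilters:
    """Hypothetical holder that scrapes lazily, on first use."""

    def __init__(self, scrape):
        # Creating the object is cheap: nothing is scraped yet.
        self._scrape = scrape
        self._values = None

    def get(self):
        # Scrape on the first call only, then reuse the stored result.
        if self._values is None:
            self._values = self._scrape()
        return self._values


# A fake scraper stands in for the real network request.
fake_scrape = mock.Mock(
    return_value={
        "categories": ["Example Category"],
        "topics": ["Example Topic"],
    }
)

filters = NewsFilters(fake_scrape)
fake_scrape.assert_not_called()  # setup alone triggers no scrape

filters.get()
filters.get()
fake_scrape.assert_called_once()  # repeated use scrapes only once
```

If the scrape ran at import time instead, a test would have to patch the scraper before the module was ever imported, which is harder to arrange.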