BendJS / bend-community-site

Community website for all tech sectors in Bend. #BendHacktoberfest

Get Meetup.com info #10

Open kcloud99 opened 4 years ago

kcloud99 commented 4 years ago

Meetup.com closed their API this year unless you pay for a pro account. We need a way to scrape the data from all the tech meetups in town and display it on our calendar.

Possibly also have a preview function (like a Slack or text-message link preview) for the event webpage.

ajciancimino commented 4 years ago

Dropping by to leave some Hacktoberfest cheer (it ain't a PR but it's something and the issue at hand caught my attention.)

Scraping a site like Meetup.com should be fairly easy at its core. The real question is how complex the implementation gets, namely which location the meetups are in and what info is needed. By default the site appears to navigate to the meetups it thinks are in your local area, so if that's all you need, it's super easy. But if the info needs to be gotten for a different location (which I presume), probably the easiest way would be URL or JavaScript manipulation. The info itself should be pretty easy to scrape, though; there are no super crazy page elements to be seen, and basic XPaths like these should work:

Meetup Title: //section[@id="techEventsShelf"]//span[contains(@class, "sectionTitle")]
Hosting Group: //section[@id="techEventsShelf"]//p[contains(@class, "groupName")]
Date/Time: //section[@id="techEventsShelf"]//time
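
As a rough sketch of how those XPaths might be applied, the snippet below runs equivalent lookups with Python's standard-library ElementTree. ElementTree has no contains(), so the class check is done in Python instead. The SAMPLE markup and the event in it are made up to match the XPaths above; they are not Meetup's real page, which would likely also need a lenient HTML parser rather than a strict XML one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup shaped to match the XPaths above.
SAMPLE = """
<html><body>
  <section id="techEventsShelf">
    <div>
      <span class="sectionTitle bold">Bend JS Monthly</span>
      <p class="groupName">BendJS</p>
      <time datetime="2020-10-15T18:00">Thu, Oct 15, 6:00 PM</time>
    </div>
  </section>
  <section id="otherShelf">
    <span class="sectionTitle">Not a tech event</span>
  </section>
</body></html>
"""


def texts(shelf, tag, class_part=None):
    """Text of every descendant <tag> whose class attribute contains class_part."""
    return [
        (el.text or "").strip()
        for el in shelf.iter(tag)
        if class_part is None or class_part in el.get("class", "")
    ]


def extract_events(markup):
    root = ET.fromstring(markup)
    # Same scope as //section[@id="techEventsShelf"]
    shelf = root.find(".//section[@id='techEventsShelf']")
    if shelf is None:
        return []
    titles = texts(shelf, "span", "sectionTitle")
    groups = texts(shelf, "p", "groupName")
    times = texts(shelf, "time")
    # zip assumes the three lists line up one-to-one per event.
    return [
        {"title": t, "group": g, "when": w}
        for t, g, w in zip(titles, groups, times)
    ]


events = extract_events(SAMPLE)
print(events)
```

Scoping every lookup to the techEventsShelf section is what keeps the unrelated shelf out of the results.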

Of course, if you need meetup descriptions, that might be more complex (and add longer runtime).

I apologize in advance if this is all a repeat of stuff you already know; I don't mean to reinvent the wheel :)

ctsstc commented 4 years ago

Does this need something like a data warehouse, where this scrapes once a night and then we have our own DB we can query? Do we run this nightly and rebuild the site (if it's a static site) every night if there's a change? We sure do not want to scrape every time someone visits the site; we want minimal strain on the Meetup platform, both to avoid drawing unnecessary attention and for speed.
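
One way the "own DB plus rebuild only on change" idea could look, as a minimal sketch: scraped events go into SQLite, and a stored hash of the data tells the nightly job whether anything changed. The table layout, the function names, and the sample event are all assumptions for illustration, not something this repo already has.

```python
import hashlib
import json
import sqlite3


def save_events(conn, events):
    """Store scraped events; return True only if they differ from the last run."""
    conn.execute("CREATE TABLE IF NOT EXISTS events (title TEXT, grp TEXT, starts TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    digest = hashlib.sha256(json.dumps(events, sort_keys=True).encode()).hexdigest()
    row = conn.execute("SELECT value FROM meta WHERE key = 'digest'").fetchone()
    if row and row[0] == digest:
        return False  # same data as last time, so no site rebuild is needed
    with conn:  # one transaction: swap the events and record the new digest
        conn.execute("DELETE FROM events")
        conn.executemany(
            "INSERT INTO events VALUES (?, ?, ?)",
            [(e["title"], e["group"], e["when"]) for e in events],
        )
        conn.execute("INSERT OR REPLACE INTO meta VALUES ('digest', ?)", (digest,))
    return True


def load_events(conn):
    """What the site would query instead of hitting Meetup on each visit."""
    return conn.execute(
        "SELECT title, grp, starts FROM events ORDER BY starts"
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical event, standing in for one night's scrape results.
sample = [{"title": "Bend JS Monthly", "group": "BendJS", "when": "2020-10-15T18:00"}]
first_run_changed = save_events(conn, sample)
second_run_changed = save_events(conn, sample)
print(first_run_changed, second_run_changed, load_events(conn))
```

A static-site build could then be triggered only when save_events returns True.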

drifterz28 commented 4 years ago

I have a project going for this (https://github.com/drifterz28/meetup-scrape). I feel like this does not 100% belong in this repo yet, so I am building an API layer in Node that we can move to AWS or wherever later.

drifterz28 commented 4 years ago

> Does this need something like a data warehouse, where this scrapes once a night and then we have our own DB we can query? Do we run this nightly and rebuild the site (if it's a static site) every night if there's a change? We sure do not want to scrape every time someone visits the site; we want minimal strain on the Meetup platform, both to avoid drawing unnecessary attention and for speed.

I do not feel we will bring any "strain" to their servers with this, but storing the data and pulling once a day would be more than enough.
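
A minimal sketch of what "pulling once a day" could mean in code: visitors are served from stored data, and the scraper only runs when the stored copy is at least a day old. The cache dict, the function names, and the fake scraper are hypothetical placeholders.

```python
from datetime import datetime, timedelta, timezone

SCRAPE_INTERVAL = timedelta(days=1)


def is_stale(last_scraped, now, interval=SCRAPE_INTERVAL):
    """True if we have never scraped or the last scrape is at least `interval` old."""
    return last_scraped is None or now - last_scraped >= interval


def get_events(cache, now, scrape):
    """Return stored events, calling `scrape` only when the stored copy is stale."""
    if is_stale(cache.get("scraped_at"), now):
        cache["events"] = scrape()
        cache["scraped_at"] = now
    return cache["events"]


calls = []


def fake_scrape():
    # Stand-in for the real scraper; records each call so we can count them.
    calls.append(1)
    return ["event"]


cache = {}
start = datetime(2020, 10, 1, tzinfo=timezone.utc)
get_events(cache, start, fake_scrape)                        # first visit: scrapes
get_events(cache, start + timedelta(hours=3), fake_scrape)   # same day: served from cache
get_events(cache, start + timedelta(hours=25), fake_scrape)  # next day: scrapes again
print(len(calls))
```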

JacobMGEvans commented 4 years ago

> Does this need something like a data warehouse, where this scrapes once a night and then we have our own DB we can query? Do we run this nightly and rebuild the site (if it's a static site) every night if there's a change? We sure do not want to scrape every time someone visits the site; we want minimal strain on the Meetup platform, both to avoid drawing unnecessary attention and for speed.

They definitely have the bandwidth for a daily pull, and you can store it in something simple like Firebase Realtime Database. There are a lot of ways to set it up, but this is pretty public API data, so it shouldn't be too much hassle. https://firebase.google.com/docs/database
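
For illustration only, here is roughly how a daily pull could be written to Firebase Realtime Database through its REST interface, where a PUT to a path ending in .json overwrites the data at that path. The database URL, the "events" path, and the sample event are made up, and authentication is left out (this assumes the database rules allow the write or that a token gets added separately).

```python
import json
import urllib.request


def build_put_request(database_url, path, payload):
    """Build (but do not send) a PUT that would overwrite `path` with `payload`."""
    url = f"{database_url.rstrip('/')}/{path.strip('/')}.json"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical project URL and event data.
req = build_put_request(
    "https://example-project.firebaseio.com",
    "events",
    [{"title": "Bend JS Monthly", "group": "BendJS"}],
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would actually send it; omitted so this runs offline.
```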