pgh-public-meetings / city-scrapers-pitt

Pittsburgh City Scrapers: sourcing public meetings in Pittsburgh
https://pgh-public-meetings.github.io/events/
MIT License

Spider: Pittsburgh Mayor's Office of Community Affairs #30

Closed bonfirefan closed 4 years ago

bonfirefan commented 5 years ago

Spider Name: pitt_community

Website:

https://nextdoor.com/profile/2376387/

Example: https://nextdoor.com/news_feed/?post=93922079

Scraping Notes:

This one will be very tricky, as it requires authenticating with Nextdoor first. Once authenticated, their events load in JSON form.

mishugana commented 5 years ago

I'll take this

mishugana commented 5 years ago

I've successfully authenticated. I'll be able to get a list of self-described meetings (posts with "meeting" in the title), their dates (based on the wording "Today", "Tonight", or "Tomorrow" in combination with the post date), and their descriptions from the Nextdoor post. That seems to cover the bulk of the relevant posts.
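As a rough illustration of the date logic described above, here is a minimal sketch, not the spider's actual code. The function names `is_meeting` and `resolve_date` are hypothetical, and it assumes the post date is already available as a `date`:

```python
import re
from datetime import date, timedelta
from typing import Optional

# Relative-day words mapped to an offset (in days) from the post date.
RELATIVE_DAYS = {"today": 0, "tonight": 0, "tomorrow": 1}


def is_meeting(title: str) -> bool:
    """Return True for self-described meetings ("meeting" in the title)."""
    return "meeting" in title.lower()


def resolve_date(text: str, posted: date) -> Optional[date]:
    """Resolve the first Today/Tonight/Tomorrow in text against the post date.

    Returns None when none of the relative-day words appear.
    """
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in RELATIVE_DAYS:
            return posted + timedelta(days=RELATIVE_DAYS[word])
    return None
```

Posts that give an explicit calendar date instead of one of these words would return `None` here and need separate handling.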

MisterZW commented 4 years ago

@danwarren and I pulled the old version from @mishugana to try to get it running. It seems that the XPath used to pull out the "csrfmiddlewaretoken" no longer works (it returns None). It is unclear from inspecting Nextdoor's login page where such a token would be, and there are no indications that Nextdoor has an API or is planning to make one, either.

Anyone looking to work on this would need to find a way to fix the authentication here.
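For anyone debugging this, a small sketch of how the token lookup could fail loudly instead of passing `None` along. This is not the spider's code: it uses the standard library's `html.parser` rather than the XPath mentioned above, it assumes the token (if present at all) lives in an `<input name="csrfmiddlewaretoken">` element, and the helper names are hypothetical. It does not fix authentication on its own.

```python
from html.parser import HTMLParser
from typing import Optional


class _TokenFinder(HTMLParser):
    """Records the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name: str) -> None:
        super().__init__()
        self.field_name = field_name
        self.token: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == self.field_name:
            self.token = attr_map.get("value")


def find_csrf_token(html: str, field_name: str = "csrfmiddlewaretoken") -> Optional[str]:
    """Return the token value, or None if no matching input is in the page."""
    finder = _TokenFinder(field_name)
    finder.feed(html)
    return finder.token


def require_csrf_token(html: str) -> str:
    """Like find_csrf_token, but raise a clear error when the token is missing."""
    token = find_csrf_token(html)
    if token is None:
        raise ValueError("csrfmiddlewaretoken not found in login page HTML")
    return token
```

If `require_csrf_token` raises on the live login page, that would suggest the token is no longer served as a form field, and the login flow itself needs to be re-inspected.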

mishugana commented 4 years ago

I can take a look at around 5 AM EST.

(In reply to Zachary Whitney's comment of Sat, Dec 21, 2019, 8:30 AM.)

mishugana commented 4 years ago

Actually, can you check that the login and password are correct? Also, there may be a general cookie setting (one that normally does not need to be changed, for security reasons) that I should have noted in either the code comments or on GitHub, and that may need to be enabled or disabled for this to work correctly.

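The comment above does not say which cookie setting is meant. If the spider is built on Scrapy, one plausible place to look is Scrapy's cookie middleware settings. The fragment below is only a guess at what to check, with `CUSTOM_SETTINGS` as a hypothetical name for a dict that could be assigned to a spider's `custom_settings`:

```python
# Hypothetical per-spider settings override (an assumption, not taken
# from the repository). COOKIES_ENABLED and COOKIES_DEBUG are standard
# Scrapy settings.
CUSTOM_SETTINGS = {
    # Keep session cookies between the login request and later requests.
    "COOKIES_ENABLED": True,
    # Log Cookie / Set-Cookie headers to help debug a failing login.
    "COOKIES_DEBUG": True,
}
```

With `COOKIES_DEBUG` on, the crawl log should show whether the login response actually sets a session cookie.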

mishugana commented 4 years ago

BTW, this is mostly fixed, if anyone wants to try it.

ben-nathanson commented 4 years ago

Closed by #43