pgh-public-meetings / city-scrapers-pitt

Pittsburgh City Scrapers: sourcing public meetings in Pittsburgh
https://pgh-public-meetings.github.io/events/
MIT License

Improve Tools Documentation #104

Open ben-nathanson opened 4 years ago

ben-nathanson commented 4 years ago

There are a number of handy tools we should document better:

Scrapy shell, selectors, etc.:

Regular Expressions:

Figuring out how a website changes over time:

Reading Spider Output


IDEs

Atom

Feel free to comment below with your favorite tools and I will add them!
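
To illustrate the regular-expressions item above, here is a minimal sketch of pulling a meeting date and time out of scraped text with Python's standard `re` module. The sample text format, the pattern, and the `parse_meeting_start` helper are all hypothetical and are not taken from any spider in this repository; a real agency page will likely need its own pattern.

```python
import re
from datetime import datetime

# Hypothetical pattern: matches text such as "March 5, 2020 at 6:30 PM".
MEETING_RE = re.compile(
    r"(?P<month>[A-Z][a-z]+)\s+(?P<day>\d{1,2}),\s+(?P<year>\d{4})"
    r"\s+at\s+(?P<time>\d{1,2}:\d{2}\s*[AP]M)"
)


def parse_meeting_start(text):
    """Return the first meeting start found in text as a datetime, or None."""
    match = MEETING_RE.search(text)
    if match is None:
        return None
    # Remove whitespace inside the time so strptime sees e.g. "6:30PM".
    time_part = re.sub(r"\s+", "", match.group("time"))
    stamp = (
        f"{match.group('month')} {match.group('day')} "
        f"{match.group('year')} {time_part}"
    )
    return datetime.strptime(stamp, "%B %d %Y %I:%M%p")
```

Named groups keep the pattern readable when a page's wording changes, and returning `None` on a miss lets the caller decide how to handle listings that do not match.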

maxachis commented 3 years ago

I'm happy to contribute to this, because I looooove documentation. Although I think other people can also contribute to this.

maxachis commented 3 years ago

@bonfirefan @ben-nathanson I think for most or all of these, it makes sense to add another documentation page dedicated to "Tips and Tricks", or "Building Better Spiders", or whatever. The "Development" page has a lot of material as is, and this doesn't really fit into "Troubleshooting". I'll start by putting together a separate .md page with that in mind.

maxachis commented 3 years ago

Assuming my most recent commits are accepted, the next question is: what else needs bolstering?

maxachis commented 3 years ago

We may want to document how to write spiders that scrape PDF files, because as far as I'm aware we don't yet have much documentation for that.