-
Like collecting councillor emails, collecting a nice headshot for a councillor is quite a manual process. Headshots can be available in a few different ways, such as:
- Available at a URL found on the cou…
-
So when I scrape the index.html page I get relative URLs like `students/diane-vu.html`, but `rspec/scraper_spec.rb` expects `http://127.0.0.1:4000/students/diane-vu.html`.
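One way to bridge that gap (a sketch, not necessarily how the spec wants it solved) is to resolve each relative link against the URL of the page it was scraped from, e.g. with Python's `urllib.parse.urljoin`; the `BASE` constant below assumes the index page lives at the host the spec expects:

```python
from urllib.parse import urljoin

# Base URL of the scraped page (assumed to match the host the spec expects).
BASE = "http://127.0.0.1:4000/index.html"

def absolutize(relative_url):
    """Resolve a relative link against the page it was scraped from."""
    return urljoin(BASE, relative_url)

print(absolutize("students/diane-vu.html"))
# → http://127.0.0.1:4000/students/diane-vu.html
```

`urljoin` also handles already-absolute links correctly, passing them through unchanged.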
-
@jzelinskie and I had a chat about the internal architecture of, especially, the tracker.
The way we handle announces (and scrapes) now is by calling a lot of methods one after another that somehow …
-
Originally reported by: **pmoore (Bitbucket: [pmoore](http://bitbucket.org/pmoore), GitHub: [pmoore](http://github.com/pmoore))**
---
Please can setuptools remove some of the ancient history from th…
-
This is awesome! Do you know of any way to get the game results?
I'm more of a PHP programmer, so I'm going to try to take your results, use the links to get the boxscore page, and extract the data …
-
In order to securely perform direct (from-browser) upload/download with S3, Presign() needs to deal with X-Amz-SignedHeaders. Example: I want a pre-signed URL: https://s3.aws.amazon.com/area51/deadAl…
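For reference, AWS Signature Version 4 defines the `X-Amz-SignedHeaders` value as the lowercased, sorted, semicolon-joined names of the headers included in the signature. A minimal sketch of just that piece (building the full canonical request and signature is left out):

```python
def signed_headers(headers):
    """Build an X-Amz-SignedHeaders value from the headers to be signed:
    lowercase each name, sort, and join with semicolons (SigV4 rule)."""
    return ";".join(sorted(name.lower().strip() for name in headers))

print(signed_headers({"Host": "s3.amazonaws.com",
                      "X-Amz-Date": "20240101T000000Z",
                      "Content-Type": "text/plain"}))
# → content-type;host;x-amz-date
```

Whatever Presign() produces, this is the list the browser's actual request headers must match, or S3 rejects the upload with a signature mismatch.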
-
To support the more frequent release schedule, a Python or Bash script should be created which essentially does the following:
1. Switches to the develop branch.
2. Uses 'git describe --abbrev=0' to get …
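The first two steps could be sketched like this in Python (hypothetical helper names; assumes `git` is on PATH, the repo has a `develop` branch, and the tags are annotated, which is what `git describe` looks at by default):

```python
import subprocess

def git(args, repo="."):
    """Run a git command in `repo` and return its stripped stdout."""
    return subprocess.run(["git", *args], cwd=repo, check=True,
                          capture_output=True, text=True).stdout.strip()

def switch_to_develop(repo="."):
    """Step 1: switch to the develop branch."""
    git(["checkout", "develop"], repo)

def latest_tag(repo="."):
    """Step 2: most recent annotated tag reachable from HEAD."""
    return git(["describe", "--abbrev=0"], repo)
```

If the release tags are lightweight rather than annotated, `--tags` would need to be added to the `describe` call.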
-
urllist sometimes appears to give n+1 URLs to quickscrape, with the extra URL being a null value (resulting in null-URL scrapes). This crashes quickscrape, with the info and error messages:
```
in…
```
-
Most SECOORA buoys are not found by a CSW query in the NGDC catalog, so I am scraping the THREDDS URL:
http://129.252.139.124/thredds/catalog_platforms.html
to get a list of SECOORA buoys OPeND…
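A minimal way to pull the platform links out of a catalog page like that, using only the standard library (note that a THREDDS server also exposes a `catalog.xml`, which may be easier to parse than the HTML):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect every href found in <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def catalog_links(html):
    """Return all hrefs in an HTML catalog page, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Fetching the catalog page and feeding its body to `catalog_links` would yield the per-platform pages to follow for the OPeNDAP endpoints.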
-
I'm working on a Scrapy project where a "rabbit client" and a "crawl worker" work together to consume scrape requests from a queue. These requests carry more configuration than just a start_url - it could be …
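One shape such a queue message might take (purely illustrative; every field beyond `start_url` is an assumption, since the original list of extras is truncated):

```python
import json
from dataclasses import dataclass, field

@dataclass
class ScrapeRequest:
    """A single scrape job pulled off the queue by the crawl worker."""
    start_url: str
    settings: dict = field(default_factory=dict)  # hypothetical extra config

def parse_request(body):
    """Decode a JSON message body from the rabbit client into a ScrapeRequest."""
    data = json.loads(body)
    return ScrapeRequest(start_url=data["start_url"],
                         settings=data.get("settings", {}))
```

The worker would then map each `ScrapeRequest` onto a Scrapy crawl, passing `settings` through to the spider.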