planningalerts-scrapers / knox

Knox Council Development Applications

Broken on morph only? #1

Closed CloCkWeRX closed 7 years ago

CloCkWeRX commented 10 years ago

Running this scraper locally works, but on morph it's as though we are seeing different landing pages.

~/knox$ ruby scraper.rb 
"Found another page - 2"
"Found another page - 3"
"Found another page - 4"
"Found another page - 5"


Any thoughts on how to debug? Fork and wget/render raw HTML out maybe, so we can see if we're being redirected somewhere or something?
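
A minimal sketch of that debugging idea, using only Ruby's standard library: fetch the landing page without following redirects, write the raw HTML to a file, and report the status code and any `Location` header. The helper names `summarize_response` and `dump_page` are made up for illustration, and the URL passed in is whatever the scraper starts from. Note this is plain `Net::HTTP`, not necessarily the HTTP client the scraper itself uses, so the result may differ from what the scraper sees.

```ruby
require "net/http"
require "uri"

# Summarise one HTTP response: status code, redirect target (if any),
# and body size in bytes. Works with any object that responds to
# #code, #body and #[] the way Net::HTTPResponse does.
def summarize_response(url, response)
  {
    url: url,
    status: response.code.to_i,
    location: response["location"], # nil unless the server redirected
    bytes: response.body.to_s.bytesize
  }
end

# Fetch a URL without following redirects and save the raw HTML to
# out_path so the local and morph copies can be diffed afterwards.
def dump_page(url, out_path)
  response = Net::HTTP.get_response(URI(url))
  File.write(out_path, response.body.to_s)
  summarize_response(url, response)
end
```

Calling something like `p dump_page(start_url, "landing.html")` both locally and on morph, then diffing the two saved files, should show whether the two environments are being served different pages or redirected somewhere else.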

henare commented 10 years ago

It could be the library versions? Try running it with morph-cli maybe.

Your debugging ideas sound good too.
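
One way to check the library-version theory would be to have the scraper print the versions of the gems it actually loaded, then compare the output locally and on morph. A rough sketch follows; `loaded_gem_versions` is a hypothetical helper, and the gem names in the example are guesses rather than anything taken from scraper.rb. It relies on `Gem.loaded_specs`, which only lists gems that have already been activated, so it should run after the scraper's own `require` lines.

```ruby
require "rubygems"

# Return a hash of gem name => loaded version string, or nil when the
# gem has not been activated in this process.
def loaded_gem_versions(names)
  names.each_with_object({}) do |name, versions|
    spec = Gem.loaded_specs[name]
    versions[name] = spec ? spec.version.to_s : nil
  end
end

# Example: report the Ruby version and a few gems a scraper like this
# might depend on (assumed names, adjust to match scraper.rb).
puts "ruby: #{RUBY_VERSION}"
loaded_gem_versions(%w[mechanize nokogiri scraperwiki]).each do |name, version|
  puts "#{name}: #{version || 'not loaded'}"
end
```

If the two environments print different versions for the same gem, pinning that gem in the Gemfile would be the obvious next thing to try.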

CloCkWeRX commented 10 years ago

It's gem versions - here's the page we are getting back

Here's the set of gems available on my machine that keeps it happy.

aaronh-chronic (0.3.9)
aasm (3.0.19)
ace-rails-ap (2.0.1)
actionmailer (4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.4, 4.0.3, 4.0.2, 4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14, 0.6.1)
actionpack (4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.4, 4.0.3, 4.0.2, 4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14, 1.4.0)
actionview (4.1.4, 4.1.1, 4.1.0)
active_model_serializers (0.9.0.alpha1)
activemodel (4.1.5, 4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.4, 4.0.3, 4.0.2, 4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14)
activerecord (4.1.5, 4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.4, 4.0.3, 4.0.2, 4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14, 1.6.0)
activerecord-deprecated_finders (1.0.3, 0.0.3)
activerecord-import (0.5.0)
activerecord-postgis-adapter (2.1.1)
activerecord-postgres-array (0.0.10)
activeresource (4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14)
activesupport (4.1.5, 4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.4, 4.0.3, 4.0.2, 4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.15, 3.2.14)
addressable (2.3.6, 2.3.5)
afm (0.2.1, 0.2.0)
akami (1.2.2, 1.2.1, 1.2.0)
alchemist (0.1.7)
ancestry (2.0.0)
ansi (1.4.3)
api_cache (0.2.3)
archive-tar-minitar (0.5.2)
arel (, 4.0.2, 4.0.1, 4.0.0, 3.0.3, 3.0.2)
arrayfields (4.7.4)
Ascii85 (1.0.2)
atomic (1.1.16, 1.1.15, 1.1.14)
autoparse (0.3.3)
awesome_print (1.2.0, 1.1.0)
backports (3.6.0, 3.4.0, 3.3.5, 3.3.4, 3.3.3, 3.3.0)
barber (0.4.2)
barber-emblem (0.1.1)
bcrypt (3.1.7)
bcrypt-ruby (3.1.2)
berkshelf (3.1.3, 2.0.10)
berkshelf-api-client (1.2.0)
better_errors (1.1.0)
bigdecimal (1.2.4)
binding_of_caller (0.7.2)
bluff (0.1.0)
bond (0.5.1)
bootstrap-kaminari-views (0.0.3)
bootstrap-sass (
buff-config (1.0.0, 0.4.0)
buff-extensions (1.0.0, 0.5.0)
buff-ignore (1.1.1, 1.1.0)
buff-ruby_engine (0.1.0)
buff-shell_out (0.1.1)
buftok (0.2.0)
builder (3.2.2, 3.1.4, 3.0.4)
bundler (1.6.0)
bundler-audit (0.3.1, 0.3.0)
business_time (0.7.3)
cancan (1.6.10)
cane (2.5.2)
capistrano (2.15.5)
capybara (2.1.0, 2.0.3)
capybara-page-object (0.6.1)
capybara-webkit (1.3.0)
carmen (1.0.0)
carmen-rails (1.0.0)
carrierwave (0.10.0, 0.9.0)
celluloid (0.16.0.pre2, 0.15.2, 0.14.1)
celluloid-io (0.16.0.pre2, 0.15.0, 0.14.1)
chef (11.14.2, 11.14.0.alpha.3, 11.12.8, 11.10.4, 11.2.0)
chef-zero (2.2, 2.1.5, 2.0.2, 1.7.3)
childprocess (0.5.3, 0.5.2, 0.5.1, 0.3.9)
chozo (0.6.1)
chronic (0.10.2, 0.9.1)
churn (0.0.28)
ci_reporter (2.0.0, 1.9.2, 1.9.1, 1.9.0)
ci_reporter_cucumber (1.0.0)
ci_reporter_rspec (1.0.0)
ci_reporter_spinach (1.0.0)
climate_control (0.0.3)
cocaine (0.5.4, 0.5.3)
code_analyzer (0.3.2)
coderay (1.1.0, 1.0.9)
coffee-rails (4.0.1, 3.2.2)
coffee-script (2.3.0, 2.2.0)
coffee-script-source (1.8.0, 1.7.1, 1.7.0, 1.6.3)
colored (1.2)
colorize (0.7.3, 0.7.0, 0.5.8)
columnize (0.8.9, 0.3.6)
commonjs (0.2.7)
connection_pool (2.0.0, 1.2.0, 1.1.0)
cookiejar (0.3.2)
countries (0.9.3)
coveralls (0.7.0)
crack (0.4.2, 0.4.1, 0.3.2)
cucumber (1.3.15, 1.3.5)
cucumber-rails (1.3.1)
currencies (0.4.2)
daemons (1.1.9)
database_cleaner (1.3.0, 1.2.0, 1.1.1, 1.0.1)
dbf (2.0.7)
debug_inspector (0.0.2)
debugger (1.6.8)
debugger-linecache (1.2.0)
debugger-ruby_core_source (1.3.5, 1.2.3)
delayed_job (4.0.1, 3.0.5)
delayed_job_active_record (4.0.1, 0.4.4)
delayed_paperclip (2.8.0)
delorean (2.1.0)
dep-selector-libgecode (1.0.2)
dep_selector (1.0.3)
devise (3.2.4, 1.5.4)
diff-lcs (1.2.5, 1.2.4, 1.1.3)
diffy (3.0.5)
docile (1.1.5, 1.1.3, 1.1.2, 1.1.1)
domain_name (0.5.20)
dotenv (0.11.1)
dotenv-deployment (0.0.2)
dotenv-rails (0.11.1)
elasticsearch (1.0.1)
elasticsearch-api (1.0.1)
elasticsearch-model (0.1.0)
elasticsearch-rails (0.1.0)
elasticsearch-transport (1.0.1)
em-http-request (1.1.2)
em-socksify (0.3.0)
email_spec (1.5.0)
ember-bootstrap-rails (0.0.3)
ember-data-source (1.0.0.beta.9, 1.0.0.beta.4)
ember-rails (0.15.0, 0.14.1)
ember-source (1.7.0, 1.3.0)
emblem-rails (0.2.1)
emblem-source (0.3.11)
equalizer (0.0.9)
erector (0.10.0)
erubis (2.7.0)
ethon (0.7.0)
eventmachine (1.0.3)
exception_notification (4.0.0)
execjs (2.2.1, 2.0.2)
extlib (0.9.16)
factory_girl (4.4.0, 4.3.0, 4.2.0)
faker (1.2.0, 1.1.2)
fancybox2-rails (0.2.8)
faraday (0.9.0, 0.8.9, 0.8.8)
faraday_middleware (0.9.1, 0.9.0)
faraday_middleware-multi_json (0.0.5)
fastercsv (1.5.5)
fattr (2.2.1)
faye-websocket (0.4.7)
ffi (1.9.3, 1.6.0, 1.3.1)
ffi-yajl (1.0.1)
fitgem (0.9.0)
flay (2.0.1)
flog (3.2.2)
font-awesome-rails (
foodcritic (0.2.0)
forecast_io (2.0.0)
foreman (0.63.0)
formatador (0.2.5, 0.2.4)
formtastic (2.2.1)
fuubar (2.0.0, 1.3.3, 1.3.2, 1.1.1)
geokit (1.8.5, 1.8.4)
geokit-rails (2.0.1)
gherkin (2.12.2, 2.12.0)
gherkin-ruby (0.3.2, 0.3.1)
git (1.2.8, 1.2.7, 1.2.6, 1.2.5)
gmail (0.4.0)
gmail_xoauth (0.4.1)
google-api-client (0.7.1, 0.6.4)
grit (2.5.0)
growl (1.0.3)
gssapi (1.0.3)
guard (2.6.1, 2.6.0, 2.5.1, 2.4.0, 2.3.0, 1.8.3, 1.8.1)
guard-bundler (2.0.0, 1.0.0)
guard-rack (1.4.0, 1.3.1, 1.3.0)
guard-rspec (4.2.10, 4.2.8, 4.2.7, 4.2.6, 4.2.5, 3.1.0, 3.0.3, 3.0.2)
gyoku (1.1.1, 1.1.0)
haml (4.0.3)
haml-rails (0.4)
handlebars-source (1.3.0, 1.2.1)
hashery (2.1.1, 2.1.0)
hashie (3.3.1, 3.0.0, 2.1.2, 2.1.1, 2.0.5)
hashr (0.0.22)
headless (1.0.1)
highline (1.6.21, 1.6.19)
hike (1.2.3)
hipchat (1.2.0, 1.1.0)
hirb (0.7.1)
hitimes (1.2.2, 1.2.1)
holidays (1.0.6)
hoptoad_notifier (2.4.11)
hpricot (0.8.6)
htmlentities (4.3.2, 4.3.1)
http (0.5.0)
http-cookie (1.0.2)
http_parser.rb (0.6.0, 0.5.3)
httparty (0.13.1, 0.10.2)
httparty_sober (0.2.1)
httpclient (
httpi (2.2.7, 2.2.5, 2.2.4, 2.1.0, 0.9.7)
i18n (0.6.11, 0.6.9, 0.6.5)
inflection (1.0.0)
ink3-rails (0.1.3)
innertube (1.0.2)
interception (0.5)
io-console (0.4.2)
ipaddress (0.8.0)
jammit (0.6.5)
japgolly-Saikuro (
jbuilder (2.1.3, 1.5.3)
jist (1.5.1)
journey (1.0.4)
jquery-rails (3.1.1, 3.1.0, 3.0.4)
jquery-ui-rails (4.1.1)
json (1.8.1, 1.8.0, 1.7.7)
json_pure (1.8.0)
jsonpath (0.5.6)
jwt (1.0.0, 0.1.11, 0.1.10, 0.1.8)
kaminari (0.16.1, 0.15.1)
kgio (2.8.1)
knife-solo (0.4.2, 0.2.0)
knife-vagrant (0.0.7)
kramdown (1.3.3)
launchy (2.4.2, 2.3.0)
leaflet-rails (0.7.3)
less (2.5.1, 2.4.0, 2.3.2)
less-rails (2.5.0, 2.4.2, 2.3.3)
libnotify (0.8.3, 0.8.2, 0.1.4)
librarian (0.1.2, 0.0.26)
librarian-chef (0.0.2)
libv8 ( x86_64-linux)
libyajl2 (1.0.1)
liquid (2.6.1)
listen (2.7.8, 2.7.1, 2.6.1, 2.5.0, 2.4.0, 1.3.1, 1.2.2)
little-plugger (1.1.3)
log4r (1.1.10)
logging (1.8.1)
lumberjack (1.0.7, 1.0.5, 1.0.4)
macaddr (1.7.1)
machinist (2.0)
magic_encoding (0.0.2)
mail (2.5.4)
main (5.2.0)
map (6.2.0)
mechanize (2.7.3)
memoizable (0.4.2)
method_source (0.8.2, 0.8.1)
metric_fu (4.2.1)
metric_fu-roodi (2.2.1)
mime (0.4.2)
mime-types (2.3, 1.25.1, 1.25, 1.23)
mini_portile (0.6.0, 0.5.3, 0.5.2, 0.5.1)
minitar (0.5.4)
minitest (5.4.1, 5.4.0, 5.3.5, 5.3.4, 5.3.3, 4.7.5)
mixlib-authentication (1.3.0)
mixlib-cli (1.5.0, 1.3.0)
mixlib-config (2.1.0, 2.0.0)
mixlib-log (1.6.0)
mixlib-shellout (1.4.0, 1.2.0)
momentjs-rails (2.4.0)
moneta (0.6.0)
morph-cli (0.2.1)
mqtt (0.2.0)
multi_json (1.10.1, 1.10.0, 1.9.3, 1.9.2, 1.9.0, 1.8.4, 1.8.2, 1.8.1, 1.7.9, 1.7.8)
multi_test (0.1.1, 0.0.2)
multi_xml (0.5.5, 0.5.3)
multipart-post (2.0.0, 1.2.0)
mysql (2.8.1)
mysql2 (0.3.16, 0.3.15, 0.3.13)
naught (1.0.0)
net-http-digest_auth (1.4)
net-http-persistent (2.9.4, 2.9)
net-scp (1.1.2, 1.0.4)
net-sftp (2.1.2, 2.0.5)
net-ssh (2.9.1, 2.7.0, 2.6.8)
net-ssh-gateway (1.2.0)
net-ssh-multi (1.2.0, 1.1)
newrelic_rpm (,,,,,,,,
nike_v2 (0.3.4)
nio4r (1.0.0, 0.5.0)
nokogiri (,, 1.6.1, 1.6.0, 1.5.11, 1.5.10, 1.5.6)
nori (2.4.0, 2.3.0, 1.1.5)
ntlm-http (0.1.1)
oauth (0.4.7, 0.4.5)
oauth2 (1.0.0, 0.9.3)
octokit (3.2.0)
ohai (7.2.0, 7.2.0.alpha.0, 7.0.4, 6.18.0, 6.16.0)
omniauth (1.2.2, 1.2.1)
omniauth-google-oauth2 (0.2.5, 0.2.2)
omniauth-oauth2 (1.2.0, 1.1.2)
orm_adapter (0.5.0, 0.0.7)
paperclip (4.2.0, 4.1.1, 3.5.0)
parallel (0.6.2)
pdf-core (0.2.5, 0.1.6)
pdf-inspector (1.1.0)
pdf-reader (1.3.3)
pg (0.17.1, 0.17.0, 0.16.0)
pg_search (0.7.4, 0.7.3, 0.7.2, 0.7.0)
phantomjs (
pickle (0.4.11)
poltergeist (1.1.2)
polyglot (0.3.5, 0.3.4, 0.3.3)
posix-spawn (0.3.8)
prawn (0.12.0)
prawn-svg (,
progressbar (0.20.0)
protected_attributes (1.0.7)
pry (0.10.1, 0.10.0,,,, 0.9.12)
pry-debugger (0.2.3)
pry-developer_tools (0.1.1)
pry-doc (0.6.0)
pry-docmore (0.1.1)
pry-editline (1.1.2)
pry-full (2.1.0)
pry-git (0.2.3)
pry-highlight (0.0.1)
pry-nav (0.2.4, 0.2.3, 0.2.2)
pry-plus (1.0.0)
pry-pretty-numeric (0.1.1)
pry-rails (0.3.2)
pry-rescue (1.4.1)
pry-stack_explorer (
pry-syntax-hacks (0.0.6)
pry-theme (1.1.2)
psych (2.0.5, 2.0.4, 2.0.3, 2.0.2, 2.0.0)
puma (1.6.3)
qunit-rails (0.0.4)
rabl (0.11.0, 0.10.1, 0.9.3, 0.8.6)
racc (1.4.10)
rack (1.5.2, 1.4.5)
rack-cache (1.2)
rack-protection (1.5.3, 1.5.2, 1.5.1, 1.5.0)
rack-ssl (1.3.4, 1.3.3)
rack-test (0.6.2)
racksh (1.0.0)
rails (4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.3, 4.0.0, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14, 0.9.5)
rails-observers (0.1.2, 0.1.1)
rails_12factor (0.0.2)
rails_best_practices (1.13.2)
rails_serve_static_assets (0.0.2, 0.0.1)
rails_stdout_logging (0.0.3)
railties (4.1.4, 4.1.1, 4.1.0, 4.0.9, 4.0.5, 4.0.3, 4.0.0, 4.0.0.beta1, 3.2.19, 3.2.18, 3.2.17, 3.2.16, 3.2.14)
raindrops (0.12.0)
rake (10.3.2, 10.3.1, 10.2.2, 10.1.1, 10.1.0, 0.9.2)
rb-fsevent (0.9.4, 0.9.3)
rb-inotify (0.9.5, 0.9.3, 0.9.2, 0.9.0)
rb-kqueue (0.2.0)
rbzip2 (0.2.0)
rdoc (4.1.1, 4.1.0, 3.12.2)
redcard (1.1.0)
redis (3.1.0, 3.0.7, 3.0.6, 3.0.4)
redis-namespace (1.5.1, 1.5.0, 1.4.1, 1.3.1)
reek (1.3.1)
ref (1.0.5)
rest-client (1.6.8, 1.6.7)
retriable (1.4.1)
retryable (1.3.5, 1.3.3)
rgeo (0.3.20)
rgeo-activerecord (1.1.0, 0.5.0)
rgeo-geojson (0.3.1, 0.2.3)
rgeo-shapefile (0.2.3)
ri_cal (0.8.8)
riddle (1.5.7)
ridley (4.0.0, 1.5.3)
rmagick (2.13.2)
rr (1.1.2)
rspec (3.0.0, 2.99.0, 2.14.1, 2.9.0)
rspec-collection_matchers (1.0.0)
rspec-core (3.0.4, 3.0.3, 3.0.2, 2.99.2, 2.99.1, 2.14.8, 2.14.7, 2.14.4, 2.13.1, 2.9.0)
rspec-expectations (3.0.4, 3.0.3, 3.0.2, 2.99.2, 2.14.5, 2.14.4, 2.14.1, 2.14.0, 2.9.1)
rspec-instafail (0.2.4)
rspec-mocks (3.0.4, 3.0.3, 3.0.2, 2.99.2, 2.14.6, 2.14.4, 2.14.3, 2.14.2, 2.9.0)
rspec-rails (3.0.2, 2.99.0, 2.14.2, 2.14.0, 2.9.0)
rspec-support (3.0.4, 3.0.3, 3.0.2, 3.0.1)
rspec_junit_formatter (0.2.0, 0.1.6)
rturk (2.12.1)
ruby-growl (4.1)
ruby-prof (0.15.1, 0.14.2)
ruby-progressbar (1.5.1, 1.4.2, 1.4.1, 1.3.2, 1.1.1)
ruby-rc4 (0.1.5)
ruby-shadow (2.3.4)
ruby2ruby (2.0.6)
ruby_gntp (0.3.4)
ruby_parser (3.1.3)
rubyntlm (0.3.4, 0.1.1)
rubyzip (1.1.6, 1.1.2, 1.1.0)
rufus-scheduler (3.0.7)
safe_cookies (0.2.1)
safe_yaml (1.0.3, 1.0.1, 0.9.7, 0.9.5, 0.9.4)
sass (3.3.14, 3.3.13, 3.2.19, 3.2.12, 3.2.9)
sass-rails (4.0.3, 3.2.6)
savon (2.7.2, 2.6.0, 2.5.1, 2.4.0, 2.3.3, 2.3.2, 0.9.5, 0.9.2)
sawyer (0.5.4)
scraperwiki (3.0.2)
sdoc (0.4.1, 0.4.0)
seed-fu (2.3.0)
select2-rails (3.5.7, 3.5.4)
selenium-webdriver (2.42.0, 2.41.0, 2.40.0, 2.38.0)
semverse (1.1.0)
settingslogic (2.0.9)
sexp_processor (4.2.1)
sham (1.1.0)
shotgun (0.9)
shoulda (3.5.0)
shoulda-context (1.1.4)
shoulda-matchers (2.6.2, 2.6.1, 2.6.0, 2.5.0, 2.4.0, 2.3.0, 2.2.0)
sidekiq (3.2.2, 3.2.1, 3.0.0, 2.17.7, 2.17.5, 2.17.0, 2.14.0)
sidekiq-failures (0.4.3, 0.3.0)
signet (0.5.0, 0.4.5)
simple_form (3.0.2)
simple_oauth (0.2.0)
simplecov (0.9.0, 0.8.2, 0.7.1)
simplecov-html (0.8.0, 0.7.1)
sinatra (1.4.5, 1.4.4, 1.4.3, 1.4.2)
sinatra-contrib (1.4.2, 1.4.1, 1.4.0)
sinatra-reloader (1.0)
slack-notifier (0.5.0)
slim (2.0.3, 2.0.2, 2.0.1)
slim-rails (2.0.4)
slop (3.6.0, 3.5.0, 3.4.7, 3.4.6, 3.4.5, 3.4.4)
solve (1.2.0, 0.8.1)
spinach (0.8.10, 0.8.7)
spinach-rails (0.2.1)
spoon (0.0.4)
spork (0.9.2)
spring (1.1.3)
sprockets (2.12.1, 2.11.0, 2.10.1, 2.2.2)
sprockets-rails (2.1.3, 2.0.1, 2.0.0)
sqlite3 (1.3.9, 1.3.8, 1.3.7)
sqlite_magic (0.0.3)
state_machine (1.2.0)
steak (2.0.0)
systemu (2.6.4, 2.5.2)
teaspoon (0.7.7)
temple (0.6.8, 0.6.7, 0.6.6)
term-ansicolor (1.3.0)
test-unit (
therubyracer (0.12.1, 0.12.0)
thin (1.6.2, 1.6.1, 1.5.1, 1.5.0)
thinking-sphinx (2.1.0)
thor (0.19.1, 0.18.1)
thread_safe (0.3.4, 0.3.3, 0.3.1, 0.2.0, 0.1.3)
tilt (1.4.1)
timecop (0.7.1, 0.7.0,, 0.6.1)
timers (3.0.1, 1.1.0)
tins (1.1.0)
tire (0.6.2, 0.6.1, 0.6.0)
titleize (1.3.0)
treetop (1.4.15)
ttfunk (1.1.1, 1.1.0, 1.0.3)
turbolinks (2.3.0, 2.2.2)
twilio-ruby (3.11.5)
twitter (5.8.0)
twitter-bootstrap-rails (2.2.8, 2.2.7)
typhoeus (0.6.8)
tzinfo (1.2.2, 1.2.1, 1.1.0, 0.3.41, 0.3.40, 0.3.39, 0.3.38)
uglifier (2.5.3, 2.5.1, 2.5.0, 2.3.3, 2.2.1)
unf (0.1.4)
unf_ext (0.0.6)
unicode_utils (1.4.0)
unicorn (4.7.0)
uuid (2.3.7)
uuidtools (2.1.4)
validates_existence (0.8.0)
varia_model (0.4.0, 0.2.0)
vcr (2.9.2, 2.9.0, 2.8.0, 2.5.0)
warden (1.2.3)
wasabi (3.3.1, 3.3.0, 3.2.3, 3.2.2, 1.0.0)
webmock (1.18.0, 1.17.4, 1.17.3, 1.17.1, 1.16.1, 1.16.0, 1.13.0)
webrobots (0.1.1)
websocket (1.0.7)
weibo_2 (0.1.6)
whenever (0.6.8)
will_paginate (3.0.5, 3.0.4)
winrm (1.1.3)
wmi-lite (1.0.0)
wunderground (1.2.0)
xeroizer (2.15.5)
xml-simple (1.1.2)
xmpp4r (0.5.6)
xpath (2.0.0, 1.0.0)
yajl-ruby (1.2.1, 1.2.0)
yard (, 0.8.1)
yell (2.0.4, 2.0.3, 2.0.2, 2.0.1, 1.5.1, 1.4.0)
yui-compressor (0.11.0)
zendesk_api (1.3.8)

LoveMyData commented 7 years ago

Looks like it has been collecting data OK for the past 3 years, closing.