openpreserve / pagelyzer

Suite of tools for detecting changes in web pages and their rendering
http://openplanets.github.io/pagelyzer
Apache License 2.0

Same score obtained for 1800 pairs : 0.37208013647619376 #4

Closed crawler-IM closed 11 years ago

crawler-IM commented 11 years ago

Hi,

We successfully installed the Debian package of pagelyzer and, following the solution for systems without a GUI described in https://github.com/openplanets/pagelyzer/issues/3, ran pagelyzer on 1800 pairs of links using a script. Every comparison returned the same result: "0.37208013647619376". We checked some pairs of snapshots and saw differences between the images, so the score should differ from pair to pair.

1- What does this result (the number "0.37208013647619376") mean? Is it caused by wrong input to marcalyzer (nonexistent pictures)?

We also noted the message: "Waiting page to finish loading... (Timeout in 10sec)"

2- How can we check that the page has finished loading before taking the snapshots?
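For anyone hitting the same symptom in a batch run, here is a minimal sketch of two sanity checks that could be run around the comparison step: one lists snapshot files that are missing or empty, the other flags a batch in which every pair got exactly the same score. Both helper names are hypothetical and neither calls pagelyzer; how the scores and snapshot paths are collected is left to the calling script.

```ruby
# Hypothetical helpers for a batch script; they do not call pagelyzer itself.

# Returns the snapshot paths that are missing or empty on disk.
def unusable_snapshots(paths)
  paths.reject { |path| File.file?(path) && File.size(path) > 0 }
end

# True when more than one score was collected and all of them are equal,
# which hints that the comparison never saw real input.
def constant_scores?(scores)
  scores.length > 1 && scores.uniq.length == 1
end

scores = [0.37208013647619376] * 3
puts "all scores identical, check the snapshots" if constant_scores?(scores)
```

Checking the snapshot files before scoring narrows the problem down: if the files are missing, the capture step failed and the score says nothing about the pages.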

Regards.

asanoja commented 11 years ago

Hello,

That can be possible under three conditions:

We can evaluate this, but either way the new version of pagelyzer has no dependencies like this and depends on neither the architecture nor Python. The new version will be available very soon; I suggest using it once it is uploaded.

Regards,

asanoja commented 11 years ago

Regarding the message "Waiting page to finish loading... (Timeout in 10sec)": it appears while the page is being captured. We try to get the page data within a maximum of 10 seconds. Following your remark, we have removed the timeout part of this message and now show it only when a possible capture problem arises.

We know a page has finished loading when the message "done." is shown. Otherwise, an error message is shown.
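A script that drives the capture could test for that marker before using a snapshot. The sketch below assumes the capture output has already been collected into a string. Only the "done." marker comes from this thread; the helper name and the sample output lines are illustrative.

```ruby
# Hypothetical helper: decides from captured console output whether the
# capture reported completion. Only the "done." marker is taken from the
# thread; the exact layout of the surrounding output is an assumption.
def capture_finished?(output)
  output.include?("done.")
end

sample = "Waiting page to finish loading... done.\n"
puts(capture_finished?(sample) ? "snapshot usable" : "capture failed, skip this pair")
```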

Best,

crawler-IM commented 11 years ago

Hi,

Thanks for the details about the meaning of the messages. Can you provide a Debian package (.deb), such as: http://deb.openplanetsfoundation.org/pool/main/p/pagelyzer-ruby/pagelyzer-ruby1.9.1_0.9-12-gbbcc12f_amd64.deb

We ask because we encountered problems with the manual installation (described in the README.md): https://github.com/openplanets/pagelyzer/issues/1

asanoja commented 11 years ago

If you can point out the problems, perhaps I can help directly with the installation process.

best,



crawler-IM commented 11 years ago

Hi,

I am talking about the errors generated when trying to run the tool, as described in the first issue, https://github.com/openplanets/pagelyzer/issues/1:

Example of use
++++++++++++++++++++++++++++++++++++++++++++++
$ ./pagelyzer capture --url=http://google.fr
<internal:lib/rubygems/custom_require>:29:in `require': no such file to load -- selenium-webdriver (LoadError)
from <internal:lib/rubygems/custom_require>:29:in `require'
from /home/hatem/Bureau/travail/AD-658/pagelyzer-ruby-0.9-standalone/bin/pagelyzer_capture:36:in `<main>'
++++++++++++++++++++++++++++++++++++++++++++++
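The LoadError means that Ruby could not find the selenium-webdriver library on its load path. A quick way to confirm this outside pagelyzer is a small check like the one below; the helper name `loadable?` is made up for illustration, and the only library checked is the one named in the trace.

```ruby
# Hypothetical check, independent of pagelyzer: reports whether a library
# can be required by the Ruby interpreter that runs this script.
def loadable?(library)
  require library
  true
rescue LoadError
  false
end

%w[selenium-webdriver].each do |lib|
  status = loadable?(lib) ? "ok" : "missing (try: gem install #{lib})"
  puts "#{lib}: #{status}"
end
```

Run it with the same Ruby interpreter that runs pagelyzer, since a gem installed for one Ruby version is not visible to another.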

asanoja commented 11 years ago

We are trying to reproduce this error. As for the Debian package, it will be out soon.

asanoja commented 11 years ago

Solved in the new version. Closing this issue.